Using learning of behavior to test multi-agent systems
Testing is an important part of the software development process. While the first goal of testing is obviously to establish that a system does what it was built to do, it is increasingly important to determine whether the system additionally does things it is not supposed to do. For example, more and more deployed systems interact with users who may be interested in harming the system or the environment in which it is deployed. Such users will try to find unexpected system behavior and then look for ways to exploit that behavior for their own purposes. In competitive multi-agent systems we face the same situation in the form of agents that try to maximize their utility by whatever means possible (or at least by everything that is not explicitly forbidden).
In multi-agent systems, many cooperation concepts aim to produce so-called emergent behavior, meaning behavior of the whole system that cannot easily be determined by looking only at the individual behavior of its components (i.e., the agents). Establishing that there is no emergent misbehavior therefore becomes very important!
Determining that a system will never violate the usually rather vague specifications of what should not happen is a very difficult task. In fact, theoretical results (such as the undecidability of the halting problem) tell us that, in general, it is not possible to predict whether even a very well-defined event will ever happen in a system, although for special cases and special events this can be possible. This means that the best we can do by testing is to try to stay one step ahead of those who want to abuse our system. This places a heavy burden on the shoulders of a tester, and there is an urgent need to support testers with tools.
We think that learning of cooperative behavior for agents offers a basis for testing tools that help a tester find emergent misbehavior of a system. We treat the users and other systems that the system under test interacts with as agents (the Agattack agents in the picture below). For these agents we want to find a cooperative behavior, in their interactions with the system (represented by the Agtested agents below) and its environment (Env in the picture), that makes the tested system show an unwanted behavior. With this view we can apply our previous ideas to create tools that support testers and make the quality of testing a little more predictable. The following picture shows the general set-up of a learning system for testing a system consisting of several agents:
The Agbyst agents represent systems that the system under test interacts with but that we either cannot or do not want to substitute by our attack agents.
The general cycle our testing system goes through is as follows: the attack agents interact with the tested system according to their current strategies; after each interaction the state of the tested system is evaluated; and the accumulated evaluations are used by the learning component to create improved strategies for the next round.
There are several problems to solve if we want to use learning of behavior for testing systems. Since we use agents to act as users of the system, we first need to determine a suitable agent architecture. In our current experimental systems, we use simple sequences of actions that an agent performs in the given order whenever it has to act.
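This simple architecture can be sketched as follows. The class and action names are illustrative assumptions, not taken from the actual experimental systems; we also assume the sequence restarts from the front once exhausted, which the text leaves open.

```python
from itertools import cycle

class AttackAgent:
    """Replays a fixed sequence of actions, one action per request."""
    def __init__(self, strategy):
        # cycle() restarts the sequence from the front when it is exhausted
        self._actions = cycle(strategy)

    def next_action(self):
        return next(self._actions)

agent = AttackAgent(["connect", "send_request", "disconnect"])
print([agent.next_action() for _ in range(4)])
# → ['connect', 'send_request', 'disconnect', 'connect']
```

The point of so spartan an architecture is that a strategy is just a list of action symbols, which makes it trivial to generate, mutate, and recombine in the learning step described next.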
The second problem is how to create strategies for the agents such that, over time, we get strategies that bring us nearer and nearer to breaking the tested system. Here we use so-called evolutionary learning, which works on a set of candidate solutions (each element of the set contains exactly one strategy for each Agattack agent). Initially, these strategies are created randomly out of the actions available to each agent. Then each element in the set is evaluated, resulting in a so-called fitness value. Elements with higher fitness values have a higher probability of being selected as "parents" for creating new elements (which resembles the recombination of DNA in nature, hence the name evolutionary learning). This is repeated until we either break the tested system or our allotted testing time is over.
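The evolutionary loop just described can be sketched as below. The action alphabet, population parameters, crossover operator, and toy fitness function are all assumptions made for illustration; the real system's action sets and evaluation are application-specific.

```python
import random

ACTIONS = ["wait", "move", "send"]       # assumed action alphabet
NUM_AGENTS, SEQ_LEN, POP_SIZE = 2, 4, 10

def random_element():
    # one randomly created action sequence per Agattack agent
    return [[random.choice(ACTIONS) for _ in range(SEQ_LEN)]
            for _ in range(NUM_AGENTS)]

def crossover(p1, p2):
    # recombine two parents: per-agent one-point crossover of their sequences
    child = []
    for s1, s2 in zip(p1, p2):
        cut = random.randrange(1, SEQ_LEN)
        child.append(s1[:cut] + s2[cut:])
    return child

def evolve(fitness, generations=20):
    population = [random_element() for _ in range(POP_SIZE)]
    for _ in range(generations):
        scores = [fitness(e) for e in population]
        if max(scores) >= 1.0:           # the tested system was "broken"
            break
        # fitness-proportional parent selection
        # (small epsilon avoids an all-zero weight vector)
        weights = [s + 1e-6 for s in scores]
        population = [crossover(*random.choices(population, weights=weights, k=2))
                      for _ in range(POP_SIZE)]
    return max(population, key=fitness)

# toy fitness standing in for the real evaluation: fraction of "send" actions
toy_fitness = lambda e: sum(s.count("send") for s in e) / (NUM_AGENTS * SEQ_LEN)
best = evolve(toy_fitness)
```

Note that an element of the population is not a single strategy but a tuple of strategies, one per attack agent, so selection and recombination always operate on complete joint behaviors.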
The third problem is how to evaluate our attack agents (respectively, the elements our evolutionary learning method works on). Following the ideas we used in the OLEMAS system, we look at the tested system after each interaction with the attack agents and evaluate how near the tested system is to showing the unwanted behavior. These evaluations are then added up, and the sum is our fitness measure.
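The shape of this fitness measure is a simple accumulation. The states and the nearness function below are hypothetical placeholders; what counts as "near to the unwanted behavior" depends entirely on the tested system.

```python
def fitness(states, nearness):
    """Sum the per-interaction 'nearness to unwanted behavior' scores."""
    # nearness(state) -> float in [0, 1], with 1.0 meaning the unwanted
    # behavior actually occurred
    return sum(nearness(state) for state in states)

# toy example: states are queue lengths observed after each interaction,
# and the unwanted behavior is overflowing an assumed capacity of 10
states = [2, 5, 9]
score = fitness(states, lambda q: min(q / 10, 1.0))
```

Summing over all interactions (rather than keeping only the final or maximal score) rewards attack strategies that keep the system near the unwanted behavior for long stretches, which gives the evolutionary search a smoother gradient to follow.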
For more details about our work on testing systems for unwanted (emergent) behavior, including applications in testing computer game AIs, harbour security policies, self-organizing and self-adapting emergent multi-agent systems, and ad-hoc wireless networks, please refer to the papers cited on our bibliography page. A list of the persons who are or were involved in this research can be found here. You might also be interested in our work on the OLEMAS system.
Last Change: 5/12/2013