Sphericall: A Human/Artificial Intelligence interaction experience

Abstract: Multi-agent systems are now widespread in scientific works and in industrial applications, but few applications deal with Human/multi-agent system interaction. Multi-agent systems are characterized by individual entities, called agents, in interaction with each other and with their environment. They are generally classified as complex systems, since the global emerging phenomenon cannot be predicted even if every component is well known. The systems developed in this paper are called reactive because they behave using simple interaction models. In the reactive approach, the issue of Human/system interaction is hard to cope with and is scarcely addressed in the literature. This paper presents Sphericall, an application aimed at studying Human/complex system interactions and based on two physics-inspired multi-agent systems interacting together. The Sphericall device is composed of a tactile screen and a spherical world where agents evolve. This paper presents both the technical background of the Sphericall project and feedback gathered from the demonstration performed during the OFFF Festival in La Villette (Paris).


I. INTRODUCTION
MULTI-AGENT systems are now widespread in scientific works and in industrial applications. They are characterized by individual entities, called agents, in interaction with each other and with their environment. Each agent is autonomous. It behaves following a set of rules that can be based on a complex representation of individual goals (cognitive agents) or on simple stimulus/response local actions (reactive agents). In this context, local phenomena (interactions, behaviours...) together lead to a global system response that can be described as intelligent. Multi-agent systems are generally classified as complex systems: the emerging phenomena cannot be predicted even if every component is well known.
Few articles deal with the Human/reactive multi-agent system interaction issue. However, some recent works addressing this issue have started to appear in various contexts, such as Human activity recognition [22] or the definition of Human/multiple-robot interactions [23]. The scarce representation of this issue in the literature is mainly due to the complex character of these kinds of systems, whose global emergent properties are not easily predictable. In these kinds of applications, the main problem is to determine at which level (local or global) the Human/agency interaction must take place. A local Human/agent interaction is easy to set up, but its influence on the agency is hard to determine or predict. A global Human/agency interaction is hard to put into practice but is more easily predictable. Moreover, this interaction can be direct, by modifying the agents' behaviours, or indirect, by modifying the environment perceived by the agents.
Sphericall has been developed to study the link between the Human being and the agency. It can be considered as a Human/Artificial Intelligence interaction experience, which puts the focus on several senses (sight, touch, and hearing).
The Sphericall device is composed of two main elements:
• A tactile surface aimed at modifying the music (effect, volume, pan, ...) diffused to the intelligent system.
• The work of art itself, which emerges from the interaction between the music, which is controlled by a Human, and a reactive multi-agent system.
Agents, spread on a sphere, are autonomous entities which build or destroy skyscrapers, organic trees... depending on their musical perception. The artist can influence, but not totally control, the work of art by modifying the sounds and the music sent to the system. This paper presents both the technical background of the Sphericall project and feedback gathered from the demonstration performed during the OFFF Festival (Online Flash-Film-Festival) and from a poll conducted among students who are used to manipulating multi-agent systems.
The paper is structured as follows. Section II reviews the state of the art of multi-agent systems and of the Human/multi-agent system interaction issue. Section III presents the technical aspects of the Sphericall project, dealing with the interactive interface on the one side and with the intelligent system on the other. Section IV reports the results obtained after the OFFF festival demonstration in La Villette (France). Finally, section V concludes and outlines some future work.

II. STATE OF THE ART
A. Multi-agent systems
For a couple of decades, multi-agent systems have been used in a wide range of problem solving, modelling and simulation applications. These approaches are characterized by their capability to solve complex problems while maintaining the functional and conceptual simplicity of the entities involved, called agents. In many cases, multi-agent based approaches prove effective in various fields such as life simulation [24], crowd simulation, robot cooperation [25] or vehicle control related to devices such as obstacle avoidance systems. Multi-agent system design generally focuses on the agents' definition (internal states, perception, behaviour, ...) and/or on the interactions between agents and their environment, using biological [19], [26], [27], [28] or physical [29], [30], [31] sources of inspiration.
One can find two main trends in multi-agent design: the cognitive and the reactive approaches. Cognitive approaches focus mainly on agent definition and design. In this context, each agent is defined with high-level reasoning capabilities and interacts with its mates using high-level interactions such as explicit communication. Among these approaches one can cite consensus methods [41] or belief-desire-intention (BDI) agents as used in [42]. Cognitive agent systems generally rely on a small number of agents. By contrast, reactive approaches are based on numerous agents with limited cognitive abilities (generally simple stimulus-response behaviours), interacting intensively with each other and with the area in which they evolve, called the environment. As explained in [32], [33], [34], the environment and its characteristics (dynamics, topology, ...) play a key role in reactive multi-agent systems. Indeed, a reactive agent can neither handle a representation of the global goal of the system nor compute a solution to reach it.
The environment can thus be considered as the place where the system computes, builds and communicates. One can then say that the intelligence of the system is not contained in the population of agents but emerges from the numerous interactions between agents and with their environment. This notion of emergence is central in reactive multi-agent systems and explains the interest of such systems for complex system control, observation or simulation. In [35], a system is said to present emergent properties when phenomena appear dynamically from a macroscopic point of view as a result of interactions between system components at the microscopic level.
Moreover, one can find several definitions of emergence, from nominal emergence to weak emergence and strong emergence [36]. The main problem encountered is linked to the evaluation, measurement and prediction of emerging organizations and/or properties. From the point of view of Human/system interaction, the notion of emergence is the key element. Indeed, the challenge of designing a control interface for a complex system lies in proposing to the user an abstract interface that enables them to manipulate the system and understand its evolution without knowing the interactions that occur at the microscopic level.

B. The Human/Multi-agent system interaction issue
The Human/multi-agent system interaction problem and, more generally, the Human/complex system interaction problem is a tough issue that has been studied for a couple of years [1]. In multi-agent systems, one can consider two different categories depending on the reactive or cognitive nature of the agents, as described in the previous subsection. Human interaction, from the cognitive agent point of view, is more natural and easier to analyse. Since the cognitive approach tends to design agents that behave using high-level reasoning, decisional and/or perceptive abilities, it is logical to consider the interacting Human at the same level of intelligence as an agent, as in [2]. Another way to specify the Human/agency interaction is to consider the Human as a supervisor able to interpret the information provided by each agent [3], or to translate Human gestures into control primitives [4]. The key indicator in such systems is the fan-out of a Human-agents team, defined by Olsen and Wood in [37], [38] as the number of agents that a Human can control simultaneously. The examples found in the literature deal mainly with Human/multiple-robot interaction and control [23], [39]. In this context, the fan-out of a Human/robots team can reach 18 homogeneous robots [40].
In the reactive approach this issue is harder to cope with, since the number of agents involved can reach hundreds. Indeed, reactive multi-agent systems are based on numerous agents whose behaviours are triggered by numerous interactions. Generally, such systems are considered complex according to the definition given in [8]. It is thus hard to interact with the system, because its complex nature makes it impossible to understand even if all local aspects are well known. In this situation, the external interaction has to be linked to the emergent properties, because the influence is not directly measurable. In [9], several interaction strategies are defined. Human/complex system interactions can be made by explicit control or by implicit cooperation. Explicit control corresponds to direct interactions with the local elements of the system, such as agents' behaviours or agent-agent interaction mechanisms. Implicit cooperation can be considered as indirect interaction through modification of the agents' environment. The feedback of these interactions is always given through global and indirect indicators. Finally, [10] studies the relation that can be brought to Humans by swarm systems.
Thus, one can separate the interaction effectors and the feedback representation on the one side from the complex system on the other. Effectors and feedback are abstractions of the real system that make it easier for a Human to understand. For instance, when driving a car, we manipulate abstract effectors (steering wheel, pedals...), which have a direct or indirect influence on the global system (engine, gearbox, wheels, tyres...). In this example, the feedback is given through the Human's perception of the car's behaviour. Following this two-sided separation, the device presented in this paper is split into a tactile device, which plays the role of the abstract effector, and the Sphere, which provides visual feedback of what happens in the multi-agent system.

III. PRINCIPLE
As previously said, Sphericall is composed of two devices:
• A tactile device, based on a multipoint capacitive screen. This screen can be considered as a mixing interface used by the Human to interact indirectly with the agency by modifying music characteristics (volume, pan...).
• A video screen representing a 3D sphere, which is the work of art built through Human/multi-agent system interactions.
The next sections describe these two elements in detail.

A. Interactive Interface
1) Technical tools
The tactile interactive interface is based on two libraries developed by Tharsis Software: SimpleSound and SimpleUI.
SimpleSound is a library aimed at managing sound devices. It provides programming elements to develop real-time mixing tools. With this library, several audio files can be played at the same time (in this case, the audio files are merged into an audio group). Their characteristics (volume level, pan...) can be modified during playback, as with a classical hardware or software mixing console. In addition, effects and information filters can be added. Information filters make it possible to obtain specific information on the signal, such as output level, Fourier transform, band pass...
SimpleUI is a graphic library developed by Tharsis Software (see http://www.tharsis-software.com/ for more details) and based on OpenSceneGraph (OSG). This library allows adding, removing and manipulating various types of widgets such as buttons, images... For this project a physical layer, using Box2D, has been added in order to provide widgets with coherent physical behaviours such as inertia, collision management...

2) Appearance and behaviours
In the designed mixing interface, a circle represents each channel. Channel circles are grouped into a Group Channel. The volume of a channel is linked to the vertical position of its circle, while the horizontal position defines the stereo position of the audio source (pan left/right). A short touch on a circle toggles the channel on or off. Each Group Channel has its own colour (blue and green for keyboards, bass and drums, pink and orange for the orchestra and the voices). The final interface used for the demonstration is composed of 21 channels spread over 5 groups. The circles can interact with each other through collisions. Thus, one can send one group in the direction of another. When the collision occurs, the groups react like snooker balls colliding with each other, which changes their volume and pan position. The same interaction can occur between channel circles inside each group (cf. Figure 1). Sound effects are represented by small coloured square buttons, which are activated in the same way as the circles. The position of a square button in the interface field is linked to two parameters specific to each effect. Finally, four classical buttons have been placed at the top left corner of the interface. They serve general purposes such as rebooting the Sphere and/or the mixing interface, toggling the visibility of sound effects, and toggling the 8-band equalizer (cf. Figure 2).
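The mapping between a channel circle and its audio parameters can be sketched as follows. This is only an illustration and not the SimpleSound or SimpleUI API: the `ChannelCircle` struct, the normalised [0, 1] screen coordinates, the linear mapping, and the choice of treating an inactive channel as silent are all assumptions.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical channel circle. Coordinates are assumed to be normalised to
// [0, 1], with x = 0 at the left edge and y = 0 at the bottom of the screen.
struct ChannelCircle {
    double x;
    double y;
    bool active;  // toggled by a short touch
};

// Keep a value inside [0, 1].
inline double clamp01(double v) { return std::max(0.0, std::min(v, 1.0)); }

// Volume in [0, 1], assumed to grow linearly with the vertical position.
// An inactive channel is assumed to be silent.
inline double volumeOf(const ChannelCircle& c) {
    return c.active ? clamp01(c.y) : 0.0;
}

// Pan in [-1, 1]: -1 is full left, +1 is full right.
inline double panOf(const ChannelCircle& c) {
    return 2.0 * clamp01(c.x) - 1.0;
}

// A short touch switches the channel on or off.
inline void shortTouch(ChannelCircle& c) { c.active = !c.active; }
```

With such a mapping, a collision that displaces a circle changes volume and pan as a side effect of the new position, which is the behaviour described above.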

3) Comparison with similar devices
The sound control part may look similar to some commercial tactile mixers such as Line6 StageScape or digital audio workstation tablet interfaces (V-Control, AC-7 Core...). However, these are generally a transposition, onto a tactile screen, of the functionalities of a standard mixer. In some exceptions, as in [15] for instance, the tactile mixer is coupled with a haptic device enabling the user to "sense" the sound.
The key difference in our proposal is the fact that the mixer already includes a multi-agent system. Each mobile element is an agent and behaves following interaction rules with other agents. For the moment the interactions between mixer-agents are simple collisions, but one can imagine changing them to use other interaction models such as gravitation-based repulsions. In this case, the interaction model will lead to an emergent behaviour of the channels and the groups similar to satellite orbits and involving influences on the diffused sound.
For the moment, we decided to use simple collision to make the mixer easier to use. Hence, the influence on the sound can still be considered as the product of the direct Human interaction (as in a regular mixer).
B. Sphere world
1) Environment
Instead of using a classical planar environment for this experiment, we chose to provide agents with a spherical environment. This kind of environment is not widespread in agent-related work because it requires expressing influence forces, distances, ... in a spherical coordinate system, which is not necessarily well suited to agent systems.
Since all agents move on the surface of the sphere, their coordinates consist only of a pair of angles θ and Φ, r always being equal to the sphere radius (cf. Figure 3). Gravity then relies only on the variations of r. Thus, every element (perceptions, acceleration, speed, position...) is defined using a spherical coordinate system. For the localisation of the elements, and for frustum culling, a QuadTree has been developed to manage the (θ, Φ) plane (cf. Figure 4). This structure is generally used for 2D worlds. Its main interest, in this application, is to allow any entity to be located with logarithmic complexity. Moreover, even while maintaining a 3D representation of the world, the computation cost is very low since everything is computed as in a 2D world. Of course, the choice of such an environment implies several drawbacks. First of all, handling the angle values at the limits of the cosine and sine functions makes the continuity of the world hard to maintain when computing agents' movements. Besides, even if there is a bijection between the sphere and the (θ, Φ) plane, a transformation function must be defined to translate measurements made on the plane into their equivalent in the sphere world.
Figure 5 represents the organization of the sphere agency using a RIO (Role, Interaction, Organization) diagram as defined in [21]. This diagram represents the different roles that can be played by agents (μ, γ, β, δ roles) and the interactions between them. The next paragraphs detail these elements.
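The two drawbacks mentioned for the (θ, Φ) plane, angle continuity and the translation of plane measurements into sphere measurements, can be illustrated with a short sketch. The paper does not give its transformation function, so the `SpherePos` type, the convention that θ is the polar angle in [0, π] and Φ the azimuth in [0, 2π), and the use of the spherical law of cosines are assumptions.

```cpp
#include <cassert>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Position on the sphere surface. theta is assumed to be the polar angle in
// [0, pi] and phi the azimuth in [0, 2*pi); the radius is constant.
struct SpherePos {
    double theta;
    double phi;
};

// Bring an arbitrary (theta, phi) pair back into the canonical ranges.
// Crossing a pole reflects theta and shifts phi by pi.
inline SpherePos normalise(SpherePos p) {
    p.theta = std::fmod(p.theta, 2.0 * kPi);
    if (p.theta < 0.0) p.theta += 2.0 * kPi;
    if (p.theta > kPi) {  // the agent passed over a pole
        p.theta = 2.0 * kPi - p.theta;
        p.phi += kPi;
    }
    p.phi = std::fmod(p.phi, 2.0 * kPi);
    if (p.phi < 0.0) p.phi += 2.0 * kPi;
    return p;
}

// Great-circle distance between two positions on a sphere of given radius,
// computed with the spherical law of cosines (theta measured from the pole).
inline double surfaceDistance(SpherePos a, SpherePos b, double radius) {
    double c = std::cos(a.theta) * std::cos(b.theta) +
               std::sin(a.theta) * std::sin(b.theta) * std::cos(a.phi - b.phi);
    c = std::fmax(-1.0, std::fmin(1.0, c));  // guard against rounding errors
    return radius * std::acos(c);
}
```

A QuadTree query in the plane can then be used as a cheap first filter, with `surfaceDistance` applied only to the candidates it returns.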

• μ role
This role corresponds to the musician's role. Each musician is linked to an audio channel and emits its sound into the sphere world. This role can be considered as the link between the sound world (the mixing console) and the visual world (the sphere).
Agents that play this role are attracted by other μ agents of the same mixing group. By contrast, all other agents, including μ agents of other mixing groups, are repulsed by them.
• γ role
This role corresponds to an organic builder role. Agents that play this role build organic structures (vegetation) in the sphere world. This role is sensitive to one specific μ role (i.e. one specific sound channel), by which it is attracted. The behaviour is similar to that of fireflies. A gauge is fed by the sounds that come from the associated musician. The nearer the musician is to the γ agent, the more the gauge is fed. When the gauge reaches its maximum value, an organic structure is built. During the construction of the structure, the γ agent is inactive. Afterwards, the agent disappears and gives way to a new γ agent created at a random position on the sphere.
Agents that endorse this role are attracted by organic structures and repulsed by β agents (defined in the next item) and by their constructions (buildings).

• β role
This role is similar to the γ role. The main differences are the following:
1. The structures built are big buildings similar to skyscrapers.
2. Agents that endorse the β role are repulsed by both β and γ agents.
• δ role
This role corresponds to destructors. Agents that endorse this role are attracted by skyscrapers, which they destroy when they are on them. When there are no buildings left, δ agents move randomly on the sphere.
In order to obtain good visual results, β agents are associated with bass, keyboard and drum sounds. Voices and strings are associated with γ agents. Hundreds of agents of each type are created to obtain the results shown in figures 7 and 8.
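The gauge mechanism shared by the γ and β builder roles can be sketched as below. The paper does not specify how the contribution depends on distance, so the `BuilderGauge` class, the 1 / (1 + distance) decay, and the reset of the gauge when construction starts are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical gauge of a builder agent (gamma or beta role).
class BuilderGauge {
public:
    explicit BuilderGauge(double maxValue) : max_(maxValue), value_(0.0) {}

    // Feed the gauge with the sound level of the associated musician.
    // The contribution is assumed to decay as 1 / (1 + distance), so that a
    // nearer musician fills the gauge faster. Returns true when the gauge is
    // full, i.e. when a structure should be built.
    bool feed(double level, double distance) {
        value_ += level / (1.0 + distance);
        if (value_ >= max_) {
            value_ = 0.0;  // assumed reset once construction starts
            return true;
        }
        return false;
    }

    double value() const { return value_; }

private:
    double max_;
    double value_;
};
```

Any monotonically decreasing function of the distance would reproduce the qualitative behaviour described for these roles; the reciprocal form is only one convenient choice.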

• Interactions
This section describes in detail the different interactions used between agents. After this description, all interactions used in the sphere world are summarized in table 1.

• Attraction
The attraction law is a standard linear equation: the nearer the attracted agents are to each other, the weaker the attraction. This law is described by equation (1), which represents the attraction force applied to agent A_i due to the presence of agent A_j. In this equation, β is a scalar multiplier, and m_{A_i} and m_{A_j} are respectively the masses of agents A_i and A_j.

• Repulsion
Repulsion can be treated as a negative gravitational force between two weighted elements. As with the natural gravitational force, repulsion depends on 1/r², where r is the distance between the agents.
Equation (2) gives the analytic expression of the repulsion force applied to agent A_j taking into account the influence of agent A_i. α is a scalar multiplier that takes into account the environmental gravitational constant and the proportion of repulsion compared with the other forces.
In practice, since the agents' environment is virtual, this constant allows us to tune the importance of the repulsion behaviour relative to the other forces. In this equation, m_i and m_j are respectively the masses of agents A_i and A_j.
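Equations (1) and (2) are not legible in this version of the text. A form consistent with the descriptions above (an attraction that is linear in the separation, and a repulsion in 1/r², both scaled by the two masses) could be written as follows. The vector notation and the subscript conventions are assumptions, not the authors' typesetting.

```latex
% Assumed reconstruction, for illustration only.
% (1) Linear attraction applied to A_i due to A_j:
\begin{equation}
\vec{F}^{\,\mathrm{att}}_{j \to i}
  = \beta \, m_{A_i} \, m_{A_j} \, \overrightarrow{A_i A_j}
\end{equation}
% (2) Inverse-square repulsion applied to A_j due to A_i:
\begin{equation}
\vec{F}^{\,\mathrm{rep}}_{i \to j}
  = \alpha \, \frac{m_i \, m_j}{\lVert \overrightarrow{A_i A_j} \rVert^{2}}
    \, \frac{\overrightarrow{A_i A_j}}{\lVert \overrightarrow{A_i A_j} \rVert}
\end{equation}
```

In the first expression the force points from A_i towards A_j and grows with the distance, so nearby agents attract each other weakly. In the second, the force pushes A_j away from A_i and fades with the square of the distance.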

3) Resolving dynamical equations
The position, speed and acceleration for each agent are computed in a continuous world.
The agents' dynamical characteristics are computed following the laws of the classical Newtonian physics. Each behaviour, applied to an agent, corresponds to a force, which influences its movement. The behaviour is selected according to the role endorsed by the agent and the roles of its nearest mates.
By applying the fundamental law of dynamics, we can compute the acceleration of each agent (cf. equation 3). Here, $\vec{a}$ represents the acceleration, m the agent's mass, and $\vec{F}_b$ the force resulting from behaviour b:
$m \, \vec{a} = \sum_b \vec{F}_b$ (3)
Introducing a fluid friction force and integrating twice, we obtain equations (4) to (6) for the speed and position of each agent.
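The friction term and the integrated equations are not legible in this version of the text. One plausible form is sketched below. It assumes a friction force proportional to the speed with a coefficient λ and a semi-implicit Euler discretisation with time step Δt. Neither the symbol λ nor the integration scheme is taken from the paper.

```latex
% Assumed friction model and discretisation, for illustration only.
\begin{align}
\vec{F}_{f} &= -\lambda \, \vec{v} \\
m \, \vec{a} &= \sum_b \vec{F}_b - \lambda \, \vec{v} \\
\vec{v}(t + \Delta t) &= \vec{v}(t)
  + \frac{\Delta t}{m} \Big( \sum_b \vec{F}_b - \lambda \, \vec{v}(t) \Big) \\
\vec{x}(t + \Delta t) &= \vec{x}(t) + \Delta t \, \vec{v}(t + \Delta t)
\end{align}
```

The friction term bounds the speed of an agent under a constant force, which keeps the motion on the sphere stable when many attraction and repulsion forces add up.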

C. Software implementation
The software has been implemented in C++ following the class diagram presented in figure 6. Each agent involved in the sphere world inherits from the abstract class Agent, which defines the live() method. This method corresponds to the behaviour of the agent. Its purpose is to compute equations (3) to (6). This method is overridden in each specific agent so as to embed specific characteristics such as the forces involved by the role. The scheduler class is a thread loop that calls the live() method of each agent one after the other. The agents are linked to the Environment class, which manages the positions of the agents on the sphere. The GUI part (not detailed in the class diagram) corresponds to the set of classes that manage the graphical interface of the sphere. The link between the sphere and the tactile interface is made through the μ agents, which are associated with audio channels. They have state values named pitch and level, readable by γ and β agents. Depending on these values, γ and β agents react if it corresponds to their behaviours. A low pitch value is associated with low frequencies, triggering β agents' behaviours, and a high pitch value is associated with high frequencies, triggering γ agents' behaviours. The level value is used to feed the gauge of the agents.
From the dynamical point of view, the live() method starts by sending the position of its associated agent to the environment. In response, the environment sends back a list of the nearest agents with their characteristics (position, type, pitch, ...). Using this list, the agent chooses the forces to be applied and computes its acceleration, speed and position. Then, it updates its position in the environment. The scheduler can then move on to the other agents.
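The structure described above (an abstract Agent with a live() method, an Environment answering neighbourhood queries, and a scheduler looping over agents) can be sketched as follows. Only `Agent`, `Environment` and `live()` are named in the text; every other identifier is hypothetical. The sketch also simplifies several points: positions are handled as flat 2D points, a linear scan replaces the QuadTree, the environment reads positions directly from the agents, and the friction coefficient and perception radius are arbitrary.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec2 { double x; double y; };  // a point of the (theta, phi) plane

class Agent;

// Hypothetical environment: a linear scan stands in for the QuadTree.
class Environment {
public:
    void add(Agent* a) { agents_.push_back(a); }
    std::vector<const Agent*> nearest(const Agent& self, double radius) const;
    const std::vector<Agent*>& agents() const { return agents_; }
private:
    std::vector<Agent*> agents_;
};

class Agent {
public:
    Agent(Vec2 pos, double mass) : pos_(pos), vel_{0.0, 0.0}, mass_(mass) {}
    virtual ~Agent() = default;
    // One behaviour step, overridden by each role.
    virtual void live(const Environment& env, double dt) = 0;
    Vec2 position() const { return pos_; }
    double mass() const { return mass_; }
protected:
    // Newton's second law with an assumed fluid friction -lambda * v,
    // integrated with a semi-implicit Euler step.
    void integrate(Vec2 force, double dt, double lambda = 0.5) {
        vel_.x += dt * (force.x - lambda * vel_.x) / mass_;
        vel_.y += dt * (force.y - lambda * vel_.y) / mass_;
        pos_.x += dt * vel_.x;
        pos_.y += dt * vel_.y;
    }
    Vec2 pos_;
    Vec2 vel_;
    double mass_;
};

inline std::vector<const Agent*> Environment::nearest(const Agent& self,
                                                      double radius) const {
    std::vector<const Agent*> out;
    for (const Agent* a : agents_) {
        if (a == &self) continue;
        double dx = a->position().x - self.position().x;
        double dy = a->position().y - self.position().y;
        if (std::hypot(dx, dy) <= radius) out.push_back(a);
    }
    return out;
}

// Example role: linearly attracted by every perceived neighbour.
class AttractedAgent : public Agent {
public:
    using Agent::Agent;
    void live(const Environment& env, double dt) override {
        Vec2 f{0.0, 0.0};
        for (const Agent* n : env.nearest(*this, 10.0)) {
            double k = 1.0 * mass_ * n->mass();  // multiplier of 1 (assumed)
            f.x += k * (n->position().x - pos_.x);
            f.y += k * (n->position().y - pos_.y);
        }
        integrate(f, dt);
    }
};

// Scheduler step: calls live() on each agent, one after the other.
inline void schedulerStep(Environment& env, double dt) {
    for (Agent* a : env.agents()) a->live(env, dt);
}
```

Because agents are updated one after the other, each agent perceives the already updated positions of the agents scheduled before it, which matches the sequential loop described in the text.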
The link between the sphere and the tactile device is asynchronous. The thread of the tactile device updates the pitch and the level values of μ agents each time it is possible depending on the music timeline. The time schedule of the sphere world is faster than the music time schedule so as to ensure a better reactivity of the sphere.
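One way to obtain such an asynchronous link without locking the two loops together is to store the musician state in atomic variables. This is only a sketch of the idea: the `MusicianState` struct and both helper functions are hypothetical, and the paper does not say which synchronisation mechanism is used.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared state of a musician (mu) agent. The tactile-device
// thread writes, the sphere scheduler reads.
struct MusicianState {
    std::atomic<float> pitch{0.0f};
    std::atomic<float> level{0.0f};
};

// Called from the tactile-device thread when the music timeline advances.
// Note: pitch and level are stored independently, so a reader may briefly
// observe a new pitch together with the previous level.
inline void publish(MusicianState& s, float pitch, float level) {
    s.pitch.store(pitch, std::memory_order_relaxed);
    s.level.store(level, std::memory_order_relaxed);
}

// Called from the sphere scheduler, possibly several times between updates.
inline float currentLevel(const MusicianState& s) {
    return s.level.load(std::memory_order_relaxed);
}

inline float currentPitch(const MusicianState& s) {
    return s.pitch.load(std::memory_order_relaxed);
}
```

Since the sphere loop runs faster than the music timeline, it simply reads the same values several times until the next update arrives.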

IV. RESULTS
A. OFFF Festival
Since 2001, the OFFF festival (http://www.offf.ws/) has been held in Barcelona, becoming the globally recognized and trendsetting event it is today. OFFF was initially the Online Flash-Film-Festival. After 3 years of existence, it became the international festival for post-digital creation culture but kept its short initial name. OFFF spreads the work of a generation of creators who are breaking all kinds of limits: those separating the commercial arena from the worlds of art and design, music from illustration, or ink and chalk from pixels. Artists who have grown up with the web and draw inspiration from digital tools, even when their canvas is not the screen, came to the festival.

B. Sphericall demonstration
1) Global feeling
Our set fits perfectly with the general appearance of the festival area. The design of the device and the appearance of the Sphere are very attractive to the audience, and the public does not hesitate to manipulate the device. The feedback on the use and the appearance of the mixing console is very good. Casual users succeed in manipulating the device easily and seem to adapt quickly to the relationship between the audio part and the mixing device. The use of circular buttons, which can collide with each other, adds an entertaining aspect compared to the classical use of a mixing console. After a couple of minutes, the question of the link between the mixing console and the sphere arises. Indeed, the link between the manipulation of the mixing console and the appearance of the sphere is not as direct as the link between the sound and the mixing console. The relationship between these two components thus has to be explained. After a short explanation of the whole system, casual users return to the table to try to figure out the side effects that occur on the sphere when manipulating the sounds. We estimate that almost 85% of the users found the interface easy to use, even if in 70% of the cases they took more than 10 minutes to understand well the relationship between the music controller and its effects on the sphere. After 10 minutes, all the users were able to play with the sphere without paying attention to the tactile interface: they no longer looked at the mixing console but stared at the sphere world. While some effects are natural and easy to find (bass levels...), others are subtler and require a deep investment in the use of the system.
From the technical point of view, the questions we encountered mainly concern the agents and their characteristics as compared to other techniques. Some artists, already used to tools such as Processing (http://processing.org/), openFrameworks (http://www.openframeworks.cc/) or Cinder, seemed to be very interested in the concept we have developed.

2) Analysis of the users' behaviour
The main innovation is in the way the user can interact indirectly with the system. By controlling, via this simple interface, the music and sounds produced, the user is actually linked with the whole artificial intelligence of the system and, like a conductor, smoothly leads how the agents act and interact, and thus how the scene is rendered. This is quite different from a standard "visualisation" plugin, where most of the time the colours and shapes rendered are directly calculated from the sound waveform.
The user faces a two-level interaction: as he may be used to, he directly hears the changes he makes in the music, but he also focuses on the consequences of his choices. This is different from a real-time strategy video game, where he knows how to control each unit and expects them to behave exactly as he orders, and from a passive 3D visualisation plugin, where everything is computed. His choices directly influence the behaviours of the agents, but without dictating them: the global result can be guided, but never predicted.
A permanent curiosity is kindled in the user: this is a new approach for building interactions between Humans and computers, which leaves, when necessary, some parts of the decision process to the computer. One can for instance think of an interface with intelligent and independent components, which adapt to the user's choices and habits.
The visual result obtained is the product of the interaction between the Human and the Artificial Intelligence (AI) of the system. This experiment shows that, even without training sessions, the Human player is able to interact with a complex system provided the interaction device is simple enough. Moreover, the interaction device has to be based on notions and feelings already experienced by the user in another context. In our application, the visual result is obtained by making the user play with sounds and not directly with the parameters of the AI.
So as to have more details on the use of the Sphericall device, other experiments were made with a set of students who are used to manipulating multi-agent systems. We first proposed to the students a direct control through manipulation of the agents' parameters. In this situation the control is less easy and the students, despite their knowledge of multi-agent systems, had some difficulty understanding the implication of each parameter change. By contrast, using the tactile device and the sound feedback, untrained users were able to manipulate the system easily. After this experiment, the students filled out a short questionnaire. The goal of this questionnaire was to rate the ease of use of the interface in terms of understandability of the link between manipulators and sphere. The questions were the following:
1. Is the manipulation of the agents' parameters easy to understand?
2. Is the link between the parameters and the sphere appearance easy to understand?
3. Is the manipulation of the mixing control device easy to understand?
4. Is the link between the mixing control device and the sphere easy to understand?
5. Are the modifications of the sphere appearance logical in relation to the changes performed on the sound device?
6. Which kind of control do you prefer?
Students had to give an answer between 1 and 5 for the first 5 questions (1 corresponds to fully disagree and 5 to fully agree).
The results obtained with a set of 35 students are presented in table 2. Table 2 clearly shows not only that the mixing console is easier to manipulate but also that it allows students to better understand the correlation between the sphere world and their manipulations. For question #6, more than 90% of the students prefer the mixing console to direct parameter manipulation. These results show that the mixing console helps the user to better understand the complex world of the sphere. In most cases, the user understands the system better with the abstraction than with a full explanation of the entire system. Consequently, providing a well-chosen abstract interface makes the task of understanding the complex system easier. The example chosen here is a little biased because it relies on elements that are common knowledge and easy to understand. However, we think that this experiment gives results interesting enough to be explored more deeply in other fields.

V. CONCLUSION
This paper presented Sphericall, an application aimed at studying Human/agency interactions. The Sphericall device, composed of a tactile screen and a sphere world where agents evolve, was deployed during the OFFF festival in La Villette. The two devices are based on the multi-agent paradigm. The tactile device differs from commercial tactile mixers in that the resulting music control takes into account both user manipulations and the interaction behaviours of graphical elements. This tactile mixer can be considered as an abstraction of the complex world of the Sphere. The Sphere itself is represented in 3D and shows the result of Human/system interactions. This deployment was a public success and provided valuable feedback on the deployment of such a device. The application is intuitive enough to allow a non-scientific public to interact with the artificial intelligence.
Indeed, it is hard to handle the complexity of such systems. The solution presented in this paper relies on an interface aimed at translating the complexity of the Sphere world into a more easily understandable effector unit. The feedback itself is given through the Sphere representation. From the artistic point of view, the results obtained were really appreciated by the public. A movie of this event is available at http://www.youtube.com/watch?v=iDEkBE6Cbz8.
We now plan to apply the knowledge acquired through this experiment to other application fields such as authority sharing in complex decision systems. The two main targets we plan to deal with are the following:
(1) Trying to increase the fan-out of Human-robot teams using abstract multimodal interfaces such as the one used in Sphericall. To that end, we will focus our research work on the nature of the representation of the data and on the observation and interpretation of Human behaviour. We are now exploring interfaces based on natural gesture recognition.
(2) Trying to enable the manipulation of big databases using Sphericall-like interfaces. The main issues encountered are linked to the representation and manipulation of the data and to the introduction of queries using an abstract interface.