Projective simulation for artificial intelligence.

Briegel HJ, De las Cuevas G - Sci Rep (2012)

Bottom Line: During simulation, the clips are screened for specific features which trigger factual action of the agent. The scheme is different from other, computational, notions of simulation, and it provides a new element in an embodied cognitive science approach to intelligent action and learning. Our model provides a natural route for generalization to quantum-mechanical operation and connects the fields of reinforcement learning and quantum computation.


Affiliation: Institut für Theoretische Physik, Universität Innsbruck, Technikerstrasse 25, A-6020 Innsbruck, Austria. hans.briegel@uibk.ac.at

ABSTRACT
We propose a model of a learning agent whose interaction with the environment is governed by a simulation-based projection, which allows the agent to project itself into future situations before it takes real action. Projective simulation is based on a random walk through a network of clips, which are elementary patches of episodic memory. The network of clips changes dynamically, both due to new perceptual input and due to certain compositional principles of the simulation process. During simulation, the clips are screened for specific features which trigger factual action of the agent. The scheme is different from other, computational, notions of simulation, and it provides a new element in an embodied cognitive science approach to intelligent action and learning. Our model provides a natural route for generalization to quantum-mechanical operation and connects the fields of reinforcement learning and quantum computation.
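A minimal sketch may make the scheme concrete. The two-layer clip network below (percept clips wired directly to action clips), the h-value bookkeeping, and the parameter names damping and reward_scale are illustrative assumptions loosely following the reinforcement idea described in the abstract; they are not the authors' reference implementation, whose clip network is richer and grows dynamically.

```python
import random

class PSAgent:
    """Minimal two-layer projective-simulation agent (illustrative sketch).

    Percept clips are wired to action clips by edges carrying h-values;
    a "simulated random walk" is here a single hop from the excited
    percept clip to an action clip, with hop probability proportional
    to the edge's h-value.
    """

    def __init__(self, actions, damping=0.0, reward_scale=1.0):
        self.actions = list(actions)
        self.damping = damping            # gamma-like forgetting (assumed name)
        self.reward_scale = reward_scale  # lambda-like scaling (assumed name)
        self.h = {}                       # (percept, action) -> h-value

    def _ensure_percept(self, percept):
        # A percept never seen before (e.g. a new color) starts with
        # uniform h-values, so it must be learned from scratch.
        for a in self.actions:
            self.h.setdefault((percept, a), 1.0)

    def act(self, percept):
        """Simulate: one random-walk hop from the percept clip to an action clip."""
        self._ensure_percept(percept)
        weights = [self.h[(percept, a)] for a in self.actions]
        return random.choices(self.actions, weights=weights)[0]

    def learn(self, percept, action, reward):
        """Damp all h-values toward 1, then reinforce the traversed edge."""
        for edge in self.h:
            self.h[edge] -= self.damping * (self.h[edge] - 1.0)
        self.h[(percept, action)] += self.reward_scale * reward
```

An agent built this way chooses actions stochastically but increasingly favors percept-action edges that have been rewarded.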



f6: Learning curve for enlarged percept space, with color as an additional percept category. In the first period, the symbols seen by the agent have the same color (e.g. red), while at time step n = 250 the color of the symbols suddenly changes (e.g. blue), and the agent has to learn the meaning of the symbols with the new color. Unlike Figure 5, there is no inversion of strategies, and thus no increased adaptation time. The agent simply has not seen symbols with the new color before, and thus has to learn them from scratch. Ensemble average over 1000 runs, with error bars indicating one standard deviation.

Mentions: Note that the existence of an adaptation period in Figure 5 (after time step n = 250) relates to the fact that symbols which the agent had already learnt suddenly invert their meaning in terms of the reward function. The learnt behavior will therefore, with high probability, lead to unrewarded actions. The situation is different, of course, if the agent is confronted with a new symbol that it has not perceived before. In Figure 6, we have enlarged the percept space and introduced color as an additional percept category. In terms of the invasion game, this means that the attacker can announce its next move using symbols of different shapes and colors. In the first period, the symbols seen by the agent have a specific color (red), while at n = 250 the color suddenly changes (blue), and the agent has to learn the meaning of the symbols with the new color. Note that, unlike Figure 5, there is now no inversion of strategies, and thus no increased adaptation time. The agent simply has never seen blue symbols before and has to learn their meaning from scratch.
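Under the same illustrative assumptions, and reusing the hypothetical PSAgent sketch above, the experiment behind Figure 6 can be outlined as follows: percepts are (symbol, color) pairs, the agent is rewarded for the move matching the announced symbol, and at an assumed switch step the attacker keeps the symbols' meaning but changes their color. The symbol names, reward function, and averaging loop are all placeholders, not the paper's actual setup.

```python
import random
import statistics

SYMBOLS = ["left_arrow", "right_arrow"]                 # hypothetical announcements
ACTIONS = ["go_left", "go_right"]
CORRECT = {"left_arrow": "go_left", "right_arrow": "go_right"}

def run_color_switch(steps=500, switch=250, runs=1000):
    """Ensemble of invasion games with color as an extra percept category."""
    curves = []
    for _ in range(runs):
        agent = PSAgent(ACTIONS)
        rewards = []
        for n in range(steps):
            color = "red" if n < switch else "blue"     # new color, same meaning
            symbol = random.choice(SYMBOLS)
            percept = (symbol, color)
            action = agent.act(percept)
            reward = 1.0 if action == CORRECT[symbol] else 0.0
            agent.learn(percept, action, reward)
            rewards.append(reward)
        curves.append(rewards)
    # Ensemble average and one standard deviation per time step,
    # mirroring the learning curve and error bars in the figure.
    mean = [statistics.mean(c[n] for c in curves) for n in range(steps)]
    std = [statistics.stdev(c[n] for c in curves) for n in range(steps)]
    return mean, std
```

Because a (symbol, "blue") percept has no history before the switch, its h-values start uniform; the success rate therefore drops to chance at n = 250 and recovers at the same rate as in the initial learning phase, rather than paying the extra cost of unlearning an inverted strategy as in Figure 5.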

