Projective simulation for artificial intelligence.

Briegel HJ, De las Cuevas G - Sci Rep (2012)

Bottom Line: During simulation, the clips are screened for specific features which trigger factual action of the agent. The scheme is different from other, computational, notions of simulation, and it provides a new element in an embodied cognitive science approach to intelligent action and learning. Our model provides a natural route for generalization to quantum-mechanical operation and connects the fields of reinforcement learning and quantum computation.

View Article: PubMed Central - PubMed

Affiliation: Institut für Theoretische Physik, Universität Innsbruck, Technikerstrasse 25, A-6020 Innsbruck, Austria. hans.briegel@uibk.ac.at

ABSTRACT
We propose a model of a learning agent whose interaction with the environment is governed by a simulation-based projection, which allows the agent to project itself into future situations before it takes real action. Projective simulation is based on a random walk through a network of clips, which are elementary patches of episodic memory. The network of clips changes dynamically, both due to new perceptual input and due to certain compositional principles of the simulation process. During simulation, the clips are screened for specific features which trigger factual action of the agent. The scheme is different from other, computational, notions of simulation, and it provides a new element in an embodied cognitive science approach to intelligent action and learning. Our model provides a natural route for generalization to quantum-mechanical operation and connects the fields of reinforcement learning and quantum computation.
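The abstract describes simulation as a random walk through a network of clips that ends when a clip with an action-triggering feature is reached. A minimal sketch of such a walk is below; the class name, the edge-weight representation, and the `is_action` predicate are illustrative assumptions, not the authors' exact formalism.

```python
import random

# Illustrative sketch of a projective-simulation random walk.
# Clips are nodes; each directed edge carries a non-negative weight
# that is normalized into a hopping probability (an assumption about
# how the walk is parameterized, for illustration only).

class ClipNetwork:
    def __init__(self):
        self.h = {}  # (clip, clip) -> edge weight

    def connect(self, c1, c2, weight=1.0):
        self.h[(c1, c2)] = weight

    def neighbors(self, clip):
        return [c2 for (c1, c2) in self.h if c1 == clip]

    def hop(self, clip):
        # Jump to a neighboring clip with probability proportional
        # to the weight of the connecting transition.
        nxt = self.neighbors(clip)
        weights = [self.h[(clip, c)] for c in nxt]
        return random.choices(nxt, weights=weights, k=1)[0]

    def simulate(self, percept_clip, is_action, max_steps=50):
        # Random walk from the percept clip until a clip with an
        # action-triggering feature is hit (screened by is_action).
        clip = percept_clip
        for _ in range(max_steps):
            if is_action(clip):
                return clip
            clip = self.hop(clip)
        return clip

# Usage: a single percept -> memory -> action chain is deterministic.
net = ClipNetwork()
net.connect("percept:symbol", "memory-clip")
net.connect("memory-clip", "action:door")
result = net.simulate("percept:symbol", lambda c: c.startswith("action"))
# -> "action:door"
```

The walk terminates either when an action clip is screened out or after `max_steps` hops; both the cutoff and the string-prefix labeling of action clips are conveniences of this sketch.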


Episodic memory that is built up by the defender-agent in Figure 3, if the attacker follows the static strategy to move one door to the left (right) after showing the symbol (). The “emotion tags” at each of the transitions in the network indicate the associated feedback that is stored in the memory’s evaluation system. Informally, emotion tags can be seen as remembered rewards for previous actions. They help the agent to evaluate the result of a simulation and to translate it into real action. If a clip transition in the simulation leads subsequently to a rewarded action, the state of its tag is set (or confirmed) to , and the transition probability in the next simulation is amplified. Otherwise the tag is set to  and the transition probability is attenuated (or simply not amplified).

Mentions: Suppose that the attacker indicates with the symbols that it will move one door to the left or to the right, respectively. Then, the episodic memory that is built up by the agent has the graph structure shown in Figure 4.
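The caption above describes how emotion tags reinforce clip transitions: a transition that leads to a rewarded action has its probability amplified in later simulations, and otherwise it is attenuated or simply not amplified. The following sketch trains a defender on a static attacker strategy under that rule; the reward value, the two-layer percept-to-action layout, and the door/symbol labels are illustrative assumptions, not the paper's exact update rule.

```python
import random

# Hedged sketch of emotion-tag reinforcement in the attacker-defender
# game described above. A rewarded percept->action transition has its
# weight (and hence its future transition probability) amplified; an
# unrewarded transition is simply not amplified.

def train_defender(strategy, symbols, doors, rounds=500, reward=1.0):
    # h[(symbol, door)]: weight of the transition percept-clip -> action-clip.
    h = {(s, d): 1.0 for s in symbols for d in doors}
    for _ in range(rounds):
        s = random.choice(symbols)           # attacker shows a symbol
        weights = [h[(s, d)] for d in doors]
        d = random.choices(doors, weights=weights, k=1)[0]
        if d == strategy[s]:                 # rewarded action: tag confirmed,
            h[(s, d)] += reward              # transition amplified
        # otherwise the transition weight is simply not amplified
    return h

# Static attacker strategy (assumed labels): after symbol "L" it appears
# at door 0, after symbol "R" at door 2; the defender should learn the map.
strategy = {"L": 0, "R": 2}
h = train_defender(strategy, ["L", "R"], [0, 1, 2])
```

After training, the weight of each rewarded transition dominates the others for the same symbol, so the simulated walk increasingly selects the correct door; unrewarded transitions keep their initial weight, matching the "not amplified" branch of the caption.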

