Projective simulation for artificial intelligence.

Briegel HJ, De las Cuevas G - Sci Rep (2012)

Bottom Line: During simulation, the clips are screened for specific features which trigger factual action of the agent. The scheme is different from other, computational, notions of simulation, and it provides a new element in an embodied cognitive science approach to intelligent action and learning. Our model provides a natural route for generalization to quantum-mechanical operation and connects the fields of reinforcement learning and quantum computation.


Affiliation: Institut für Theoretische Physik, Universität Innsbruck, Technikerstrasse 25, A-6020 Innsbruck, Austria. hans.briegel@uibk.ac.at

ABSTRACT
We propose a model of a learning agent whose interaction with the environment is governed by a simulation-based projection, which allows the agent to project itself into future situations before it takes real action. Projective simulation is based on a random walk through a network of clips, which are elementary patches of episodic memory. The network of clips changes dynamically, both due to new perceptual input and due to certain compositional principles of the simulation process. During simulation, the clips are screened for specific features which trigger factual action of the agent. The scheme is different from other, computational, notions of simulation, and it provides a new element in an embodied cognitive science approach to intelligent action and learning. Our model provides a natural route for generalization to quantum-mechanical operation and connects the fields of reinforcement learning and quantum computation.
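The core mechanism described in the abstract (a random walk through a network of clips whose edge weights are reshaped by rewards) can be illustrated with a minimal sketch. The following is not the authors' implementation: it assumes the simplest two-layer clip network (percept clips wired directly to action clips), hopping probabilities proportional to the edge weights h, and a damping parameter γ; the class and method names (PSAgent, simulate, learn) are purely illustrative.

```python
import random

class PSAgent:
    """Minimal two-layer projective-simulation sketch: percept clips
    connect directly to action clips via h-values (edge weights)."""

    def __init__(self, percepts, actions, gamma=0.0):
        self.actions = list(actions)
        self.gamma = gamma  # damping ("forgetting") parameter
        # all h-values start at 1, i.e. an initially uniform random walk
        self.h = {(s, a): 1.0 for s in percepts for a in self.actions}

    def simulate(self, percept):
        # random walk: hop from the excited percept clip to an action
        # clip with probability proportional to the connecting h-value
        weights = [self.h[(percept, a)] for a in self.actions]
        return random.choices(self.actions, weights=weights)[0]

    def learn(self, percept, action, reward):
        # damp every h-value back towards 1, then reinforce the edge
        # that was actually traversed by the received reward
        for edge in self.h:
            self.h[edge] -= self.gamma * (self.h[edge] - 1.0)
        self.h[(percept, action)] += reward
```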


Figure 9: Learning time τ0.9 as a function of |S| for different values of the reflection parameter R. We observe a linear dependence of τ0.9 on |S| with a slope determined by R. Ensemble average over 10,000 runs, γ = 0.

Mentions: As a figure of merit we have looked at the learning time τ = τ0.9, which we define as the time the agent needs to achieve a certain blocking efficiency (for which we choose 90% of the maximum achievable value). We find that the learning time increases linearly in both |S| and |A| (i.e. quadratically in N, if we set N = |A| = |S|). The same scaling can be observed if we apply standard learning algorithms like Q-learning or AHC [1] to the invasion game [35]. In Figure 9, the scaling of the learning time is shown for different values of R. Besides the linear scaling with |S|, it can be seen how reflections in clip space, as part of the simulation, speed up the learning process.
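As an illustration of how τ0.9 could be estimated, the following sketch runs the PSAgent from the earlier snippet on a simplified stand-in for the invasion game: the attacker shows one of |S| symbols and the defender is rewarded only if it answers with the matching action. The function name, the horizon, and the ensemble-averaging scheme are assumptions made for this sketch, not taken from the paper.

```python
import random

def learning_time(n_symbols, n_runs=1000, horizon=5000, target=0.9, gamma=0.0):
    """Estimate tau_0.9 for the simplified invasion game: the first step at
    which the ensemble-averaged blocking efficiency reaches the target
    (here 90% of the maximum achievable value of 1)."""
    symbols = list(range(n_symbols))
    efficiency = [0.0] * horizon
    for _ in range(n_runs):
        agent = PSAgent(symbols, symbols, gamma=gamma)
        for t in range(horizon):
            s = random.choice(symbols)          # attacker shows a symbol
            a = agent.simulate(s)               # defender picks a move
            reward = 1.0 if a == s else 0.0     # block succeeds iff they match
            agent.learn(s, a, reward)
            efficiency[t] += reward / n_runs
    # first step at which the averaged efficiency crosses the threshold
    return next((t for t, e in enumerate(efficiency) if e >= target), None)

if __name__ == "__main__":
    print(learning_time(n_symbols=4))
```

Under this toy setup with γ = 0, one would expect the returned τ0.9 to grow roughly linearly with n_symbols, in line with the scaling reported for Figure 9.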

