Which is the best intrinsic motivation signal for learning multiple skills?

Santucci VG, Baldassarre G, Mirolli M - Front Neurorobot (2013)

Bottom Line: We tested the system in a setup with continuous states and actions, in particular, with a kinematic robotic arm that has to learn different reaching tasks. We compare the results of different versions of the system driven by several different intrinsic motivation signals. The results show (a) that intrinsic reinforcements purely based on the knowledge of the system are not appropriate to guide the acquisition of multiple skills, and (b) that the stronger the link between the IM signal and the competence of the system, the better the performance.

View Article: PubMed Central - PubMed

Affiliation: Laboratory of Computational Embodied Neuroscience, Istituto di Scienze e Tecnologie della Cognizione, Consiglio Nazionale delle Ricerche, Roma, Italy; School of Computing and Mathematics, University of Plymouth, Plymouth, UK.

ABSTRACT
Humans and other biological agents are able to autonomously learn and cache different skills in the absence of any biological pressure or any assigned task. In this respect, Intrinsic Motivations (IMs, i.e., motivations not connected to reward-related stimuli) play a cardinal role in animal learning, and can be considered a fundamental tool for developing more autonomous and more adaptive artificial agents. In this work, we provide an exhaustive analysis of a scarcely investigated problem: which kind of IM reinforcement signal is the most suitable for driving the acquisition of multiple skills in the shortest time? For this purpose we implemented an artificial agent with a hierarchical architecture that allows it to learn and cache different skills. We tested the system in a setup with continuous states and actions, in particular, with a kinematic robotic arm that has to learn different reaching tasks. We compared the results of different versions of the system driven by several different intrinsic motivation signals. The results show (a) that intrinsic reinforcements purely based on the knowledge of the system are not appropriate to guide the acquisition of multiple skills, and (b) that the stronger the link between the IM signal and the competence of the system, the better the performance.
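The hierarchical architecture described above (a high-level selector choosing which low-level expert gets to practice, with the selector reinforced by an intrinsic motivation signal) can be illustrated with a minimal toy sketch. This is not the paper's actual implementation: the real system uses neural-network experts controlling a kinematic arm, while here each expert is reduced to a hypothetical scalar "skill" with a saturating learning curve, and the selector is a simple epsilon-greedy chooser rewarded by competence progress.

```python
import random

random.seed(1)

class Expert:
    """Toy stand-in for one low-level skill learner."""
    def __init__(self):
        self.skill = 0.0  # probability of succeeding at this skill's task
    def train_once(self):
        # Saturating learning curve: practice helps less as skill grows.
        self.skill += 0.01 * (1.0 - self.skill)

class Selector:
    """High-level component: picks which expert practices next,
    reinforced by an intrinsic motivation (IM) signal."""
    def __init__(self, n, eps=0.1, lr=0.1):
        self.weights = [0.0] * n  # running estimates of each expert's IM reward
        self.eps, self.lr = eps, lr
    def pick(self):
        if random.random() < self.eps:          # occasional exploration
            return random.randrange(len(self.weights))
        return max(range(len(self.weights)), key=lambda i: self.weights[i])
    def reinforce(self, i, im_reward):
        self.weights[i] += self.lr * (im_reward - self.weights[i])

experts = [Expert() for _ in range(4)]
selector = Selector(len(experts))

for _ in range(2000):
    i = selector.pick()
    before = experts[i].skill
    experts[i].train_once()
    # Competence-based IM signal: the improvement produced by this practice.
    selector.reinforce(i, experts[i].skill - before)
```

Because the progress signal shrinks as a skill saturates, the selector naturally moves on to less-mastered skills, which is the qualitative behavior the paper attributes to competence-based signals.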


Figure 14: Average performance and selections in TD condition.

Mentions: As mentioned in section 2.3.2, together with the different PEI signals we also tested another CB-IM signal, provided by the TD-error of the selected expert. As previously described, the average performance of the TD condition is similar to that of the other CB-IM conditions with the PEI signal (except for TP, which is the best performer). However, if we look at the average performance over 20 replications (consecutive and including the best replication of this condition), we can see that when driven by the TD signal the system reaches a performance similar to or even better than that of the other conditions (except for TP) in their best learning-rate condition (compare Figure 14, left, with Figures 12, 13, top). Indeed, if we look at the average selections (Figure 14, right), we can see that the TD signal is able to generate a sequence of selections that is connected to the competence progress of the system, although less so than the one provided by the TP mechanism.
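The idea of using the selected expert's TD-error as a competence-based intrinsic reward can be sketched in a few lines. This is a hypothetical toy model, not the paper's system: competence here is an abstract success probability that grows with practice, each expert keeps a scalar estimate of its own success rate, and the selector's preferences (updated by the TD-error itself) feed a softmax choice. While an expert is still improving, its predictions lag behind its actual competence, so its TD-errors stay positive and it keeps being selected, tying selection to competence progress as the paragraph describes.

```python
import math
import random

random.seed(0)

N_TASKS = 3
competence = [0.0] * N_TASKS  # true success probability per task (toy model)
values = [0.0] * N_TASKS      # each expert's estimate of its own success rate
prefs = [0.0] * N_TASKS       # selector preferences over tasks
ALPHA, BETA, TAU = 0.1, 0.05, 0.2

def softmax_pick(p):
    """Sample an index with probability proportional to exp(p[i] / TAU)."""
    m = max(p)                                  # subtract max for stability
    exps = [math.exp((x - m) / TAU) for x in p]
    r, acc = random.random() * sum(exps), 0.0
    for i, e in enumerate(exps):
        acc += e
        if r <= acc:
            return i
    return len(p) - 1

for step in range(5000):
    task = softmax_pick(prefs)
    # Practicing improves competence (saturating learning curve).
    competence[task] += 0.002 * (1.0 - competence[task])
    outcome = 1.0 if random.random() < competence[task] else 0.0
    td_error = outcome - values[task]           # selected expert's TD-error
    values[task] += ALPHA * td_error            # expert updates its estimate
    # CB-IM signal: the TD-error itself reinforces the selector, so tasks
    # whose competence is still changing keep attracting selections.
    prefs[task] += BETA * td_error
```

Once a task is mastered, its expert's estimate catches up with its competence, the TD-errors average out to zero, and the selector's preference for it stops growing, so attention shifts elsewhere.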

