Limits...
Which is the best intrinsic motivation signal for learning multiple skills?

Santucci VG, Baldassarre G, Mirolli M - Front Neurorobot (2013)

Bottom Line: We tested the system in a setup with continuous states and actions, in particular, with a kinematic robotic arm that has to learn different reaching tasks.We compare the results of different versions of the system driven by several different intrinsic motivation signals.The results show (a) that intrinsic reinforcements purely based on the knowledge of the system are not appropriate to guide the acquisition of multiple skills, and (b) that the stronger the link between the IM signal and the competence of the system, the better the performance.

View Article: PubMed Central - PubMed

Affiliation: Laboratory of Computational Embodied Neuroscience, Isituto di Scienze e Tecnologie della Cognizione, Consiglio Nazionale delle Ricerche Roma, Italy ; School of Computing and Mathematics, University of Plymouth Plymouth, UK.

ABSTRACT
Humans and other biological agents are able to autonomously learn and cache different skills in the absence of any biological pressure or any assigned task. In this respect, Intrinsic Motivations (i.e., motivations not connected to reward-related stimuli) play a cardinal role in animal learning, and can be considered as a fundamental tool for developing more autonomous and more adaptive artificial agents. In this work, we provide an exhaustive analysis of a scarcely investigated problem: which kind of IM reinforcement signal is the most suitable for driving the acquisition of multiple skills in the shortest time? To this purpose we implemented an artificial agent with a hierarchical architecture that allows to learn and cache different skills. We tested the system in a setup with continuous states and actions, in particular, with a kinematic robotic arm that has to learn different reaching tasks. We compare the results of different versions of the system driven by several different intrinsic motivation signals. The results show (a) that intrinsic reinforcements purely based on the knowledge of the system are not appropriate to guide the acquisition of multiple skills, and (b) that the stronger the link between the IM signal and the competence of the system, the better the performance.

No MeSH data available.


Related in: MedlinePlus

Average performance on the 4 learnable tasks (top) and average selection probability for the associated expert (bottom) in the best condition (with respect to the learning rate) of SAP-TD-PEI, SP-TD-PEI, TP-PEI.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC3824099&req=5

Figure 13: Average performance on the 4 learnable tasks (top) and average selection probability for the associated expert (bottom) in the best condition (with respect to the learning rate) of SAP-TD-PEI, SP-TD-PEI, TP-PEI.

Mentions: SAP-TD-PEI and SP-TD-PEI (Figure 13, left and center) are able to cancel the signal deriving from the rapidly learnt easier tasks, but at the same time they present the problem we found with the PE: these mechanisms can be too fast in canceling the IM signal, determining a decrease in the probability of selecting the complex tasks even if the system has still competence to acquire. This is confirmed by looking at data of SAP-TD-PEI condition, where the PEI signal for task 4 is drastically decreased around 200,000 trials, when the system has reach an average performance on that task of only about 80%.


Which is the best intrinsic motivation signal for learning multiple skills?

Santucci VG, Baldassarre G, Mirolli M - Front Neurorobot (2013)

Average performance on the 4 learnable tasks (top) and average selection probability for the associated expert (bottom) in the best condition (with respect to the learning rate) of SAP-TD-PEI, SP-TD-PEI, TP-PEI.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC3824099&req=5

Figure 13: Average performance on the 4 learnable tasks (top) and average selection probability for the associated expert (bottom) in the best condition (with respect to the learning rate) of SAP-TD-PEI, SP-TD-PEI, TP-PEI.
Mentions: SAP-TD-PEI and SP-TD-PEI (Figure 13, left and center) are able to cancel the signal deriving from the rapidly learnt easier tasks, but at the same time they present the problem we found with the PE: these mechanisms can be too fast in canceling the IM signal, determining a decrease in the probability of selecting the complex tasks even if the system has still competence to acquire. This is confirmed by looking at data of SAP-TD-PEI condition, where the PEI signal for task 4 is drastically decreased around 200,000 trials, when the system has reach an average performance on that task of only about 80%.

Bottom Line: We tested the system in a setup with continuous states and actions, in particular, with a kinematic robotic arm that has to learn different reaching tasks.We compare the results of different versions of the system driven by several different intrinsic motivation signals.The results show (a) that intrinsic reinforcements purely based on the knowledge of the system are not appropriate to guide the acquisition of multiple skills, and (b) that the stronger the link between the IM signal and the competence of the system, the better the performance.

View Article: PubMed Central - PubMed

Affiliation: Laboratory of Computational Embodied Neuroscience, Isituto di Scienze e Tecnologie della Cognizione, Consiglio Nazionale delle Ricerche Roma, Italy ; School of Computing and Mathematics, University of Plymouth Plymouth, UK.

ABSTRACT
Humans and other biological agents are able to autonomously learn and cache different skills in the absence of any biological pressure or any assigned task. In this respect, Intrinsic Motivations (i.e., motivations not connected to reward-related stimuli) play a cardinal role in animal learning, and can be considered as a fundamental tool for developing more autonomous and more adaptive artificial agents. In this work, we provide an exhaustive analysis of a scarcely investigated problem: which kind of IM reinforcement signal is the most suitable for driving the acquisition of multiple skills in the shortest time? To this purpose we implemented an artificial agent with a hierarchical architecture that allows to learn and cache different skills. We tested the system in a setup with continuous states and actions, in particular, with a kinematic robotic arm that has to learn different reaching tasks. We compare the results of different versions of the system driven by several different intrinsic motivation signals. The results show (a) that intrinsic reinforcements purely based on the knowledge of the system are not appropriate to guide the acquisition of multiple skills, and (b) that the stronger the link between the IM signal and the competence of the system, the better the performance.

No MeSH data available.


Related in: MedlinePlus