Reward-Modulated Hebbian Plasticity as Leverage for Partially Embodied Control in Compliant Robotics.

Burms J, Caluwaerts K, Dambre J - Front Neurorobot (2015)

Bottom Line: Our results demonstrate the universal applicability of reward-modulated Hebbian learning. Furthermore, they demonstrate the robustness of systems trained with the learning rule. This link between compliant robotics and neural networks is also the main reason for our search for simple universal learning rules for both neural networks and robotics.

View Article: PubMed Central - PubMed

Affiliation: Computing Systems Laboratory (Reservoir Team), Electronics and Information Systems Department (ELIS), Ghent University, Ghent, Belgium.

ABSTRACT
In embodied computation (or morphological computation), part of the complexity of motor control is offloaded to the body dynamics. We demonstrate that a simple Hebbian-like learning rule can be used to train systems with (partial) embodiment, and can be extended outside of the scope of traditional neural networks. To this end, we apply the learning rule to optimize the connection weights of recurrent neural networks with different topologies and for various tasks. We then apply this learning rule to a simulated compliant tensegrity robot by optimizing static feedback controllers that directly exploit the dynamics of the robot body. This leads to partially embodied controllers, i.e., hybrid controllers that naturally integrate the computations that are performed by the robot body into a neural network architecture. Our results demonstrate the universal applicability of reward-modulated Hebbian learning. Furthermore, they demonstrate the robustness of systems trained with the learning rule. This study strengthens our belief that compliant robots should or can be seen as computational units, instead of dumb hardware that needs a complex controller. This link between compliant robotics and neural networks is also the main reason for our search for simple universal learning rules for both neural networks and robotics.
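The record does not reproduce the authors' exact update equations. As a rough, hypothetical illustration of the general idea, the NumPy sketch below implements a generic reward-modulated Hebbian update for a set of static output weights: the output is perturbed with exploration noise, and the weights move along the Hebbian product of that noise and the input, gated by how much the obtained reward exceeds a running baseline. All names, sizes, and the toy quadratic reward are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

# Hypothetical sketch of a generic reward-modulated Hebbian (RMH) update.
# Not the authors' exact rule: weights change in proportion to the product of
# the presynaptic input, an exploratory perturbation of the output, and the
# reward improvement over a running baseline.

rng = np.random.default_rng(0)

n_in, n_out = 10, 3                              # assumed sizes, illustration only
W = rng.normal(0.0, 0.1, size=(n_out, n_in))     # static feedback/readout weights
baseline = 0.0                                   # running estimate of the expected reward
eta, sigma, alpha = 0.05, 0.1, 0.9               # learning rate, noise scale, baseline decay


def reward(y, target):
    """Toy reward: negative squared error to a target output (assumption)."""
    return -np.sum((y - target) ** 2)


def rmh_trial(x, target):
    """One trial: perturb the output, observe the reward, apply the RMH update."""
    global W, baseline
    noise = rng.normal(0.0, sigma, size=n_out)   # exploratory perturbation
    y = W @ x + noise                            # perturbed output
    r = reward(y, target)
    # Hebbian term (noise outer input), gated by the reward improvement
    W += eta * (r - baseline) * np.outer(noise, x)
    baseline = alpha * baseline + (1 - alpha) * r
    return r
```

Because the update only needs the locally available input, the injected noise, and a scalar reward, the same form can in principle be driven by signals measured on the robot body (e.g., spring forces) rather than by neural activations, which is the sense in which such controllers become partially embodied.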


Figure 14: Robustness of the learning rule for the writing task. Initially, a feedforward controller is optimized. We then simulate a failure by making a single actuator follow its initial trajectory from trial 1000 onward. At the same time, the learning rule starts learning a set of feedback weights to compensate for the actuator failure. Similarly, we simulate a two-actuator failure after trial 2000. At trial 3000, we simulate a failure of an actuator directly attached to the end-effector. The top row shows the results for writing an “a” character, while the bottom row shows the results for a “b.” The left column shows the results when the feedback includes the spring forces and the square of the spring forces, while the right column only includes the spring forces. The plots show the mean, maximum, minimum, and SD of the reward over 30 runs.

Mentions: Finally, we simulate actuator failures. The results of these experiments are presented in Figure 14. As expected, performance drops sharply immediately after each failure. By applying the RMH learning rule to the feedback controller, the system is able to recover from each of these failures. To assess the stability of the learning rule, each experiment was repeated 30 times, with similar results.
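To make the recovery behaviour described above concrete, the hypothetical continuation of the earlier sketch below swaps the toy plant abruptly mid-run as a crude stand-in for an actuator failure (the real experiments use a simulated tensegrity robot, not this toy) and keeps applying the same rmh_trial update. The reward drops at the change point and then climbs back as the feedback weights re-adapt online, qualitatively mirroring the recovery reported for Figure 14.

```python
# Hypothetical continuation of the sketch above (toy stand-in, not the paper's setup).
target_a = rng.normal(size=n_out)     # nominal behaviour before the "failure"
target_b = rng.normal(size=n_out)     # behaviour required after the "failure"
x = rng.normal(size=n_in)             # fixed feedback pattern (toy stand-in)

history = []
for trial in range(2000):
    tgt = target_a if trial < 1000 else target_b   # abrupt change at trial 1000
    history.append(rmh_trial(x, tgt))
# history shows a sharp reward drop at trial 1000 followed by gradual recovery
# as the same online update re-adapts the weights.
```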

