Reward-Modulated Hebbian Plasticity as Leverage for Partially Embodied Control in Compliant Robotics.

Burms J, Caluwaerts K, Dambre J - Front Neurorobot (2015)

Bottom Line: Our results demonstrate the broad applicability of reward-modulated Hebbian learning. Furthermore, they demonstrate the robustness of systems trained with the learning rule. This link between compliant robotics and neural networks is also the main reason for our search for simple universal learning rules for both neural networks and robotics.


Affiliation: Computing Systems Laboratory (Reservoir Team), Electronics and Information Systems Department (ELIS), Ghent University, Ghent, Belgium.

ABSTRACT
In embodied computation (or morphological computation), part of the complexity of motor control is offloaded to the body dynamics. We demonstrate that a simple Hebbian-like learning rule can be used to train systems with (partial) embodiment, and can be extended beyond the scope of traditional neural networks. To this end, we apply the learning rule to optimize the connection weights of recurrent neural networks with different topologies and for various tasks. We then apply this learning rule to a simulated compliant tensegrity robot by optimizing static feedback controllers that directly exploit the dynamics of the robot body. This leads to partially embodied controllers, i.e., hybrid controllers that naturally integrate the computations performed by the robot body into a neural network architecture. Our results demonstrate the broad applicability of reward-modulated Hebbian learning and the robustness of systems trained with it. This study strengthens our belief that compliant robots can, and should, be seen as computational units rather than as dumb hardware in need of a complex controller. This link between compliant robotics and neural networks is also the main reason for our search for simple, universal learning rules for both neural networks and robotics.
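To make the rule concrete: reward-modulated Hebbian (RMH) learning perturbs a unit's output with exploration noise and correlates that noise with the resulting reward advantage, so that the average weight change follows an estimate of the reward gradient. The sketch below is a minimal illustration of this idea on a linear readout; the dimensions, reward function, and constants are hypothetical choices for the demo, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 10, 2                          # hypothetical sizes
W_true = rng.standard_normal((n_out, n_in))  # target mapping (used only by the reward)
W = np.zeros((n_out, n_in))                  # weights being learned

eta = 0.01     # learning rate
sigma = 0.1    # exploration-noise amplitude
r_avg = 0.0    # running estimate of the mean reward (the baseline)

for step in range(20000):
    x = rng.standard_normal(n_in)             # presynaptic activity
    xi = sigma * rng.standard_normal(n_out)   # exploratory output noise
    y = W @ x + xi                            # perturbed output
    r = -np.sum((y - W_true @ x) ** 2)        # reward: higher is better
    # RMH update: Hebbian term (noise x presynaptic activity), gated by
    # the reward advantage relative to the running baseline.
    W += eta * (r - r_avg) * np.outer(xi, x)
    r_avg += 0.01 * (r - r_avg)               # slowly track the baseline

print("max weight error:", np.abs(W - W_true).max())  # shrinks toward 0
```

Because the exploration noise has zero mean, subtracting the baseline does not bias the gradient estimate; its role is to reduce the variance of the updates.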


Figure 12: Comparison of the default and the decorrelated RMH learning rules. The evolution of the average reward for the 3-bit decoder task is shown in the presence of uncorrelated (left) and correlated (right) noise. Each curve is generated from ten runs with different random initializations and input sequences.

Mentions: Figure 12 compares the default and the decorrelated learning rules by plotting the average reward evolution for both the original task 2 setup and the setup with correlated noise. Again, we performed ten runs with different random initializations and input sequences to generate each curve. The left panel shows that, with uncorrelated noise, both learning rules give virtually identical results. With correlated exploration noise, however, the decorrelated rule performs much better than the default one. The default RMH rule learns at the start of a run, but quickly falls back and eventually settles on a suboptimal result. In contrast, the decorrelated version of the learning rule exhibits a healthy learning curve and is only slightly affected by the correlation in the exploratory noise.
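To see the failure mode and the fix in isolation, the sketch below contrasts the default update with a decorrelating variant on the same kind of linear setup as above. Here the correction whitens the exploration noise with its inverse covariance before the Hebbian step; this is one plausible reconstruction of a decorrelated rule, not necessarily the authors' exact formulation, and the noise covariance, sizes, and constants are assumptions made for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)

n_in, n_out = 10, 4                          # hypothetical sizes
W_true = rng.standard_normal((n_out, n_in))  # target mapping (used only by the reward)

# Strongly correlated exploration noise: covariance with 0.95 off-diagonal.
Sigma = np.full((n_out, n_out), 0.95)
np.fill_diagonal(Sigma, 1.0)
L = np.linalg.cholesky(Sigma)        # mixing matrix that produces the correlations
Sigma_inv = np.linalg.inv(Sigma)     # whitening matrix for the decorrelated rule

def run(decorrelate, steps=20000, eta=0.01, sigma=0.1):
    W, r_avg = np.zeros((n_out, n_in)), 0.0
    for _ in range(steps):
        x = rng.standard_normal(n_in)
        xi = sigma * (L @ rng.standard_normal(n_out))  # correlated noise
        y = W @ x + xi
        r = -np.sum((y - W_true @ x) ** 2)
        # The default rule credits each unit with its own raw noise; the
        # decorrelated variant first whitens the noise across units.
        credit = Sigma_inv @ xi if decorrelate else xi
        W += eta * (r - r_avg) * np.outer(credit, x)
        r_avg += 0.01 * (r - r_avg)
    return np.abs(W - W_true).max()  # remaining weight error

print("default RMH:     ", run(decorrelate=False))
print("decorrelated RMH:", run(decorrelate=True))
```

Intuitively, strongly correlated noise explores almost exclusively along one direction of output space, so the default rule receives almost no learning signal along the remaining directions and stalls short of the optimum; whitening restores credit along all directions, at the price of higher update variance.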

