Learning the microstructure of successful behavior.

Charlesworth JD, Tumer EC, Warren TL, Brainard MS - Nat. Neurosci. (2011)

Bottom Line: However, successful performance of many motor skills, such as speech articulation, also requires learning behavioral trajectories that vary continuously over time. A simple principle predicted the detailed structure of learning: birds learned to produce the average of the behavioral trajectories associated with successful outcomes. This learning rule accurately predicted the structure of learning at a millisecond timescale, demonstrating that the nervous system records fine-grained details of successful behavior and uses this information to guide learning.


Affiliation: W M Keck Center for Integrative Neuroscience, University of California, San Francisco, California, USA. jcharles@phy.ucsf.edu

ABSTRACT
Reinforcement signals indicating success or failure are known to alter the probability of selecting between distinct actions. However, successful performance of many motor skills, such as speech articulation, also requires learning behavioral trajectories that vary continuously over time. Here, we investigated how temporally discrete reinforcement signals shape a continuous behavioral trajectory, the fundamental frequency of adult Bengalese finch song. We provided reinforcement contingent on fundamental frequency performance only at one point in the song. Learned changes to fundamental frequency were maximal at this point, but also extended both earlier and later in the fundamental frequency trajectory. A simple principle predicted the detailed structure of learning: birds learned to produce the average of the behavioral trajectories associated with successful outcomes. This learning rule accurately predicted the structure of learning at a millisecond timescale, demonstrating that the nervous system records fine-grained details of successful behavior and uses this information to guide learning.
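
As a hedged formalization of the stated learning rule (the notation is ours, not the authors'): if f_i(t) denotes the fundamental frequency trajectory of the i-th baseline rendition and S is the set of renditions associated with successful outcomes (i.e., those escaping aversive reinforcement), the rule predicts that the learned trajectory is their pointwise average,

    \hat{f}(t) = \frac{1}{|S|} \sum_{i \in S} f_i(t),

evaluated at each time point t relative to the contingency time.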




Figure 2: The microstructure of successful variation predicts learning. a. To describe the natural pattern of FF variation, we calculated temporally precise representations of FF for baseline performances of the targeted syllable. The left panel shows the mean spectrogram of this syllable at baseline. In the middle panel, temporally precise FF representations (black traces) are overlaid on spectrograms of the first harmonic for two baseline performances of the syllable. The right panel depicts temporally precise FF representations for 50 baseline performances, expressed as percent deviations from the mean. b. We predicted learning from the baseline structure of FF variation by computing the average of the baseline FF variants that avoid aversive reinforcement in a simulation of the experiment. Simulations included information about the contingency time (blue arrowhead) and threshold for avoiding aversive reinforcement (upper tip of blue arrowhead) for a given experiment (see Methods). In this example, the simulation indicated that the red trajectories would avoid aversive reinforcement and the gray trajectories would receive aversive reinforcement. c. Predicted learning (red) compared to actual learning (black) in this example experiment. d. Average predicted (red) and actual (black) learning trajectories across all experiments (n=28). Gray shading denotes ± s.e.m. for actual learning. All traces are aligned to the contingency time.

Mentions: We evaluated the following simple model of learning: learn to produce the average of successful behavioral variants. For each experiment, the predictions of this model were calculated by computing the average of baseline fundamental frequency trajectories that avoid aversive reinforcement in an experimental simulation. We performed a simulation on baseline patterns of variation, as opposed to using the actual trials that the bird experienced during learning, to avoid the bias of including any of the actual structure of learning in the predictions. To simulate learning, we first normalized baseline fundamental frequency trajectories to yield residual trajectories expressed as percent deviations from the mean (Fig. 2a). Next, we simulated which of these trajectories would escape aversive reinforcement (Fig. 2b, red traces) given the contingency time and threshold for escaping reinforcement (see Methods). Our model predicts that the learned trajectory is the average of the trajectories that escape (i.e. the average of the red traces in Fig. 2b); we evaluated the model by comparing this predicted trajectory (Fig. 2c, red trace) to the actual learned trajectory (Fig. 2c, black trace).
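
The simulation described above can be summarized in a short numerical sketch. The code below is an illustrative reconstruction, not the authors' implementation: the function name predict_learning, the array shapes, the synthetic data, and the assumption that escape requires FF above a fixed threshold at the contingency time are hypothetical choices made for the example (in the actual experiments the escape direction and threshold were set per experiment; see Methods).

    import numpy as np

    def predict_learning(baseline_ff, contingency_idx, threshold_hz):
        # baseline_ff: (n_renditions x n_timepoints) array of baseline FF
        # trajectories in Hz; contingency_idx: sample index of the contingency
        # time; threshold_hz: FF a rendition must exceed at that time to escape
        # aversive reinforcement (direction assumed for illustration).

        # Express each trajectory as a percent deviation from the across-
        # rendition mean trajectory (cf. Fig. 2a).
        mean_traj = baseline_ff.mean(axis=0)
        residuals = 100.0 * (baseline_ff - mean_traj) / mean_traj

        # Simulate which renditions would escape reinforcement (cf. Fig. 2b).
        escapes = baseline_ff[:, contingency_idx] > threshold_hz

        # Model prediction: the learned trajectory is the average of the
        # escaping renditions' residual trajectories (cf. Fig. 2c).
        return residuals[escapes].mean(axis=0)

    # Example usage with synthetic data (values are illustrative only).
    rng = np.random.default_rng(0)
    baseline_ff = 2500.0 + 25.0 * rng.standard_normal((50, 200))
    predicted = predict_learning(baseline_ff, contingency_idx=100,
                                 threshold_hz=2510.0)
    print(predicted.shape)  # (200,) percent-deviation trajectory over time

Applied to real baseline renditions, the output of such a procedure would be a percent-deviation trajectory analogous to the red prediction traces in Fig. 2c, d, which can then be compared against the actually learned trajectory.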

