Human neural learning depends on reward prediction errors in the blocking paradigm.

Tobler PN, O'Doherty JP, Dolan RJ, Schultz W - J. Neurophysiol. (2005)

Bottom Line: Here, a novel stimulus is blocked from learning when it is associated with a fully predicted outcome, presumably because the occurrence of the outcome fails to produce a prediction error. The medial orbitofrontal cortex and the ventral putamen showed significantly lower responses to blocked, compared with nonblocked, reward-predicting stimuli. These data suggest that learning in primary reward structures in the human brain correlates with prediction errors in a manner that complies with principles of formal learning theory.


Affiliation: Department of Anatomy, University of Cambridge, Cambridge CB2 3DY, UK. pnt21@cam.ac.uk

ABSTRACT
Learning occurs when an outcome deviates from expectation (prediction error). According to formal learning theory, the defining paradigm demonstrating the role of prediction errors in learning is the blocking test. Here, a novel stimulus is blocked from learning when it is associated with a fully predicted outcome, presumably because the occurrence of the outcome fails to produce a prediction error. We investigated the role of prediction errors in human reward-directed learning using a blocking paradigm and measured brain activation with functional magnetic resonance imaging. Participants showed blocking of behavioral learning with juice rewards as predicted by learning theory. The medial orbitofrontal cortex and the ventral putamen showed significantly lower responses to blocked, compared with nonblocked, reward-predicting stimuli. In reward-predicting control situations, deactivation in orbitofrontal cortex and ventral putamen occurred at the time of unpredicted reward omissions. Responses in discrete parts of orbitofrontal cortex correlated with the degree of behavioral learning during, and after, the learning phase. These data suggest that learning in primary reward structures in the human brain correlates with prediction errors in a manner that complies with principles of formal learning theory.
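The blocking logic described in the abstract can be illustrated with a minimal simulation of the Rescorla-Wagner rule, in which every stimulus present on a trial is updated in proportion to a shared prediction error. This is only a sketch and not the authors' analysis: the learning rate `alpha`, the trial counts, the reward magnitude of 1.0, and the pretraining stage (A rewarded, B unrewarded) are assumptions made for illustration. The stimulus names follow the AX+ and BY+ compounds mentioned in the figure caption.

```python
def rescorla_wagner(trials, alpha=0.2, weights=None):
    """Apply Rescorla-Wagner updates over a list of (stimuli, reward) trials.

    Returns the final associative weights and the prediction error on each trial.
    alpha is a hypothetical learning rate chosen for illustration.
    """
    w = dict(weights or {})
    errors = []
    for stimuli, reward in trials:
        # The prediction is the summed weight of all stimuli present.
        prediction = sum(w.get(s, 0.0) for s in stimuli)
        delta = reward - prediction  # prediction error
        for s in stimuli:
            w[s] = w.get(s, 0.0) + alpha * delta
        errors.append(delta)
    return w, errors


# Assumed pretraining: A is followed by reward, B is not.
pretraining = [(("A",), 1.0)] * 50 + [(("B",), 0.0)] * 50
w, _ = rescorla_wagner(pretraining)

# Compound stage: AX+ (X should be blocked) and BY+ (Y is the nonblocked control).
compound = [(("A", "X"), 1.0), (("B", "Y"), 1.0)] * 50
w, compound_errors = rescorla_wagner(compound, weights=w)

print({name: round(value, 3) for name, value in sorted(w.items())})
```

Under these assumptions X ends with a weight near zero, because A already predicts the reward and almost no prediction error remains on AX+ trials, whereas Y acquires a substantial weight on BY+ trials.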


Figure 4: Differential activations at the time of the reward in ventral striatum during reduction of prediction errors during learning. A: regions showing significantly better fits (P < 0.001) with modeled asymptotic decreases in activation during conditioning in BY+ compared with AX+ trials. Thus prediction-error related striatal activation at the time of reward decreased more in BY+ compared with AX+ trials during learning. B: bar plots showing contrast estimates (dimensionless) corresponding to the average fit of AX+ and BY+ with an asymptotically decreasing learning function. Error bars correspond to 95% confidence intervals. This analysis was performed only in subjects that showed blocking behaviorally.

Mentions: During learning, a gradual (asymptotic) decrease of prediction error occurs at the time of the gradually better predicted reward (Rescorla and Wagner 1972; Sutton and Barto 1981). We specifically investigated whether brain activations would show better fits with asymptotic decreases in BY+ compared with AX+ trials as differential learning progressed. We found that in the 15 subjects showing blocking behaviorally, activation in the ventral striatum fitted better for BY+ than AX+ trials with an asymptotically decreasing learning function, corresponding to gradually reduced prediction error responses (Fig. 4; −15/−3/−12; z = 3.98).
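The asymptotic decrease of the prediction error at reward time can be sketched under the same Rescorla-Wagner assumptions. This is an idealized illustration, not the regressor used in the imaging analysis: the function name `reward_time_errors`, the learning rate `alpha`, and the starting predictions (1.0 for AX+, assuming A already fully predicts the reward, and 0.0 for BY+) are hypothetical choices.

```python
def reward_time_errors(v_start, n_trials, alpha=0.2, n_stimuli=2, reward=1.0):
    """Prediction error at the time of reward on successive compound trials.

    v_start is the summed prediction of the compound before the first trial.
    Each of the n_stimuli elements is updated by alpha * delta, so the summed
    prediction grows by n_stimuli * alpha * delta per trial.
    """
    v = v_start
    errors = []
    for _ in range(n_trials):
        delta = reward - v
        errors.append(delta)
        v += n_stimuli * alpha * delta
    return errors


by_errors = reward_time_errors(v_start=0.0, n_trials=10)  # BY+: reward initially unpredicted
ax_errors = reward_time_errors(v_start=1.0, n_trials=10)  # AX+: reward already predicted by A

for trial, (by, ax) in enumerate(zip(by_errors, ax_errors), start=1):
    print(f"trial {trial:2d}  BY+ error {by:.3f}  AX+ error {ax:.3f}")
```

With these settings the BY+ error shrinks geometrically by a factor of (1 - 2 * alpha) per trial, while the AX+ error stays at zero throughout. This is the qualitative pattern that a better fit of an asymptotically decreasing function to BY+ than to AX+ trials would reflect.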

