Value learning and arousal in the extinction of probabilistic rewards: the role of dopamine in a modified temporal difference model.

Song MR, Fellous JM - PLoS ONE (2014)

Bottom Line: Although temporal difference models capture key characteristics of dopaminergic activity during the extinction of deterministic rewards, they have been less successful at simulating the extinction of probabilistic rewards. Our simulations propose that arousal allows the probability of reward to have lasting effects on the updating of reward value, which slows the extinction of low probability rewards. These predictions were supported by pharmacological experiments in rats.


Affiliation: Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea.

ABSTRACT
Because most rewarding events are probabilistic and changing, the extinction of probabilistic rewards is important for survival. It has been proposed that the extinction of probabilistic rewards depends on arousal and the amount of learning of reward values. Midbrain dopamine neurons were suggested to play a role in both arousal and learning reward values. Despite extensive research on modeling dopaminergic activity in reward learning (e.g. temporal difference models), few studies have been done on modeling its role in arousal. Although temporal difference models capture key characteristics of dopaminergic activity during the extinction of deterministic rewards, they have been less successful at simulating the extinction of probabilistic rewards. By adding an arousal signal to a temporal difference model, we were able to simulate the extinction of probabilistic rewards and its dependence on the amount of learning. Our simulations propose that arousal allows the probability of reward to have lasting effects on the updating of reward value, which slows the extinction of low probability rewards. Using this model, we predicted that, by signaling the prediction error, dopamine determines the learned reward value that has to be extinguished during extinction and participates in regulating the size of the arousal signal that controls the learning rate. These predictions were supported by pharmacological experiments in rats.
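As a reading aid, the following minimal sketch (in Python) illustrates the kind of trial-level temporal difference update described in the abstract, with an arousal trace that decays by a factor eta and modulates the effective learning rate. This is not the authors' published model: the way arousal is driven by the unsigned prediction error and the way it damps the learning rate are assumptions chosen only to reproduce the qualitative claim that sustained arousal slows value updating.

    import random

    def acquire(n_trials, reward_prob, alpha=0.1, eta=0.97):
        """Trial-level TD-like value update with a decaying arousal trace.

        Hypothetical sketch: arousal accumulates with the unsigned
        prediction error, decays by eta each trial, and damps the
        effective learning rate (an assumed reading of the model).
        """
        value, arousal = 0.0, 0.0
        for _ in range(n_trials):
            reward = 1.0 if random.random() < reward_prob else 0.0
            delta = reward - value                    # prediction error
            arousal = eta * arousal + abs(delta)      # surprise-driven arousal (assumed form)
            value += alpha / (1.0 + arousal) * delta  # arousal lowers the effective learning rate
        return value, arousal

Under a 100% schedule the prediction error, and hence arousal, falls toward zero once the value is learned; under a 25% schedule the prediction error keeps fluctuating, arousal stays elevated, and in this reading the residual arousal keeps the effective learning rate low during extinction, slowing it down.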



pone-0089494-g006: Model predictions of the effects of enhancing and reducing the magnitude of the prediction error. A: Effect of enhancing and reducing the magnitude of the prediction error on the learned reward value at the end of the acquisition phase; the number of acquisition trials was 30 and the decay factor of arousal η was 0.97. B: Simulated effect of drugs enhancing/reducing the prediction error on the rate of extinction. C: Number of trials until extinction in B, normalized to the extinction of the 100% reward probability condition under the same drug/dose. The normalization makes it easier to compare how drugs that enhance or reduce the prediction error change the shape of the extinction-probability curve.

Mentions: Our model predicted that drugs reducing the size of the dopaminergic prediction error signal would lead to an underestimation of reward value, whereas drugs enhancing it would cause an overestimation (Fig. 6A). Underestimation and overestimation of reward were further found to hasten and slow down extinction, respectively (Fig. 6B), because overestimation increases, and underestimation decreases, the amount of reward value that has to be extinguished. Since low reward probabilities yield only a small learned reward value to be extinguished, the extinction of low probability rewards is particularly sensitive to the magnitude of the learned value. Thus, the effect of enhancing or reducing the size of the prediction error on the rate of extinction was more prominent at low reward probabilities than at high ones.
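The prediction in Fig. 6 can be sketched by scaling the prediction error with a gain factor before it updates the value (gain > 1 mimicking drugs that enhance the dopaminergic signal, gain < 1 mimicking drugs that reduce it). The Python sketch below reuses the hypothetical update rule given after the abstract; the gain parameter, the extinction criterion, and all parameter values other than the 30 acquisition trials and eta = 0.97 quoted in the figure caption are illustrative assumptions, not values from the paper.

    import random

    def extinction_trials(reward_prob, gain=1.0, n_acq=30, alpha=0.1,
                          eta=0.97, criterion=0.05, max_trials=1000):
        """Count extinction trials after acquisition, with the prediction
        error scaled by `gain` (a hypothetical drug effect)."""
        value, arousal = 0.0, 0.0
        # Acquisition: reward delivered with probability reward_prob.
        for _ in range(n_acq):
            reward = 1.0 if random.random() < reward_prob else 0.0
            delta = gain * (reward - value)           # scaled prediction error
            arousal = eta * arousal + abs(delta)      # arousal trace (assumed form)
            value += alpha / (1.0 + arousal) * delta
        # Extinction: reward withheld until value falls below the criterion.
        trials = 0
        while value > criterion and trials < max_trials:
            delta = gain * (0.0 - value)
            arousal = eta * arousal + abs(delta)
            value += alpha / (1.0 + arousal) * delta
            trials += 1
        return trials

    # Panel-C-style normalization: trials to extinction relative to the
    # 100% reward probability condition with the same gain.
    for gain in (0.5, 1.0, 2.0):
        baseline = max(extinction_trials(1.0, gain), 1)  # guard against zero
        for p in (0.25, 0.5, 0.75, 1.0):
            print(gain, p, extinction_trials(p, gain) / baseline)

In this toy version the gain does not change the asymptotic value, only how much of it is learned within the 30 acquisition trials, which is enough to reproduce the qualitative pattern of Fig. 6A (over- and underestimation at the end of acquisition) and its downstream effect on extinction speed.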

