A network model of basal ganglia for understanding the roles of dopamine and serotonin in reward-punishment-risk based decision making.

Balasubramani PP, Chakravarthy VS, Ravindran B, Moustafa AA - Front Comput Neurosci (2015)

Bottom Line: Our prior model of the BG was an abstract model that did not incorporate anatomical and cellular-level data. Though the existence of MSNs that co-express D1R and D2R is reported by various experimental studies, prior computational models did not include them. Starting from the assumption that 5HT modulates all MSNs, our study predicts significant modulatory effects of 5HT on D2R and co-expressing D1R-D2R MSNs, which in turn explains the multifarious functions of 5HT in the BG.

View Article: PubMed Central - PubMed

Affiliation: Department of Biotechnology, Indian Institute of Technology Madras Chennai, India.

ABSTRACT
There is significant evidence that, in addition to reward-punishment based decision making, the Basal Ganglia (BG) contributes to risk-based decision making (Balasubramani et al., 2014). Despite this evidence, little is known about the computational principles and neural correlates of risk computation in this subcortical system. We have previously proposed a reinforcement learning (RL)-based model of the BG that simulates the interactions between dopamine (DA) and serotonin (5HT) in a diverse set of experimental studies including reward, punishment, and risk based decision making (Balasubramani et al., 2014). Starting with the classical idea that the activity of mesencephalic DA represents reward prediction error, the model posits that serotoninergic activity in the striatum controls risk-prediction error. Our prior model of the BG was an abstract model that did not incorporate anatomical and cellular-level data. In this work, we expand the earlier model into a detailed network model of the BG and demonstrate the joint contributions of DA-5HT in risk and reward-punishment sensitivity. At the core of the proposed network model is the following insight regarding cellular correlates of value and risk computation. Just as DA D1 receptor (D1R) expressing medium spiny neurons (MSNs) of the striatum were thought to be the neural substrates for value computation, we propose that DA D1R and D2R co-expressing MSNs are capable of computing risk. Though the existence of MSNs that co-express D1R and D2R is reported by various experimental studies, prior computational models did not include them. Ours is the first model that accounts for the computational possibilities of these co-expressing D1R-D2R MSNs, and describes how DA and 5HT mediate activity in these classes of neurons (D1R-, D2R-, D1R-D2R- MSNs).
Starting from the assumption that 5HT modulates all MSNs, our study predicts significant modulatory effects of 5HT on D2R and co-expressing D1R-D2R MSNs, which in turn explains the multifarious functions of 5HT in the BG. The experiments simulated in the present study relate 5HT to risk sensitivity and reward-punishment learning. Furthermore, our model is shown to capture reward-punishment and risk-based decision-making impairments in Parkinson's Disease (PD). The model predicts that optimizing 5HT levels along with DA medications might be essential for improving the patients' reward-punishment learning deficits.
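The pairing of value with a reward prediction error (DA) and risk with a risk prediction error (5HT) described in the abstract can be sketched in a minimal, hypothetical form. The update rules, variable names, and learning rates below are illustrative only; the paper's actual equations are not reproduced here.

```python
import random

# Minimal sketch (illustrative, not the paper's equations): a value
# estimate updated by a reward prediction error (the modeled role of
# mesencephalic DA) and a risk estimate updated by a risk prediction
# error (the modeled role of striatal 5HT).

def update(value, risk, reward, alpha=0.1, beta=0.1):
    delta = reward - value    # reward prediction error (DA-like signal)
    xi = delta ** 2 - risk    # risk prediction error (5HT-like signal)
    return value + alpha * delta, risk + beta * xi

rng = random.Random(0)
value, risk = 0.0, 0.0
for _ in range(2000):
    reward = rng.choice([1.0, -1.0])   # 50/50 reward vs. punishment
    value, risk = update(value, risk, reward)
# value tracks the mean outcome (near 0 here); risk tracks its
# variance (near 1 here)
```

With a symmetric +1/-1 outcome, the value estimate hovers around the mean (zero) while the risk estimate hovers around the outcome variance (one), which is why squared prediction error is a natural stand-in for risk.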

No MeSH data available.


Related in: MedlinePlus


Figure 5: The reward-punishment sensitivity obtained from the simulated (Sims) PD and control models, compared with the experiment (Expt) of Bodi et al. (2009). Error bars represent the standard error (SE) with N = 100 (N = number of simulation instances). The Sims match the Expt value distributions closely and are not significantly different.

Mentions: The simulation studies presented so far were performed under controlled conditions. This section simulates a study of reward-punishment learning that involved PD patients. Bodi et al. (2009) used a probabilistic classification task to assess reward-punishment learning under different medication conditions in PD patients. The medications used in the study were a mix of DA agonists (Pramipexole and Ropinirole) and L-Dopa. The task was as follows: one of four random fractal images (I1–I4) was presented. In response to each image, the subject had to press one of two buttons (A or B) on a keypad. Stimuli I1 and I2 were always associated with reward (+25 points), while I3 and I4 were associated with loss/punishment (−25 points). The probability of a reward or punishment outcome depended on the button (A or B) that the subject pressed in response to viewing an image. The reward/punishment probabilities associated with the two responses, for each of the four stimuli, are summarized in Table 7. There were 160 trials administered in 4 blocks. Experiments were performed on controls, never-medicated (PD-OFF), and recently-medicated (PD-ON) PD patients. The study (Bodi et al., 2009) showed that the never-medicated patients were more sensitive to punishment than the recently-medicated patients and controls. On the other hand, the recently-medicated patients outperformed the never-medicated patients and controls on reward learning (Figure 5). The optimal decision (as shown in Figure 5) is the selection of A for I1 and I3, and B for I2 and I4.
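The task structure above, paired with a simple delta-rule learner, can be sketched as follows. The outcome probabilities are placeholders (the actual contingencies are in Table 7 of the paper, not reproduced here); only the layout follows the text: four stimuli, two buttons, +25 points for I1/I2, −25 points for I3/I4, 160 trials.

```python
import random

# Illustrative sketch of the Bodi et al. (2009) task structure with a
# generic delta-rule learner. Probabilities are placeholders, not the
# values from Table 7; the optimal buttons follow Figure 5.

STIMULI = ["I1", "I2", "I3", "I4"]
POINTS = {"I1": +25, "I2": +25, "I3": -25, "I4": -25}
OPTIMAL = {"I1": "A", "I2": "B", "I3": "A", "I4": "B"}  # per Figure 5

def run_task(n_trials=160, alpha=0.1, eps=0.1, seed=1):
    rng = random.Random(seed)
    q = {(s, b): 0.0 for s in STIMULI for b in "AB"}
    for _ in range(n_trials):
        s = rng.choice(STIMULI)
        if rng.random() < eps:               # occasional exploration
            b = rng.choice("AB")
        else:                                # otherwise greedy
            b = max("AB", key=lambda x: q[(s, x)])
        # Placeholder contingency: the optimal button yields the good
        # outcome (reward delivered / punishment avoided) 80% of the time.
        good = rng.random() < (0.8 if b == OPTIMAL[s] else 0.2)
        if POINTS[s] > 0:
            outcome = POINTS[s] if good else 0
        else:
            outcome = 0 if good else POINTS[s]
        q[(s, b)] += alpha * (outcome - q[(s, b)])  # delta-rule update
    return q
```

Run over many more trials than the 160 used in the experiment, a learner of this kind settles on the buttons labeled optimal in Figure 5; the paper's network model additionally shapes these updates through DA- and 5HT-modulated MSN pathways.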

