Dual learning processes underlying human decision-making in reversal learning tasks: functional significance and evidence from the model fit to human behavior.

Bai Y, Katahira K, Ohira H - Front Psychol (2014)

Bottom Line: Humans are capable of correcting their actions based on actions performed in the past, and this ability enables them to adapt to a changing environment. The computational field of reinforcement learning (RL) has provided a powerful explanation for understanding such processes. In the present study, we used computer simulation of a probabilistic reversal learning task to address the functional significance of the hybrid model.

View Article: PubMed Central - PubMed

Affiliation: Department of Psychology, Graduate School of Environmental Studies, Nagoya University, Nagoya, Japan.

ABSTRACT
Humans are capable of correcting their actions based on actions performed in the past, and this ability enables them to adapt to a changing environment. The computational field of reinforcement learning (RL) has provided a powerful explanation for understanding such processes. Recently, the dual learning system, modeled as a hybrid model that incorporates a value update based on the reward-prediction error and learning rate modulation based on the surprise signal, has gained attention as a model for explaining various neural signals. However, the functional significance of the hybrid model has not been established. In the present study, we used computer simulation of a probabilistic reversal learning task to address the functional significance of the hybrid model. The hybrid model was found to perform better than the standard RL model over a wide range of parameter settings. These results suggest that the hybrid model is more robust against mistuning of parameters than the standard RL model when decision-makers must continue to learn stimulus-reward contingencies that can change abruptly. The parameter fitting results also indicated that the hybrid model fit better than the standard RL model for more than 50% of the participants, which suggests that the hybrid model has more explanatory power for the behavioral data than the standard RL model.
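To make the dual-process idea concrete, the following Python sketch illustrates one trial of a hybrid update of the kind described above: a Q-learning value update driven by the reward-prediction error, with the learning rate itself updated by the surprise signal (the absolute prediction error), in the style of a Pearce-Hall associability rule. The function name, the parameter names (alpha, eta, kappa), and the exact form of the learning-rate update are illustrative assumptions rather than the paper's exact equations; setting eta = 0 keeps the learning rate fixed and reduces the rule to standard Q-learning.

    import numpy as np

    def hybrid_q_update(Q, alpha, choice, reward, eta, kappa=1.0):
        """One trial of a hybrid (dual-process) update: value learning from the
        reward-prediction error, plus surprise-driven learning rate modulation.
        Illustrative sketch; names and exact form are assumptions, not the
        paper's equations. eta = 0 recovers standard Q-learning."""
        delta = reward - Q[choice]                      # reward-prediction error
        Q = Q.copy()
        Q[choice] += kappa * alpha * delta              # value update
        alpha = eta * abs(delta) + (1 - eta) * alpha    # surprise signal |delta| modulates the learning rate
        return Q, alpha

In this sketch the two processes are separable: the value update uses the signed prediction error, while the learning-rate update uses only its magnitude, so large surprises transiently speed up subsequent learning.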

No MeSH data available.



Figure 2: Proportion of parameter regions in which performance exceeds the median of standard Q-learning, for different task difficulties. The proportion is defined as the fraction of (α0, β) parameter combinations for which the rate of advantageous choices is larger than the median (across all combinations of α0 and β) of that of standard Q-learning (η = 0). Task difficulty is a measure of how hard it is to distinguish the good choice from the bad choice; three levels were used (reward/loss frequency ratios of 80:20, 70:30, and 60:40; easier tasks have a higher ratio).
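As a reading aid for the caption, the following sketch computes the plotted proportion from simulated performance grids: the fraction of (α0, β) combinations at which the hybrid model's advantageous-choice rate exceeds the median of the standard Q-learning (η = 0) grid. The array layout (one axis per parameter) is an assumption made for illustration.

    import numpy as np

    def proportion_above_standard(perf_hybrid, perf_standard):
        """perf_hybrid and perf_standard: 2-D arrays (alpha0 grid x beta grid) of
        advantageous-choice rates from simulation. Returns the fraction of grid
        points at which the hybrid model exceeds the median of the standard
        Q-learning grid, mirroring the caption's definition."""
        threshold = np.median(perf_standard)             # median over all parameter combinations
        return float(np.mean(perf_hybrid > threshold))   # proportion of the grid above that median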

Mentions: We also investigated the performance of the models for different degrees of difficulty. Figure 2 shows the proportion of parameter regions whose performance exceeds the median (across the entire parameter set) of standard Q-learning, for each task difficulty. The results indicated that the hybrid model did not necessarily outperform the standard Q-learning model at the low degree of difficulty (80:20); however, the hybrid model outperformed the standard Q-learning model at the high degree of difficulty (60:40).
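A minimal simulation of the kind of probabilistic reversal learning task referred to here might look as follows in Python. The reward magnitudes, trial count, and reversal schedule are illustrative assumptions; p_reward = 0.8, 0.7, or 0.6 stands in for the 80:20, 70:30, and 60:40 conditions, and η = 0 gives the standard Q-learning baseline.

    import numpy as np

    rng = np.random.default_rng(0)

    def softmax_choice(Q, beta):
        """Choose an option with softmax probabilities over action values."""
        p = np.exp(beta * Q - np.max(beta * Q))
        p /= p.sum()
        return rng.choice(len(Q), p=p)

    def run_reversal_task(alpha0, beta, eta, p_reward=0.8, n_trials=200, reversal_every=50):
        """Simulate a two-option probabilistic reversal learning task and return
        the rate of advantageous (good-option) choices. The good option pays +1
        (otherwise -1) with probability p_reward, and the contingency reverses
        every `reversal_every` trials; these design details are assumptions."""
        Q = np.zeros(2)
        alpha = alpha0
        good = 0
        n_advantageous = 0
        for t in range(n_trials):
            if t > 0 and t % reversal_every == 0:
                good = 1 - good                           # abrupt change in contingencies
            choice = softmax_choice(Q, beta)
            n_advantageous += int(choice == good)
            p = p_reward if choice == good else 1 - p_reward
            reward = 1.0 if rng.random() < p else -1.0
            delta = reward - Q[choice]                    # reward-prediction error
            Q[choice] += alpha * delta                    # value update
            alpha = eta * abs(delta) + (1 - eta) * alpha  # eta = 0 keeps alpha fixed (standard Q-learning)
        return n_advantageous / n_trials

Sweeping alpha0 and beta over a grid with this kind of routine, once with eta = 0 and once with eta > 0, yields the performance grids from which a proportion like that in Figure 2 can be computed.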

