Modeling the value of strategic actions in the superior colliculus.

Thevarajah D, Webb R, Ferrall C, Dorris MC - Front Behav Neurosci (2010)

Bottom Line: Further, SC activity predicted upcoming choices during the strategic task and upcoming reaction times during the instructed task. Finally, we found that neuronal activity in both tasks correlated with an established learning model, the Experience Weighted Attraction model of action valuation (Camerer and Ho, 1999). Collectively, our results provide evidence that action values hypothesized by learning models are represented in the motor planning regions of the brain in a manner that could be used to select strategic actions.

View Article: PubMed Central - PubMed

Affiliation: Department of Physiology, Centre for Neuroscience Studies and Canadian Institutes of Health Research Group in Sensory-Motor Systems, Queen's University Kingston, ON, Canada.

ABSTRACT
In learning models of strategic game play, an agent constructs a valuation (action value) over possible future choices as a function of past actions and rewards. Choices are then stochastic functions of these action values. Our goal is to uncover a neural signal that correlates with the action value posited by behavioral learning models. We measured activity from neurons in the superior colliculus (SC), a midbrain region involved in planning saccadic eye movements, while monkeys performed two saccade tasks. In the strategic task, monkeys competed against a computer in a saccade version of the mixed-strategy game "matching-pennies". In the instructed task, saccades were elicited through explicit instruction rather than free choices. In both tasks, neuronal activity and behavior were shaped by past actions and rewards, with more recent events exerting a larger influence. Further, SC activity predicted upcoming choices during the strategic task and upcoming reaction times during the instructed task. Finally, we found that neuronal activity in both tasks correlated with an established learning model, the Experience Weighted Attraction model of action valuation (Camerer and Ho, 1999). Collectively, our results provide evidence that action values hypothesized by learning models are represented in the motor planning regions of the brain in a manner that could be used to select strategic actions.
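The Experience Weighted Attraction model referenced in the abstract updates an "attraction" (action value) for each possible action from past payoffs, and converts attractions into choice probabilities through a logit rule. A minimal sketch of the standard Camerer and Ho (1999) update is below; the function names and parameter defaults (phi, delta, rho, lam) are illustrative assumptions, not values fitted by the authors.

```python
import math

def ewa_update(A, N, chosen, payoffs, phi=0.9, delta=0.5, rho=0.9):
    """One Experience Weighted Attraction update (Camerer and Ho, 1999).

    A       -- list of attractions (action values), one per action
    N       -- scalar experience weight
    chosen  -- index of the action actually taken this trial
    payoffs -- payoff each action would have earned this trial
    """
    N_new = rho * N + 1.0
    A_new = []
    for j, (a, pi) in enumerate(zip(A, payoffs)):
        # Chosen actions are reinforced by their full payoff;
        # foregone actions by a delta-weighted counterfactual payoff.
        weight = delta + (1.0 - delta) * (1.0 if j == chosen else 0.0)
        A_new.append((phi * N * a + weight * pi) / N_new)
    return A_new, N_new

def choice_probs(A, lam=3.0):
    """Logit (softmax) choice rule over attractions."""
    z = [math.exp(lam * a) for a in A]
    s = sum(z)
    return [x / s for x in z]
```

With two actions (in/out), a rewarded choice raises that action's attraction relative to the alternative, so the logit rule makes the rewarded action more likely on the next trial, with the influence of older outcomes decaying at rate phi.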

No MeSH data available.


Figure 1: Strategic Task. (A) Each panel represents successive visual displays presented to the monkey. Red circles represent the central fixation point and the choice targets. In the third panel, arrows indicate the monkey's possible saccadic choices. One of the saccade targets was always placed in the center of the neuron's response field (i.e., the in target), as indicated by the dashed circle. The out target was placed in the opposite hemifield. The red square indicates the choice of the computer opponent. (B) Time-line of the strategic task. The grey shaded region indicates the 50 ms epoch during which SCi preparatory activity was sampled for neuronal analyses. The stimulus setup and time-line were identical for the instructed task (not shown), except that only one target was presented per trial and the red square surrounded the target only on rewarded trials.
© Copyright Policy - open-access

Mentions: Monkeys competed in a saccadic version of the repeated mixed-strategy game matching-pennies against an adaptive computer opponent (Figure 1). On each trial, both the subject and the computer reveal a strategy: in or out. The monkey, pre-designated the “matcher”, wins if the strategies match, and the computer, pre-designated the “non-matcher”, wins if they differ. The unique Minimax/Nash equilibrium in mixed strategies is for each player to play in and out with equal probability (von Neumann and Morgenstern, 1947; Nash, 1951), though our analysis does not require that equilibrium play is achieved. Because our experimental setup precludes imposing an outright loss on the monkey, we replaced losses with the withholding of reward; the equilibrium remains unchanged. The payoff matrix is given in Figure 2 and has been studied experimentally in humans (Mookherjee and Sopher, 1994) and monkeys (Lee et al., 2004; Thevarajah et al., 2009).
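The reward structure described above can be sketched in a few lines: the matcher earns a reward of 1 when the two strategies coincide and 0 otherwise (loss replaced by withheld reward), and at the mixed-strategy equilibrium, where both players randomize 50/50, the matcher's expected reward rate is 0.5. The function names below are illustrative, not from the authors' task code.

```python
import random

# Payoff to the matcher (the monkey): 1 if strategies match, 0 otherwise.
# Losses are replaced by withheld reward, so no payoff is negative.
PAYOFF_MATCHER = {
    ("in", "in"): 1, ("out", "out"): 1,
    ("in", "out"): 0, ("out", "in"): 0,
}

def play_trial(monkey_choice, computer_choice):
    """Return the monkey's reward for one matching-pennies trial."""
    return PAYOFF_MATCHER[(monkey_choice, computer_choice)]

def equilibrium_value(trials=100000, seed=0):
    """Simulate both players at the mixed-strategy equilibrium
    (each plays in/out with probability 0.5); the matcher's
    reward rate converges to 0.5."""
    rng = random.Random(seed)
    total = sum(
        play_trial(rng.choice(["in", "out"]), rng.choice(["in", "out"]))
        for _ in range(trials)
    )
    return total / trials
```

Replacing losses with zero reward shifts every payoff by a constant for the matcher, which is why the equilibrium strategies are unchanged: best responses depend only on payoff differences.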

