A neuronal network model for context-dependence of pitch change perception.

Huang C, Englitz B, Shamma S, Rinzel J - Front Comput Neurosci (2015)

Bottom Line: We developed a recurrent, firing-rate network model, which detects frequency-change-direction of successively played stimuli and successfully accounts for the context-dependent perception demonstrated in behavioral experiments. The model's network architecture and slow facilitating inhibition emerge as predictions of neuronal mechanisms for these perceptual dynamics. Since the model structure does not depend on the specific stimuli, we show that it generalizes to other contextual effects and stimulus types.


Affiliation: Courant Institute of Mathematical Sciences, New York University, New York, NY, USA.

ABSTRACT
Many natural stimuli have perceptual ambiguities that can be cognitively resolved by the surrounding context. In audition, preceding context can bias the perception of speech and non-speech stimuli. Here, we develop a neuronal network model that can account for how context affects the perception of pitch change between a pair of successive complex tones. We focus especially on an ambiguous comparison: listeners experience opposite percepts (either ascending or descending) for an ambiguous tone pair depending on the spectral location of preceding context tones. We developed a recurrent, firing-rate network model, which detects frequency-change-direction of successively played stimuli and successfully accounts for the context-dependent perception demonstrated in behavioral experiments. The model consists of two tonotopically organized, excitatory populations, Eup and Edown, that respond preferentially to ascending or descending stimuli in pitch, respectively. These preferences are generated by an inhibitory population that provides inhibition asymmetric in frequency to the two populations; context dependence arises from slow facilitation of inhibition. We show that contextual influence depends on the spectral distribution of preceding tones and the tuning width of inhibitory neurons. Further, we demonstrate, using phase-space analysis, how the facilitated inhibition from previous stimuli and the waning inhibition from the just-preceding tone shape the competition between the Eup and Edown populations. In sum, our model accounts for contextual influences on the pitch change perception of an ambiguous tone pair by introducing a novel decoding strategy based on direction-selective units. The model's network architecture and slow facilitating inhibition emerge as predictions of neuronal mechanisms for these perceptual dynamics. Since the model structure does not depend on the specific stimuli, we show that it generalizes to other contextual effects and stimulus types.



Figure 3: Neuronal model responses for two successive Shepard tones mimic human perception. (A,B) The spatiotemporal activity of the excitatory neurons (Eup in A, Edown in B) in response to a Shepard tone pair (T1 = 6 st, T2 = 9 st) is represented by their firing rates, with the vertical axis corresponding to the PC of a unit's CF (see text). Each Shepard tone has a duration of 100 ms, with a 50 ms pause between tones. Firing rate is normalized between 0 and 1. (C,D) The synaptic input received by each neuron is shown for the Eup (C) and the Edown (D) populations. Although the early excitatory inputs are symmetric, the later inhibitory inputs are asymmetric, reflecting the asymmetric footprint from the inhibitory to the excitatory units. (E) The response difference between Eup and Edown varies with the PC interval between T1 and T2, consistent with human perception (Shepard, 1964; Chambers and Pressnitzer, 2014). The mean relative population activity differences D (Equation 6) during T2 are plotted as a function of the difference in pitch class between T2 and T1 (T2-T1). The response difference decreases with the pause between the tones [50 ms (blue), 100 ms (green), 200 ms (red)], decreasing more steeply for static inhibitory synapses (solid) than for facilitating synapses (dashed).

Mentions: We first consider the model's response to two Shepard tones (T1 and T2) without a pre-test sequence (Figure 3). Human listeners perceive relative steps of 1–5 semitones (st) as ascending, steps of 7–12 st (or equivalently -1 to -5 st) as descending, and a step of 6 st (tritone) as ambiguous (Shepard, 1964; Deutsch, 1986; Repp, 1997). Since the model is homogeneous along the frequency axis, we assume T1 = 6 st. At the onset of T1, both Eup and Edown have high firing rates (Figures 3A,B) with positive recurrent excitatory inputs centered around the network site for the PC of T1. This activity diminishes with time and its profile becomes asymmetric as inhibition develops (on a somewhat slower time scale) and suppresses lower frequency units in Eup and higher frequency units in Edown (Figures 3C,D). The post-stimulus (residual) inhibitory current decays with a time constant of 30 ms after the offset of T1. Hence, at the onset of T2 (PC = 9 st), Edown at the PC of T2 is inhibited while Eup is not, which gives Eup an advantage in competing with Edown for the model's prediction of the pitch change percept. The positive difference (D) in response to T2 indicates an ascending percept, consistent with human perception for such a 3 st step change (Shepard, 1964; Chambers and Pressnitzer, 2014).
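
The two quantitative ingredients of this passage can be illustrated with a short Python sketch. It is not the authors' model: it only encodes the human-percept rule on the 12 st pitch-class circle and, as a simplifying assumption, treats the residual inhibitory current as a pure exponential decay with the 30 ms time constant quoted above, ignoring the network dynamics and synaptic facilitation. The function names `classify_step` and `residual_inhibition` are hypothetical, and treating a step of 0 st (mod 12) as "same" is likewise an assumption made here so that every input has a label.

```python
import math

# Decay time constant of the residual inhibitory current (value quoted in the text).
TAU_INH_MS = 30.0


def classify_step(t1_st: int, t2_st: int) -> str:
    """Expected human percept for a Shepard tone pair, given pitch classes in semitones."""
    # Shepard tones are octave-invariant, so only the step modulo 12 matters.
    step = (t2_st - t1_st) % 12
    if step == 0:
        return "same"  # assumption of this sketch: identical pitch class, no step
    if step < 6:
        return "ascending"  # steps of 1-5 st
    if step == 6:
        return "ambiguous"  # tritone
    return "descending"  # steps of 7-11 st, i.e. -5 to -1 st


def residual_inhibition(pause_ms: float, tau_ms: float = TAU_INH_MS) -> float:
    """Fraction of the inhibitory current left after a silent pause.

    Assumes pure exponential decay from the offset of T1; the full model
    also includes recurrent dynamics and facilitation, which are omitted here.
    """
    return math.exp(-pause_ms / tau_ms)


if __name__ == "__main__":
    print(classify_step(6, 9))  # the T1 = 6 st, T2 = 9 st pair of Figure 3
    for pause in (50.0, 100.0, 200.0):  # pause durations used in Figure 3E
        print(f"{pause:5.0f} ms pause: {residual_inhibition(pause):.4f} of inhibition left")
```

Under this simplification, roughly 19% of the inhibition survives a 50 ms pause, about 3.6% survives 100 ms, and about 0.1% survives 200 ms. That is in line with the shrinking response difference for longer pauses in Figure 3E, although the exact values in the figure depend on the full model.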

