Speech Coding in the Brain: Representation of Vowel Formants by Midbrain Neurons Tuned to Sound Fluctuations.

Carney LH, Li T, McDonough JM - eNeuro (2015)

Bottom Line: Additionally, a successful neural code must function for speech in background noise at levels that are tolerated by listeners. The model presented here resolves these problems, and incorporates several key response properties of the nonlinear auditory periphery, including saturation, synchrony capture, and phase locking to both fine structure and envelope temporal features. The hypothesized code is supported by electrophysiological recordings from the inferior colliculus of awake rabbits.


Affiliation: Departments of Biomedical Engineering and Neurobiology & Anatomy, University of Rochester, Rochester, New York 14642.

ABSTRACT
Current models for neural coding of vowels are typically based on linear descriptions of the auditory periphery, and fail at high sound levels and in background noise. These models rely on either auditory nerve discharge rates or phase locking to temporal fine structure. However, both discharge rates and phase locking saturate at moderate to high sound levels, and phase locking is degraded in the CNS at middle to high frequencies. The fact that speech intelligibility is robust over a wide range of sound levels is problematic for codes that deteriorate as the sound level increases. Additionally, a successful neural code must function for speech in background noise at levels that are tolerated by listeners. The model presented here resolves these problems, and incorporates several key response properties of the nonlinear auditory periphery, including saturation, synchrony capture, and phase locking to both fine structure and envelope temporal features. The model also includes the properties of the auditory midbrain, where discharge rates are tuned to amplitude fluctuation rates. The nonlinear peripheral response features create contrasts in the amplitudes of low-frequency neural rate fluctuations across the population. These patterns of fluctuations result in a response profile in the midbrain that encodes vowel formants over a wide range of levels and in background noise. The hypothesized code is supported by electrophysiological recordings from the inferior colliculus of awake rabbits. This model provides information for understanding the structure of cross-linguistic vowel spaces, and suggests strategies for automatic formant detection and speech enhancement for listeners with hearing loss.
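
The contrast mechanism described in the abstract can be illustrated numerically. The toy Python/NumPy sketch below is an illustration of the principle only, not the authors' model; the harmonic numbers, fundamental frequency, and sample rate are arbitrary choices. A channel whose response is captured by a single harmonic near a formant has a nearly flat envelope, whereas a channel between formants responds to two comparable harmonics that beat at the fundamental (F0), producing deep low-frequency fluctuations.

    import numpy as np
    from scipy.signal import hilbert

    fs = 20000                       # sample rate (Hz); arbitrary
    f0 = 100                         # voice fundamental (Hz); arbitrary
    t = np.arange(0, 0.5, 1 / fs)

    # Channel near a formant: one harmonic dominates ("synchrony capture"),
    # so the response envelope is nearly flat.
    captured = np.cos(2 * np.pi * 10 * f0 * t)

    # Channel between formants: two comparable harmonics beat at f0,
    # producing deep low-frequency envelope fluctuations.
    beating = 0.5 * (np.cos(2 * np.pi * 10 * f0 * t) +
                     np.cos(2 * np.pi * 11 * f0 * t))

    def f0_depth(x):
        """Relative amplitude of the envelope fluctuation at f0."""
        env = np.abs(hilbert(x))                   # Hilbert envelope
        spec = np.abs(np.fft.rfft(env - env.mean()))
        k = round(f0 * len(x) / fs)                # FFT bin at f0
        return spec[k] / (env.mean() * len(x) / 2)

    print(f"near a formant:   f0 fluctuation depth ~ {f0_depth(captured):.2f}")
    print(f"between formants: f0 fluctuation depth ~ {f0_depth(beating):.2f}")
    # ~0 near the formant vs. ~0.7 between formants: the across-channel
    # contrast that a fluctuation-tuned midbrain neuron could read out.

On this account, it is the across-channel difference in F0-rate fluctuation depth, rather than discharge rate or fine-structure phase locking, that the hypothesized midbrain code reads out.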


Figure 2: Models for modulation tuning in IC cells. A, Time waveform of the vowel /æ/ (had). B, The SFIE model (Nelson and Carney, 2004) for midbrain cells with bandpass (BP) modulation transfer functions (MTFs; blue cell). An extension of the SFIE model is illustrated by the red cell, which is excited by ascending inputs and inhibited by the bandpass SFIE cell. This model cell simulates the relatively common low-pass or band-reject MTFs (see Fig. 3), and is referred to as the LPBR model. C, Model auditory nerve (AN) population response (Zilany et al., 2009, 2014). D, Population response of the BP IC model; BP neurons with best frequencies (BFs) near F1 and F2 (arrows at right) have decreased responses (Fig. 1F). E, The LPBR model has peaks in the population rate profile near F1 and F2 (Fig. 1G).

Mentions: A phenomenological model of AN responses that incorporates several key nonlinearities, including rate saturation, adaptation, and synchrony capture (Zilany et al., 2009, 2014), provided the inputs to the models for two types of midbrain neurons (Fig. 2A). IC cells with BP MTFs were simulated using the same-frequency inhibition-excitation (SFIE) model (Nelson and Carney, 2004), which explains tuning to amplitude modulation frequency by the interaction of excitatory and inhibitory inputs with different dynamics. IC cells with low-pass, band-reject (LPBR), or high-pass MTFs were simulated using an extension of the SFIE model; the LPBR model received excitatory input from the brainstem and inhibitory input from bandpass cells (Fig. 2B). Time-varying input rate functions to each model cell were convolved with α functions representing excitatory or inhibitory postsynaptic responses. The decay time constants of the α functions and the delays associated with synaptic responses were varied to produce MTFs tuned to different amplitude modulation frequencies (Nelson and Carney, 2004). A minimal sketch of this computation appears below.
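
As a concrete sketch of that processing chain, the Python code below implements the generic SFIE computation just described: input rate functions are convolved with unit-area α functions, and slower, delayed inhibition is subtracted from faster excitation, followed by half-wave rectification. This is a minimal re-implementation for illustration, not the authors' published code; the time constants, delays, and inhibitory strengths are placeholder values of the order discussed by Nelson and Carney (2004), and the AN input rate would in practice come from the Zilany et al. (2009, 2014) phenomenological model.

    import numpy as np

    def alpha_kernel(tau, fs, dur=0.05):
        """Alpha function a(t) = (t / tau**2) * exp(-t / tau), unit area."""
        t = np.arange(0, dur, 1 / fs)
        k = (t / tau**2) * np.exp(-t / tau)
        return k * fs / k.sum()          # sum(k)/fs == 1, so rates keep units

    def psp(rate, tau, delay, fs):
        """Convolve a rate (sp/s) with an alpha PSP; apply a synaptic delay."""
        out = np.convolve(rate, alpha_kernel(tau, fs))[:len(rate)] / fs
        d = int(round(delay * fs))
        return np.concatenate([np.zeros(d), out[:len(rate) - d]])

    def sfie_stage(rate, tau_exc, tau_inh, d_inh, s_inh, fs):
        """One SFIE stage: fast excitation minus slower, delayed inhibition."""
        exc = psp(rate, tau_exc, 0.0, fs)
        inh = psp(rate, tau_inh, d_inh, fs)
        return np.maximum(exc - s_inh * inh, 0.0)   # half-wave rectification

    def bp_ic(an_rate, fs):
        """Bandpass IC cell: AN -> brainstem stage -> IC stage.
        Placeholder parameters; varying tau/delay retunes the MTF peak."""
        cn = sfie_stage(an_rate, 0.5e-3, 2e-3, 1e-3, 0.6, fs)
        return sfie_stage(cn, 1e-3, 2e-3, 1e-3, 0.9, fs)

    def lpbr_ic(an_rate, fs, s_bp=1.0):
        """LPBR extension: ascending excitation inhibited by the BP cell."""
        exc = psp(an_rate, 0.5e-3, 0.0, fs)
        return np.maximum(exc - s_bp * bp_ic(an_rate, fs), 0.0)

Sweeping tau_inh and d_inh retunes the best modulation frequency of the BP cell, and running an AN population through bp_ic and lpbr_ic would, on the authors' account, yield the complementary rate profiles of Figure 2, D and E: dips at F1 and F2 for the BP population and peaks for the LPBR population.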

