Neuronal oscillations and speech perception: critical-band temporal envelopes are the essence.

Ghitza O, Giraud AL, Poeppel D - Front Hum Neurosci (2013)

Bottom Line: A recent opinion article (Neural oscillations in speech: do not be enslaved by the envelope; Obleser et al., 2012) criticizes what its authors see as an over-emphasis of the role of temporal speech envelope information, and an over-emphasis of entrainment to the input rhythm while neglecting the role of top-down processes in modulating the entrainment of neuronal oscillations. Here we respond to these arguments, referring to the phenomenological model of Ghitza (2011), taken as a representative of the criticized approach.


Affiliation: Biomedical Engineering, Boston University, Boston, MA, USA.

ABSTRACT
A recent opinion article (Neural oscillations in speech: do not be enslaved by the envelope. Obleser et al., 2012) questions the validity of a class of speech perception models inspired by the possible role of neuronal oscillations in decoding speech (e.g., Ghitza, 2011; Giraud and Poeppel, 2012). The authors criticize, in particular, what they see as an over-emphasis of the role of temporal speech envelope information, and an over-emphasis of entrainment to the input rhythm while neglecting the role of top-down processes in modulating the entrainment of neuronal oscillations. Here we respond to these arguments, referring to the phenomenological model of Ghitza (2011), taken as a representative of the criticized approach.




Figure 1: Top panel: a 1 s long FM stimulus with a 1 kHz carrier, modulated by a 5 Hz sinusoid. Bottom panels: simulated inner hair cell (IHC) responses, low-pass filtered to 50 Hz, at five successive center frequencies (CFs) surrounding the carrier location. The cochlear filters are modeled as linear gammatone filters and the IHC as a half-wave rectifier followed by a low-pass filter, representing the reduction of synchrony with CF. Note the re-generation of the modulating signal at the cochlear output.

Mentions: Consider the argument raised by Obleser et al., embodied in their Figure 1 (which is the catalyst for their title: “… don't be enslaved by the envelope”). Why, they ask, are peaks observed at the frequency of the modulating signal in both the EEG phase coherence and the EEG power, even though the envelope of the FM stimulus (their Figure 1A) is flat? A theorem from communications theory provides an analytic answer to this question. The theorem states that if φ(t) is a band-limited signal, and if the FM signal A·cos[φ(t)] is the input to a band-pass filter with a bandwidth on the order of the bandwidth of φ(t), then the filter's output has an envelope that is related to φ(t) (e.g., Rice, 1973). A corollary to this theorem [noticed by Ghitza (2001)] is that if the band-pass filter represents a cochlear filter, then the envelope information at the cochlear output (i.e., the information available to the brain) is some non-flat, non-linear function of φ(t)! (This corollary was later validated psychoacoustically, e.g., Gilbert and Lorenzi, 2006.) Obleser et al. used three FM stimuli, with 500 Hz wide complex carrier signals centered on one of three frequencies (800, 1000, and 1200 Hz), and with a modulating signal of 3 Hz. Since critical bands at these frequencies are 100–150 Hz wide, such signals, when presented to the listener's ear, will result in critical-band outputs with non-flat temporal envelopes that are related to the 3 Hz modulation signal. Figures 1 and 2 illustrate this phenomenon using an FM stimulus with a 1 kHz carrier modulated by a 5 Hz sinusoid, and a stimulus provided by Obleser et al. (2012, Figure 1A), respectively.
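The FM-to-envelope conversion described above can be illustrated with a minimal Python sketch. It is not the authors' simulation: a second-order band-pass biquad stands in for the gammatone cochlear filter, and the IHC stage is approximated by a half-wave rectifier followed by two one-pole low-pass stages at 50 Hz. The sampling rate (16 kHz), the frequency deviation of the FM tone (100 Hz), the filter center frequency (1100 Hz) and its bandwidth (130 Hz) are all assumptions chosen for illustration; only the 1 kHz carrier, the 5 Hz modulator, the 1 s duration and the 50 Hz low-pass cutoff come from the text.

```python
import math

def fm_tone(fs, dur, fc, fm, dev):
    """Constant-amplitude FM tone; instantaneous frequency is fc + dev*cos(2*pi*fm*t)."""
    beta = dev / fm  # modulation index
    return [math.cos(2 * math.pi * fc * i / fs
                     + beta * math.sin(2 * math.pi * fm * i / fs))
            for i in range(int(fs * dur))]

def bandpass(x, fs, f0, bw):
    """Second-order band-pass biquad with unit gain at f0 (stand-in for a cochlear filter)."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) * bw / (2 * f0)  # sin(w0) / (2*Q), with Q = f0 / bw
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for x0 in x:
        y0 = (alpha * (x0 - x2) - a1 * y1 - a2 * y2) / a0
        y.append(y0)
        x1, x2, y1, y2 = x0, x1, y0, y1
    return y

def ihc_envelope(x, fs, cutoff=50.0):
    """Half-wave rectify, then smooth with two cascaded one-pole low-pass stages."""
    a = math.exp(-2 * math.pi * cutoff / fs)
    s1 = s2 = 0.0
    env = []
    for v in x:
        s1 = (1 - a) * max(v, 0.0) + a * s1
        s2 = (1 - a) * s1 + a * s2
        env.append(s2)
    return env

def modulation_depth(env):
    """(max - min) / (max + min) of the envelope; 0 means perfectly flat."""
    hi, lo = max(env), min(env)
    return (hi - lo) / (hi + lo)

def dominant_frequency(env, fs, max_bin=16):
    """Frequency (Hz) of the largest DFT bin among bins 1..max_bin of the mean-removed envelope."""
    n = len(env)
    mean = sum(env) / n
    best_k, best_mag = 0, -1.0
    for k in range(1, max_bin + 1):
        re = sum((v - mean) * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(env))
        im = sum((v - mean) * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(env))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * fs / n

if __name__ == "__main__":
    FS = 16000                   # sampling rate in Hz (assumed)
    FC, FM, DUR = 1000.0, 5.0, 1.0  # carrier, modulator, duration (from the text)
    DEV = 100.0                  # frequency deviation in Hz (assumed)
    CF, BW = 1100.0, 130.0       # filter center frequency and bandwidth in Hz (assumed)
    skip = FS // 5               # drop the first 0.2 s of filter start-up

    stim = fm_tone(FS, DUR, FC, FM, DEV)
    flat = ihc_envelope(stim, FS)[skip:]                        # no band-pass stage
    band = ihc_envelope(bandpass(stim, FS, CF, BW), FS)[skip:]  # after the band-pass stage

    print("modulation depth, raw stimulus:   ", modulation_depth(flat))
    print("modulation depth, filter output:  ", modulation_depth(band))
    print("dominant envelope frequency (Hz): ", dominant_frequency(band, FS))
```

Under these assumptions the raw stimulus should show an almost flat envelope, while the band-pass output should show a clearly modulated envelope whose strongest component sits at the 5 Hz modulation rate. The filter is deliberately placed off the carrier frequency: with a filter centered exactly on the carrier, the instantaneous frequency would cross the filter peak twice per modulation cycle, and the envelope would be dominated by twice the modulation rate.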

