Formant-frequency variation and informational masking of speech by extraneous formants: evidence against dynamic and speech-specific acoustical constraints.

Roberts B, Summers RJ, Bailey PJ - J Exp Psychol Hum Percept Perform (2014)

Bottom Line: The frequency contours of F1 - F3 were each scaled to 50% of their natural depth, with little effect on intelligibility. Adding a competitor typically reduced intelligibility; this reduction depended on the depth of F2C variation, being greatest for 100%-depth, intermediate for 50%-depth, and least for 0%-depth (constant) F2Cs. Furthermore, triangle-wave competitors were as effective as their more speech-like counterparts, suggesting that the selection of formants from the ensemble also does not depend on speech-specific constraints.


Affiliation: Psychology, School of Life and Health Sciences.

ABSTRACT
How speech is separated perceptually from other speech remains poorly understood. Recent research indicates that the ability of an extraneous formant to impair intelligibility depends on the variation of its frequency contour. This study explored the effects of manipulating the depth and pattern of that variation. Three formants (F1+F2+F3) constituting synthetic analogues of natural sentences were distributed across the 2 ears, together with a competitor for F2 (F2C) that listeners must reject to optimize recognition (left = F1+F2C; right = F2+F3). The frequency contours of F1 - F3 were each scaled to 50% of their natural depth, with little effect on intelligibility. Competitors were created either by inverting the frequency contour of F2 about its geometric mean (a plausibly speech-like pattern) or using a regular and arbitrary frequency contour (triangle wave, not plausibly speech-like) matched to the average rate and depth of variation for the inverted F2C. Adding a competitor typically reduced intelligibility; this reduction depended on the depth of F2C variation, being greatest for 100%-depth, intermediate for 50%-depth, and least for 0%-depth (constant) F2Cs. This suggests that competitor impact depends on overall depth of frequency variation, not depth relative to that for the target formants. The absence of tuning (i.e., no minimum in intelligibility for the 50% case) suggests that the ability to reject an extraneous formant does not depend on similarity in the depth of formant-frequency variation. Furthermore, triangle-wave competitors were as effective as their more speech-like counterparts, suggesting that the selection of formants from the ensemble also does not depend on speech-specific constraints.
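The triangle-wave competitor described above can be illustrated with a short sketch. The abstract states only that the triangle wave was matched to the average rate and depth of variation of the inverted F2C; everything else here is an assumption made for illustration, including the function name `triangle_contour`, the use of a log-frequency (octave) scale, the starting phase, and the parameter values. It is not the authors' synthesis procedure.

```python
import math


def triangle_contour(center_hz, depth_octaves, rate_hz, duration_s, step_s=0.01):
    """Return a triangle-wave frequency contour (hypothetical helper).

    center_hz      geometric-mean frequency the contour oscillates around
    depth_octaves  peak-to-trough excursion, in octaves (assumed log scale)
    rate_hz        number of full triangle cycles per second
    duration_s     length of the contour in seconds
    step_s         time between successive contour samples
    """
    n_samples = int(round(duration_s / step_s)) + 1
    contour = []
    for i in range(n_samples):
        t = i * step_s
        phase = (t * rate_hz) % 1.0           # position within the cycle, 0..1
        tri = 4.0 * abs(phase - 0.5) - 1.0    # triangle in [-1, 1], peak at phase 0
        contour.append(center_hz * 2.0 ** (0.5 * depth_octaves * tri))
    return contour


# Example: one cycle per second, one octave deep, centred on 1000 Hz.
example = triangle_contour(1000.0, 1.0, 1.0, 1.0, step_s=0.25)
```

Setting `depth_octaves` to 0 yields a constant contour at `center_hz`, which corresponds to the 0%-depth case in the abstract.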


Figure 3. Stimuli for Experiment 2: Schematic illustrating the dichotic configuration used. The left ear receives F1 of the example sentence “The cat ran along”; the right ear receives F2 and F3. A scale factor of 50% was applied to the frequency contour of each target formant, relative to the geometric mean frequency of that formant track. The second-formant competitor (F2C), whose frequency contour was derived from that for F2 by inversion about the geometric mean, is presented in the same ear as F1. The depth of frequency variation in F2C was controlled relative to that for the unscaled target F2. Illustrated here are F2C frequency contours for the cases where the scale factor was 100% (dashed line), 50% (i.e., matching the scale factor for the target formants; solid line), and 0% (dotted line); for the 0% case, the frequency was constant at the geometric mean frequency. The full set of scale factors used ranged from 100% to 0%, in steps of 25%. Note that the amplitude contours (not shown here) of the target formants were always presented without adjustment. The amplitude contour of each F2C was set to a constant value corresponding to the RMS power of the amplitude contour for the target F2.
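The contour manipulations in this caption (depth scaling about the geometric mean, inversion about the geometric mean, and a constant competitor amplitude) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that scaling and inversion operate on a log-frequency scale, which is suggested by the use of the geometric mean but is not stated in the caption, and all function names and example values are hypothetical.

```python
import math


def geometric_mean(freqs_hz):
    """Geometric mean of a formant-frequency track (values in Hz)."""
    return math.exp(sum(math.log(f) for f in freqs_hz) / len(freqs_hz))


def scale_contour(freqs_hz, scale):
    """Scale the depth of a contour about its geometric mean.

    Assumes a log-frequency scale: scale=1.0 leaves the contour unchanged,
    0.5 halves its depth, and 0.0 gives a constant at the geometric mean.
    """
    gm = geometric_mean(freqs_hz)
    return [gm * (f / gm) ** scale for f in freqs_hz]


def invert_contour(freqs_hz):
    """Invert a contour about its geometric mean (a scale factor of -1)."""
    return scale_contour(freqs_hz, -1.0)


def constant_rms_level(amplitudes):
    """Constant amplitude equal to the RMS of an amplitude contour."""
    return math.sqrt(sum(a * a for a in amplitudes) / len(amplitudes))


# Hypothetical F2 track; F2C is the inverted track at a chosen depth.
f2 = [1000.0, 2000.0, 4000.0]
f2c_by_depth = {pct: scale_contour(invert_contour(f2), pct / 100.0)
                for pct in (100, 75, 50, 25, 0)}
```

Because inversion on a log scale leaves the geometric mean unchanged, the competitor depth can be applied after inversion and is then expressed relative to the unscaled target F2, as in the caption.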

Mentions: The stimuli comprised synthetic analogues of 42 sentences; these were a subset of the sentences used in Experiment 1. In the main part of the experiment, the target formants were presented in a dichotic configuration (left ear = F1; right ear = F2+F3; cf. Rand, 1974). Note that this arrangement has an advantage over that used by Remez et al. (1994), in that competitors can be added to the left-ear input without risk of appreciable energetic masking of any of the target formants (Roberts et al., 2010; Summers et al., 2010, 2012). Figure 3 illustrates the stimulus configuration used when the three target formants were accompanied by a competitor. In previous studies, we have demonstrated that there are no appreciable ear-dominance effects for sentence-length utterances in the context of the dichotic F2C paradigm (Roberts et al., 2010; Summers et al., 2010). Therefore, we did not counterbalance for ear of presentation in the dichotic configurations used in Experiments 2 and 3.

