Formant-frequency variation and informational masking of speech by extraneous formants: evidence against dynamic and speech-specific acoustical constraints.

Roberts B, Summers RJ, Bailey PJ - J Exp Psychol Hum Percept Perform (2014)

Bottom Line: The frequency contours of F1 - F3 were each scaled to 50% of their natural depth, with little effect on intelligibility. Adding a competitor typically reduced intelligibility; this reduction depended on the depth of F2C variation, being greatest for 100%-depth, intermediate for 50%-depth, and least for 0%-depth (constant) F2Cs. Furthermore, triangle-wave competitors were as effective as their more speech-like counterparts, suggesting that the selection of formants from the ensemble also does not depend on speech-specific constraints.


Affiliation: Psychology, School of Life and Health Sciences.

ABSTRACT
How speech is separated perceptually from other speech remains poorly understood. Recent research indicates that the ability of an extraneous formant to impair intelligibility depends on the variation of its frequency contour. This study explored the effects of manipulating the depth and pattern of that variation. Three formants (F1+F2+F3) constituting synthetic analogues of natural sentences were distributed across the 2 ears, together with a competitor for F2 (F2C) that listeners must reject to optimize recognition (left = F1+F2C; right = F2+F3). The frequency contours of F1 - F3 were each scaled to 50% of their natural depth, with little effect on intelligibility. Competitors were created either by inverting the frequency contour of F2 about its geometric mean (a plausibly speech-like pattern) or using a regular and arbitrary frequency contour (triangle wave, not plausibly speech-like) matched to the average rate and depth of variation for the inverted F2C. Adding a competitor typically reduced intelligibility; this reduction depended on the depth of F2C variation, being greatest for 100%-depth, intermediate for 50%-depth, and least for 0%-depth (constant) F2Cs. This suggests that competitor impact depends on overall depth of frequency variation, not depth relative to that for the target formants. The absence of tuning (i.e., no minimum in intelligibility for the 50% case) suggests that the ability to reject an extraneous formant does not depend on similarity in the depth of formant-frequency variation. Furthermore, triangle-wave competitors were as effective as their more speech-like counterparts, suggesting that the selection of formants from the ensemble also does not depend on speech-specific constraints.
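As a rough illustration of the two contour manipulations named in the abstract (depth scaling and inversion about the geometric mean), the Python sketch below treats both as a scaling of log-frequency deviations from the geometric mean. That formulation, the function names, and the sample F2 values are assumptions made for illustration; the abstract does not state how the scaling was computed.

```python
import math


def geometric_mean(freqs_hz):
    """Geometric mean of a list of positive frequencies (Hz)."""
    return math.exp(sum(math.log(f) for f in freqs_hz) / len(freqs_hz))


def rescale_contour(freqs_hz, depth):
    """Scale log-frequency deviations from the geometric mean by `depth`.

    Assumed convention: depth=1.0 leaves the contour unchanged, 0.5 halves
    its depth, 0.0 gives a constant contour, and -1.0 inverts the contour
    about its geometric mean.
    """
    gm = geometric_mean(freqs_hz)
    return [gm * (f / gm) ** depth for f in freqs_hz]


# Hypothetical F2 track (Hz); not taken from the study's stimuli.
f2 = [1200.0, 1500.0, 1800.0, 1400.0, 1100.0]

half_depth = rescale_contour(f2, 0.5)   # 50%-depth contour
constant = rescale_contour(f2, 0.0)     # 0%-depth (constant) contour
inverted = rescale_contour(f2, -1.0)    # inverted about the geometric mean
```

Under this convention, inversion maps each frequency f to gm²/f, so the inverted contour keeps the same geometric mean and the same depth on a log scale.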


Stimuli for Experiment 3: Schematic illustrating how the formant-frequency contour of the triangle-wave F2C was derived from that of the target F2. Note the use of a log frequency scale in this figure. Using the example sentence “The mud was brown,” the top panel depicts the formant-frequency contour of the target F2 (solid line), its geometric mean frequency (dotted line), and zero crossings relative to the geometric mean (circles). The middle panel depicts the F2C whose frequency contour was derived from that of F2 by inversion about the geometric mean (a plausibly speech-like variation); the bottom panel depicts the frequency contour for the corresponding triangle-wave F2C (not plausibly speech-like). The triangle-wave frequency contour was generated using the first four odd harmonics of the chosen period for the triangle-wave function; the number of half cycles corresponds to the number of zero crossings plus one. For illustrative purposes, the starting phase in this example was not set randomly but was instead chosen to produce a negative-going contour whose starting (and ending) frequency corresponded to the geometric mean frequency of the target F2 contour (dotted line). The full set of scale factors used to control the depth of formant-frequency variation in the F2C ranged from 100% to 0%, in steps of 25%. The amplitude contours (not shown here) for the target formants and F2Cs are the same as described for Experiment 2.
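The triangle-wave construction described in the caption can be sketched in Python as follows. The sketch sums the first four odd sine harmonics of a triangle wave, spans (zero crossings + 1) half cycles, and applies the depth scale factors on a log2 frequency axis. Normalizing the truncated series to a peak of 1, the starting phase of π (which gives the negative-going start shown in the figure), and all numeric values (1500 Hz, 1 octave, 3 zero crossings, 81 samples) are assumptions for illustration, not parameters reported in the source.

```python
import math

ODD_HARMONICS = (1, 3, 5, 7)  # first four odd harmonics, as in the caption


def truncated_triangle(phase):
    """Triangle-wave approximation from four odd sine harmonics (phase in radians)."""
    return sum((-1) ** k * math.sin(h * phase) / h ** 2
               for k, h in enumerate(ODD_HARMONICS))


# Value at the quarter-period point; used to normalize the peak to 1 (assumption).
PEAK = truncated_triangle(math.pi / 2)


def triangle_contour(gm_hz, range_octaves, zero_crossings, n_samples,
                     depth=1.0, start_phase=math.pi):
    """Frequency contour (Hz) centered on gm_hz on a log2 scale.

    The contour spans (zero_crossings + 1) half cycles; start_phase=pi makes
    it start at gm_hz and move downward first.
    """
    half_cycles = zero_crossings + 1
    total_phase = math.pi * half_cycles  # one half cycle = pi radians
    contour = []
    for i in range(n_samples):
        phase = start_phase + total_phase * i / (n_samples - 1)
        deviation = depth * (range_octaves / 2) * truncated_triangle(phase) / PEAK
        contour.append(gm_hz * 2 ** deviation)
    return contour


# Scale factors from 100% to 0% in steps of 25%, as stated in the caption.
DEPTHS = [1.0, 0.75, 0.5, 0.25, 0.0]
contours = {d: triangle_contour(1500.0, 1.0, 3, 81, depth=d) for d in DEPTHS}
```

Because the contour covers a whole number of half cycles, it starts and ends at the geometric mean whenever the starting phase is 0 or π; a randomly chosen starting phase would not have that property.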

Mentions: This experiment used the same dichotic configuration as for Experiment 2 and was similar in overall design. The stimuli comprised synthetic analogues of 48 sentences; there was no overlap with the sentences used in the other experiments. A set of F2 competitors was created for each sentence in the main experiment. Figure 5 illustrates schematically the two types of frequency contour used for F2C in this experiment, and their relationship to that for the target F2. In one condition, the frequency contour of each F2C was created by inverting the frequency contour of the target F2 without rescaling (middle panel). For the other experimental conditions, F2C had the frequency contour of a triangle wave (bottom panel); the parameters for the triangle-wave frequency contour were chosen broadly to match the average rate and depth of variation for the target F2, and hence for its inverted F2C counterpart. In outline, the period was set in relation to zero crossings at the geometric mean frequency (see top panel) and the peak-to-trough range was matched to that of the target F2 on a log scale, but was made symmetrical by centering the range on the geometric mean. The triangle-wave contour was then rescaled to the desired depth. In greater detail, the process of generating the triangle-wave contours was as follows:
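The detailed procedure is not reproduced in this record. As a hedged sketch of the outline only, the Python below derives the three quantities the paragraph names from a target F2 track: the geometric mean, the number of zero crossings about it, and the peak-to-trough range on a log scale. The function name, the use of octaves as the log unit, the assumption that the triangle wave spans the full contour duration, and the sample values are all hypothetical.

```python
import math


def contour_parameters(freqs_hz, duration_s):
    """Summarize a formant track for building a matched triangle-wave contour."""
    logs = [math.log2(f) for f in freqs_hz]
    mean_log = sum(logs) / len(logs)  # log2 of the geometric mean
    # Side of the geometric mean for each sample; samples exactly on it are skipped.
    above = [x > mean_log for x in logs if x != mean_log]
    crossings = sum(1 for a, b in zip(above, above[1:]) if a != b)
    half_cycles = crossings + 1  # relation given in the Figure 5 caption
    return {
        "geometric_mean_hz": 2 ** mean_log,
        "zero_crossings": crossings,
        "half_cycles": half_cycles,
        # Assumes the half cycles fill the whole duration.
        "period_s": 2 * duration_s / half_cycles,
        # Peak-to-trough range in octaves; centering it on the geometric
        # mean makes the resulting triangle wave symmetrical on a log scale.
        "range_octaves": max(logs) - min(logs),
    }


# Hypothetical F2 track (Hz) and duration (s); not from the study's stimuli.
example_f2 = [1200.0, 1800.0, 2000.0, 1500.0, 1000.0, 900.0, 1300.0, 1900.0]
params = contour_parameters(example_f2, duration_s=2.0)
```

The returned values could then be passed to any triangle-wave generator and rescaled to the desired depth, as the paragraph describes.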

