Discriminating Non-native Vowels on the Basis of Multimodal, Auditory or Visual Information: Effects on Infants' Looking Patterns and Discrimination.

Ter Schure S, Junge C, Boersma P - Front Psychol (2016)

Bottom Line: This study tested whether infants' phonological perception is shaped by distributions of visual speech as well as by distributions of auditory speech, by comparing learning from multimodal (i.e., auditory-visual), visual-only, or auditory-only information. We used eye tracking to measure effects of distribution and sensory modality on infants' discrimination of the contrast. We propose that by 8 months, infants' native vowel categories are established insofar that learning a novel contrast is supported by attention to additional information, such as visual articulations.


Affiliation: Linguistics, University of Amsterdam, Amsterdam, Netherlands.

ABSTRACT
Infants' perception of speech sound contrasts is modulated by their language environment, for example by the statistical distributions of the speech sounds they hear. Infants learn to discriminate speech sounds better when their input contains a two-peaked frequency distribution of those speech sounds than when their input contains a one-peaked frequency distribution. Effects of frequency distributions on phonetic learning have been tested almost exclusively for auditory input. But auditory speech is usually accompanied by visual information, that is, by visible articulations. This study tested whether infants' phonological perception is shaped by distributions of visual speech as well as by distributions of auditory speech, by comparing learning from multimodal (i.e., auditory-visual), visual-only, or auditory-only information. Dutch 8-month-old infants were exposed to either a one-peaked or two-peaked distribution from a continuum of vowels that formed a contrast in English, but not in Dutch. We used eye tracking to measure effects of distribution and sensory modality on infants' discrimination of the contrast. Although there were no overall effects of distribution or modality, separate t-tests in each of the six training conditions demonstrated significant discrimination of the vowel contrast in the two-peaked multimodal condition. For the modalities where the mouth was visible (visual-only and multimodal) we further examined infant looking patterns for the dynamic speaker's face. Infants in the two-peaked multimodal condition looked longer at her mouth than infants in any of the three other conditions. We propose that by 8 months, infants' native vowel categories are established insofar that learning a novel contrast is supported by attention to additional information, such as visual articulations.
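The contrast between one-peaked and two-peaked training input can be illustrated with a short sketch of how presentation frequencies might be assigned across a 32-step continuum. The peak positions and widths below are illustrative assumptions, not the study's actual stimulus counts:

```python
import math

def gaussian(x, mu, sigma):
    """Unnormalized Gaussian bell used to weight presentation frequencies."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def presentation_weights(n_steps=32, peaks=2):
    """Relative presentation frequency for each step of a vowel continuum.

    One-peaked: a single bell centred on the ambiguous middle (steps 16/17),
    as if the input contained one broad category.
    Two-peaked: two bells centred near the category prototypes, as if the
    input contained two distinct categories.
    Hypothetical values for illustration only.
    """
    steps = range(1, n_steps + 1)
    if peaks == 1:
        return [gaussian(s, (n_steps + 1) / 2, n_steps / 6) for s in steps]
    return [gaussian(s, n_steps * 0.25, n_steps / 10) +
            gaussian(s, n_steps * 0.75, n_steps / 10) for s in steps]

one_peaked = presentation_weights(peaks=1)   # most frequent in the middle
two_peaked = presentation_weights(peaks=2)   # frequent near steps 8 and 24,
                                             # rare in the ambiguous middle
```

On the distributional-learning account, only the two-peaked input signals that the continuum spans two categories, which is why discrimination is predicted to improve after two-peaked but not one-peaked exposure.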


Figure 1: Stills from the training videos. (A) and (B) are taken from video 1 and video 32 in the multimodal and visual conditions. (C) is taken from video 11 from the auditory condition, in which infants saw no visual articulation information.

Mentions: To create the visual vowel continuum, a female speaker of Southern British English was recorded while she repeated the syllables /fæp/ and /fεp/ in infant-directed speech. Facial expressions (distance between nose and eyebrows, mouth opening, lip width) were measured in pixels and instances of /fæp/ and /fεp/ were paired to find the best matching set of two videos. From those two videos, the vowel portion was spliced and exported as individual picture frames. These frames were imported two-by-two – first frame of [æ] with first frame of [ε], and so on – into the morphing software MorphX (Wennerberg, 2011). With linear interpolation a 30-step continuum was made between each set of frames, resulting in 32 videos: step 1 a clear instance of /æ/, step 2 slightly closer to /ε/, steps 16 and 17 ambiguous instances, and step 32 a clear instance of /ε/ (see Figure 1). A third video provided the /f_p/-context for the vowels. In a pilot experiment, it was established that native British English speakers (n = 11) could identify the two endpoint vowels in a categorization task on the basis of only visual articulatory information (mean proportion correct 0.65, range 0.54–0.75, significantly different from chance at 0.50 with SD = 0.07).
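The frame-by-frame linear interpolation described above can be sketched as a pixel-wise weighted blend between two aligned frames. This is a minimal stand-in for what morphing software such as MorphX does; real morphing also warps facial feature points, which this sketch omits:

```python
def morph_frame(frame_a, frame_b, step, n_steps=32):
    """Linearly interpolate between two aligned frames.

    step 1 returns frame_a unchanged, step n_steps returns frame_b, and
    intermediate steps blend pixel intensities proportionally, so steps
    16 and 17 yield near-ambiguous blends. Frames are nested lists of
    pixel intensities (illustrative stand-ins for video frames).
    """
    w = (step - 1) / (n_steps - 1)  # blend weight: 0.0 at step 1, 1.0 at step 32
    return [[(1 - w) * a + w * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]

frame_ae = [[0.0, 0.0], [0.0, 0.0]]       # stand-in for an [æ] frame
frame_eh = [[1.0, 1.0], [1.0, 1.0]]       # stand-in for an [ε] frame
mid = morph_frame(frame_ae, frame_eh, 16)  # ambiguous blend near the midpoint
```

Applying this blend to each matched frame pair of the two source videos, for every step from 1 to 32, yields the kind of 32-video continuum the study describes.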

