Cue integration in categorical tasks: insights from audio-visual speech perception.

Bejjanki VR, Clayards M, Knill DC, Aslin RN - PLoS ONE (2011)

Bottom Line: Our results show that human performance during audio-visual phonemic labeling is qualitatively consistent with the behavior of a Bayes-optimal observer. Furthermore, we show that in our task, the sensory variability affecting the visual modality during cue combination is not well estimated from single-cue performance, but can be estimated from multi-cue performance. The findings and computational principles described here represent a principled first step towards characterizing the mechanisms underlying human cue integration in categorical tasks.


Affiliation: Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, United States of America. vrao@bcs.rochester.edu

ABSTRACT
Previous cue integration studies have examined continuous perceptual dimensions (e.g., size) and have shown that human cue integration is well described by a normative model in which cues are weighted in proportion to their sensory reliability, as estimated from single-cue performance. However, this normative model may not be applicable to categorical perceptual dimensions (e.g., phonemes). In tasks defined over categorical perceptual dimensions, optimal cue weights should depend not only on the sensory variance affecting the perception of each cue but also on the environmental variance inherent in each task-relevant category. Here, we present a computational and experimental investigation of cue integration in a categorical audio-visual (articulatory) speech perception task. Our results show that human performance during audio-visual phonemic labeling is qualitatively consistent with the behavior of a Bayes-optimal observer. Specifically, we show that the participants in our task are sensitive, on a trial-by-trial basis, to the sensory uncertainty associated with the auditory and visual cues during phonemic categorization. In addition, we show that while sensory uncertainty is a significant factor in determining cue weights, it is not the only one: participants' performance is consistent with an optimal model in which environmental, within-category variability also plays a role in determining cue weights. Furthermore, we show that in our task, the sensory variability affecting the visual modality during cue combination is not well estimated from single-cue performance, but can be estimated from multi-cue performance. The findings and computational principles described here represent a principled first step towards characterizing the mechanisms underlying human cue integration in categorical tasks.
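To make the normative model concrete: under the simplifying assumption of Gaussian sensory noise and Gaussian category distributions, each cue's reliability is the inverse of its total variance, i.e. the sum of its sensory variance and the environmental (within-category) variance of the task-relevant category, and the optimal weights are these reliabilities normalized to sum to one. The following Python sketch illustrates this; the numerical variances are hypothetical illustrative values, not the paper's fitted parameters.

import numpy as np

def cue_weights(sigma_sensory, sigma_category):
    # Effective variance per cue = sensory variance + within-category
    # (environmental) variance; reliability is its inverse.
    total_var = (np.asarray(sigma_sensory, dtype=float) ** 2
                 + np.asarray(sigma_category, dtype=float) ** 2)
    reliability = 1.0 / total_var
    return reliability / reliability.sum()

# Hypothetical values: an auditory cue with low sensory noise and a visual
# cue with higher sensory noise but a tighter category distribution.
w_categorical = cue_weights([0.5, 1.0], [1.0, 0.4])

# Continuous-task weighting for comparison: sensory reliability alone.
w_continuous = cue_weights([0.5, 1.0], [0.0, 0.0])

print("weights with category variance:   ", w_categorical)
print("weights from sensory noise alone: ", w_continuous)

Note how the visual cue, despite its noisier sensory signal, gains weight once the tighter within-category variance is taken into account, whereas the continuous-task model would down-weight it based on sensory noise alone.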



pone-0019812-g008: Variance affecting visual information during unimodal versus bimodal performance, for each participant. The y-axis represents the variance affecting visual information during task performance; the x-axis represents the 8 participants in our study. For each participant, the blue bar shows the mean variance (across blur levels) affecting visual information during unimodal performance, and the red bar shows the corresponding mean variance during bimodal performance.

Mentions: Figure 8 shows the estimates of visual cue variance derived from the unimodal and bimodal conditions, for each of the 8 participants. It is immediately apparent from this figure that for participants 3 and 8, the variance affecting visual estimates during the cue-combination task was markedly higher than the variance affecting visual estimates during the single-cue task. The reason for this dramatic difference between the two estimates is unclear. Regardless, it is important to note that our analysis allows us to objectively examine each participant's behavior in each component of the experiment and to flag as outliers those who grossly failed to conform to the parameters of the experiment. Accordingly, to ensure that subsequent analyses were not unduly biased by these outlier participants, we excluded their data from all subsequent analyses, reducing our sample size to 6.
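The outlier screen described above reduces to a per-participant comparison of the two variance estimates. A minimal Python sketch of that comparison follows; the variance values and the factor-of-2 exclusion threshold are hypothetical illustrative choices, not the paper's (estimating the variances themselves from labeling performance is outside this snippet's scope).

import numpy as np

# Hypothetical per-participant variance estimates (mean across blur levels);
# index 0..7 corresponds to participants 1..8.
unimodal_var = np.array([0.8, 0.9, 0.7, 1.0, 0.9, 0.8, 1.1, 0.9])
bimodal_var = np.array([0.9, 1.0, 3.5, 1.1, 0.8, 0.9, 1.0, 4.2])

# Flag participants whose bimodal estimate grossly exceeds the unimodal one.
ratio = bimodal_var / unimodal_var
outliers = np.where(ratio > 2.0)[0] + 1  # 1-based participant numbers

keep = np.setdiff1d(np.arange(1, 9), outliers)
print("Excluded participants:", outliers)         # -> [3 8]
print("Sample size after exclusion:", keep.size)  # -> 6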

