Date Published: January 31, 2017
Publisher: Public Library of Science
Author(s): Izumi Yahata, Tetsuaki Kawase, Akitake Kanno, Hiroshi Hidaka, Shuichi Sakamoto, Nobukazu Nakasato, Ryuta Kawashima, Yukio Katori, Jyrki Ahveninen.
The effects of visual speech (the moving image of the speaker’s face uttering a speech sound) on early auditory evoked fields (AEFs) were examined using a helmet-shaped magnetoencephalography system in 12 healthy volunteers (9 males, mean age 35.5 years). AEFs (N100m) in response to the monosyllabic sound /be/ were recorded and analyzed under three different visual stimulus conditions: the moving image of the same speaker’s face uttering /be/ (congruent visual stimulus), the moving image of the face uttering /ge/ (incongruent visual stimulus), and visual noise (a still image derived from the speaker’s face using a strong Gaussian filter; control condition). On average, the latency of N100m was significantly shortened in both hemispheres for both the congruent and incongruent auditory/visual (A/V) stimuli compared with the control A/V condition. However, the degree of N100m shortening did not differ significantly between the congruent and incongruent A/V conditions, despite significant differences in psychophysical responses between these two conditions. Moreover, analysis of the magnitudes of these visual effects on AEFs in individual subjects showed that the lip-reading effects on AEFs tended to be well correlated between the two A/V conditions (congruent vs. incongruent visual stimuli) in both hemispheres, but were not significantly correlated between the right and left hemispheres. On the other hand, no significant correlation was observed between the magnitudes of the visual speech effects and the psychophysical responses. These results may indicate that the auditory-visual interaction observed in the N100m is a fundamental process that does not depend on the congruency of the visual information.
Visual speech information (the moving image of the speaker’s face uttering a speech sound) presented together with the speech sound is known to aid speech perception under conditions of degraded hearing, such as in noisy environments and/or in listeners with hearing impairment (lip-reading effects) [1, 2]. This lip-reading effect is usually helpful for perceiving congruent audio-visual (A/V) information, such as watching the speaker’s face while listening to speech, but perception may also be influenced by incongruent A/V information. For example, if the sound /ba/ (auditory stimulus) is presented simultaneously with the speaker’s face uttering /ga/ (incongruent visual stimulus), the presented /ba/ is often perceived as /da/; i.e., auditory perception can be modified by incongruent visual information. This lip-reading effect under the incongruent condition, known as the “McGurk effect,” is important evidence that lip-reading effects occur not only in listeners with impaired hearing but also in those with normal hearing. Therefore, to understand the mechanism underlying lip reading, it is important to understand how the perception evoked by auditory input is affected by simultaneously presented visual input.
The present study examined the lip-reading effects of monosyllables on AEFs generated in the auditory cortices. Overall, lip reading significantly shortened the N100m latency, with no apparent hemispheric laterality. However, the degree of N100m shortening did not differ significantly between the congruent and incongruent A/V conditions, even though the behavioral data showed that psychophysical responses differed significantly between these two conditions. The magnitude of the visual speech effect on N100m varied among subjects, but the lip-reading effects on AEFs tended to be well correlated between the two A/V conditions (congruent vs. incongruent visual stimuli) in both hemispheres on an individual basis. On the other hand, no significant correlation was observed between the magnitudes of the visual speech effects and the psychophysical responses.
The present study investigated individual variation in the effects of lip reading on N100m latencies as well as on psychophysical responses, by examining the lip-reading effects under congruent and incongruent visual conditions. The individual visual-speech effects on N100m latency (shortening of the N100m latency) were significantly correlated between the congruent and incongruent A/V stimuli, but did not reflect individual differences in psychophysical confusion. These findings suggest that inter-individual differences in the processing time reflected in N100m do not depend on stimulus content. Moreover, the A/V interaction seen in the auditory cortex may represent only an initial stage, with the psychophysical response, as the final judgment of A/V perception, arising at a later processing stage.
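The per-subject correlation analysis described above can be illustrated with a minimal sketch. The latency values below are hypothetical placeholders, not data from the study; the computation simply pairs each subject’s N100m latency shortening (control latency minus A/V latency) under the congruent and incongruent conditions and computes a Pearson correlation across subjects.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-subject N100m latency shortening (ms):
# control-condition latency minus A/V-condition latency, one value per subject.
congruent = [8.0, 5.5, 12.0, 3.0, 9.5, 7.0]
incongruent = [7.5, 6.0, 11.0, 2.5, 10.0, 6.5]

r = pearson_r(congruent, incongruent)
print(f"Pearson r (congruent vs. incongruent shortening): {r:.3f}")
```

In the study, a strong positive correlation of this kind between the two A/V conditions (computed separately in each hemisphere) is what supports the claim that the latency effect does not depend on visual congruency.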