Research Article: Listening to yourself is special: Evidence from global speech rate tracking

Date Published: September 5, 2018

Publisher: Public Library of Science

Author(s): Merel Maslowski, Antje S. Meyer, Hans Rutger Bosker, Marte Otten.


Listeners are known to use adjacent contextual speech rate in processing temporally ambiguous speech sounds. For instance, an ambiguous vowel between short /α/ and long /a:/ in Dutch sounds relatively long (i.e., as /a:/) embedded in a fast precursor sentence, but short in a slow sentence. Besides the local speech rate, listeners also track talker-specific global speech rates. However, it is as yet unclear whether other talkers’ global rates are encoded with reference to a listener’s self-produced rate. Three experiments addressed this question. In Experiment 1, one group of participants was instructed to speak fast, whereas another group had to speak slowly. The groups were compared on their perception of ambiguous /α/-/a:/ vowels embedded in neutral rate speech from another talker. In Experiment 2, the same participants listened to playback of their own speech and again evaluated target vowels in neutral rate speech. Neither of these experiments provided support for the involvement of self-produced speech in perception of another talker’s speech rate. Experiment 3 repeated Experiment 2 but with a new participant sample that was unfamiliar with the participants from Experiment 2. This experiment revealed fewer /a:/ responses in neutral speech in the group also listening to a fast rate, suggesting that neutral speech sounds slow in the presence of a fast talker and vice versa. Taken together, the findings show that self-produced speech is processed differently from speech produced by others. They carry implications for our understanding of rate-dependent speech perception in dialogue settings, suggesting that both perceptual and cognitive mechanisms are involved.

Partial Text

The self plays a special role in the processing of cognitive and perceptual information. For instance, one’s own face is recognized faster and more accurately than other familiar and unfamiliar faces [1]. Also, self-relevant stimuli, such as self-owned or self-associated items, attract more attention than stimuli associated with others [2–6]. Not only self-relevant but also self-produced items aid processing: Stroke recognition in handwriting is facilitated when strokes are self-produced [7, 8]. Additionally, words are remembered better when participants read them aloud themselves during encoding than when they hear them read by others [9]. This advantage of self-produced words remains even when participants’ own voices are recorded earlier and played back to them at test.

Experiment 1 addressed the question of whether self-produced speech rate affects perception of other talkers in global speech contexts. On the one hand, this may not be the case, as tracking self-produced rate may not necessarily be useful for comprehension of other talkers. On the other hand, self-produced speech may affect perception of others in global speech contexts in the same way as the global speech rate of one talker (A) influences perception of the speech rate of another talker (B) [31]. Moreover, effects of self-produced speech have previously been found in local contexts [22], with one’s own voice affecting perception of an immediately following target produced by another talker.

Experiment 2 tested whether the lack of an effect of self-produced speech in Experiment 1 could be related to the production task itself. Auditory input from self-produced speech has been found to lead to reduced responses in auditory cortex (i.e., speaking-induced suppression; SIS), which may in turn have reduced the magnitude of a potential effect of self-produced global speech rate. As such, SIS could be argued to account for the lack of a shift in phonetic categorization in Experiment 1. Therefore, Experiment 2 repeated the experiment without a speech production task, by having the same participants listen to playback of their own speech (produced in Experiment 1).

Experiment 3 aimed to evaluate whether the null results from the previous experiments were related to participants hearing themselves (rather than another talker). Therefore, Experiment 2 was repeated but with a new sample of participants, who listened to the fast or slow sentence recordings made in Experiment 1 and evaluated neutral rate sentences.

This study investigated the involvement of self-produced speech in perception of another talker’s global speech rate. In each of the three experiments, two groups of participants listened to and evaluated neutral speech rate trials from another talker. In Experiment 1, these perception trials were interspersed with production trials in which a high-rate group was instructed to produce speech at a (pre-specified) fast rate, whereas a low-rate group was instructed to produce speech at a slow speech rate. We measured the difference in perception of the Dutch /α, a:/ vowel contrast in the neutral rate speech from the other talker between the two groups. The results indicated that self-produced speech did not influence rate perception of the other talker (i.e., there was no difference between groups).
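
The between-group measure described above can be illustrated with a minimal sketch. It assumes each categorization response is coded 1 for long /a:/ and 0 for short /α/; the function names and the toy responses are hypothetical and are not data from the study. It is also not the authors' statistical analysis, only a descriptive comparison of response proportions.

```python
from statistics import mean


def proportion_long(responses):
    """Share of trials categorized as long /a:/ (coded 1; short vowel coded 0)."""
    return mean(responses)


def group_difference(high_rate, low_rate):
    """Proportion of /a:/ responses in the high-rate group minus the low-rate group."""
    return proportion_long(high_rate) - proportion_long(low_rate)


# Hypothetical toy responses to the same neutral-rate target vowels.
# These values are invented for illustration only.
high_rate_group = [1, 0, 0, 1, 0, 0, 1, 0]
low_rate_group = [1, 1, 0, 1, 0, 1, 1, 0]

diff = group_difference(high_rate_group, low_rate_group)
print(f"Difference in proportion of /a:/ responses: {diff:+.3f}")
```

In this coding, a negative difference means fewer /a:/ responses in the high-rate group, and a difference near zero means no difference between the groups.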

The current study shows that listening to one’s own voice is special: Self-produced speech is processed differently from speech produced by others (Experiment 1). This seems to be task independent, as playback of one’s own speech also does not elicit an effect of self-produced rate on perception of others (Experiment 2). Furthermore, this study shows that global rate effects can be replicated with naturally produced speech (Experiment 3). Importantly, this indicates that some amount of within-talker variability in speech rate is allowed before global rate tracking fails. These findings shed further light on the complex mechanisms of speech perception in dialogue settings, highlighting the hierarchical processes involved in rate normalization, as suggested by the two-stage model in Bosker et al. [22]. To further empirically test this model, future work may investigate the time course of global rate effects to explore timing differences between local and global rate normalization.