Modulation of covert speech on overt loudness perception implies the mechanism of speech monitoring

Xing Tian, Dept. of Neural and Cognitive Sciences and NYU-ECNU Institute of Brain and Cognitive Science, New York University Shanghai, Shanghai, China


We continuously monitor our own speech in real time. One key computational component of online speech control has been hypothesized to be an interaction between top-down induced processes and auditory feedback: the auditory consequences of speech production are predicted via a top-down process, and the predicted consequences are compared with feedback to constrain and update production. In this study, we test a critical assumption of this model, namely that a top-down induced mental representation can interact with speech perception even at the level of basic acoustic attributes such as loudness. In behavioral, magnetoencephalography (MEG), and electroencephalography (EEG) experiments, participants were asked to imagine speaking the syllable /da/ loudly (loud condition) or softly (soft condition) before they heard a playback of their own voice producing the same syllable. Behavioral results showed that the loudness rating of the playback was lower in the loud condition than in the soft condition. MEG results demonstrated that the magnitude of neural responses to the overt auditory stimuli was smaller in the loud condition than in the soft condition. EEG results further showed that the suppressed neural responses correlated with the decreased loudness ratings. These consistent behavioral and electrophysiological results suggest that the top-down induced neural representation converges to the same representational format as the neural representation established during speech perception, even for basic sensory features such as loudness. Such a coordinate transformation in a top-down process forms the neurocomputational foundation that enables its interaction with bottom-up processes in speech production monitoring and control.