Jason Bohland, Dept. of Speech, Language and Hearing Sciences, Boston University, Boston, MA
Repeating a word or non-word requires speakers to map auditory representations of incoming sounds onto learned speech items, maintain those items in short-term memory, interface that representation with the motor output system, and articulate the target sounds. Links between speech perception and speech production are tight, and activation of areas traditionally associated with motor output is commonly observed in speech perception tasks. Here we describe two functional magnetic resonance imaging (fMRI) studies in healthy speakers, coupled with multi-voxel pattern analysis (MVPA), that attempt to identify which brain areas contain information about the speech sounds heard, remembered, or planned for production at different stages of tasks involving the repetition of single syllables. The first experiment, using a simple delayed repetition paradigm, found that clusters of voxels in the left inferior frontal sulcus (IFs), an area previously suggested to serve as a phonological output buffer, predicted vowel identity at the input stage of the task (upon hearing a syllable), while bilateral clusters in the mid-posterior superior temporal sulcus (STs) carried more information at the output stage (following a GO signal cueing production). A follow-up experiment modified the syllable repetition task to dissociate the auditory input from the spoken output on one-half of trials. The overall pattern of results revealed that the left IFs and surrounding regions again encoded the vowel heard at input, but that this representation was overwritten when subjects were given a “change” cue that required them to update their motor output plan. The STs, on the other hand, maintained a representation of the heard syllable throughout the task, suggesting a more direct role in processing and encoding auditory inputs.
This set of results helps to clarify the functions of specific brain regions in phonological encoding, and provides a wealth of data for driving improvements to neurocomputational models of speech.
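For readers unfamiliar with MVPA, the general idea of decoding a stimulus label (here, vowel identity) from a cluster of voxels can be illustrated with a minimal sketch. This is not the analysis pipeline used in these studies: the data are synthetic, and the correlation-based nearest-centroid classifier, the leave-one-run-out cross-validation, and all function names and parameters below are assumptions chosen for illustration only.

```python
import numpy as np

def make_synthetic_patterns(n_runs=8, n_classes=3, n_voxels=50, signal=1.0, seed=0):
    """Simulate one voxel pattern per class per run (synthetic data, not real fMRI)."""
    rng = np.random.default_rng(seed)
    # Each hypothetical vowel class gets a fixed spatial prototype across voxels.
    prototypes = rng.normal(size=(n_classes, n_voxels))
    X, y, runs = [], [], []
    for run in range(n_runs):
        for label in range(n_classes):
            # Observed pattern = scaled prototype + independent unit-variance noise.
            X.append(signal * prototypes[label] + rng.normal(size=n_voxels))
            y.append(label)
            runs.append(run)
    return np.array(X), np.array(y), np.array(runs)

def leave_one_run_out_accuracy(X, y, runs):
    """Cross-validated accuracy of a correlation-based nearest-centroid classifier."""
    correct = 0
    for held_out in np.unique(runs):
        train = runs != held_out
        test = ~train
        classes = np.unique(y[train])
        # Average the training patterns of each class into a centroid.
        centroids = np.array([X[train & (y == c)].mean(axis=0) for c in classes])
        for pattern, true_label in zip(X[test], y[test]):
            # Assign the held-out pattern to the class whose centroid it correlates with most.
            r = [np.corrcoef(pattern, c)[0, 1] for c in centroids]
            correct += classes[int(np.argmax(r))] == true_label
    return correct / len(y)

# A cluster that carries class information versus one that carries none.
X_info, y_info, runs_info = make_synthetic_patterns(signal=1.0, seed=0)
X_null, y_null, runs_null = make_synthetic_patterns(signal=0.0, seed=1)
print("informative cluster accuracy:", leave_one_run_out_accuracy(X_info, y_info, runs_info))
print("uninformative cluster accuracy:", leave_one_run_out_accuracy(X_null, y_null, runs_null))
```

In a sketch like this, a cluster is said to "contain information" about the vowel when cross-validated accuracy reliably exceeds chance (one in three for the three hypothetical classes here). Running the same decoder on patterns taken from different task stages is one way such an analysis could compare what a region encodes at input versus output.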