Timing Regulation in Speech (and Birdsong)

Louis Goldstein - Dept of Linguistics, University of Southern California

Speech production can be decomposed into a sequence of discrete actions corresponding to phonological units such as segments, features, or (as will be assumed in this talk) articulatory gestures [2,7]. Stable temporal coherence of these events is necessary to ensure successful communication, but the coherence must also be flexible to allow modulation by factors such as speaking rate and prosody. Two general mechanisms of timing regulation can be considered: (1) a chain mechanism, in which a gesture is triggered by the achievement of some point in the state space (position, velocity) of some previous gesture, and (2) a clock mechanism, in which gestures are triggered in a feedforward way (at least in part) by pulses from a (neural) clock. Evidence will be presented in favor of the clock alternative, particularly stemming from the control of geminate (long) consonants [5]. A particular clock mechanism has been proposed in recent work [3,6], in which each gesture is associated with its own clock, and the clocks are coupled to one another in coupling graphs that embody the basic syllable structures of a language. Two sources of evidence for this model will be discussed: (1) data from speech errors suggesting the role of oscillator entrainment in speech production [4] and (2) a variety of temporal phenomena that can be understood as consequences of the topology of the proposed syllable-level coupling graphs. Finally, recent findings from the zebra finch [1] that are compatible with this model will be presented, showing that pulses in the premotor cortex appear to be synchronized with extrema of the control parameters for syrinx gestures during singing (e.g., gesture onsets and offsets).
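The core idea of gesture-level clocks coupled so that their relative timing stabilizes can be sketched with a pair of phase oscillators. The sketch below is a toy illustration under assumed dynamics (Kuramoto-style coupling toward a target relative phase), not the model of [3,6]; the names `omega`, `k`, and `psi` are assumptions of this example.

```python
import math

def relative_phase(omega=2 * math.pi, k=1.0, psi=math.pi,
                   dt=0.01, steps=5000):
    """Two gestural 'clocks' as phase oscillators, coupled so that
    their relative phase (phi1 - phi2) settles at the target psi
    (here anti-phase, pi), as in a vowel-coda coordination."""
    phi1, phi2 = 0.3, 0.0  # start far from the target relative phase
    for _ in range(steps):
        # Each oscillator advances at its natural frequency, nudged
        # toward the target relative phase by the coupling term.
        d1 = omega - k * math.sin(phi1 - phi2 - psi)
        d2 = omega - k * math.sin(phi2 - phi1 + psi)
        phi1 += d1 * dt
        phi2 += d2 * dt
    return (phi1 - phi2) % (2 * math.pi)

print(round(relative_phase(), 2))  # settles near pi (anti-phase)
```

Under these dynamics the relative phase converges to psi regardless of the initial offset, which is the sense in which a coupling graph can enforce stable yet flexible timing relations among gestures.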