Speech Sound Acquisition, Coarticulation, and Rate Effects in a Neural Network Model of Speech Production
Abstract
This article describes a neural network model of speech motor skill acquisition and speech production that explains a wide range of data on contextual variability, motor equivalence, coarticulation, and speaking rate effects. Model parameters are learned during a babbling phase. To explain how infants learn phoneme-specific and language-specific limits on acceptable articulatory variability, the learned speech sound targets take the form of multidimensional convex regions in orosensory coordinates. Reduction of target size for better accuracy during slower speech (in the spirit of the speed-accuracy trade-off described by Fitts' law) leads to differential effects for vowels and consonants, as seen in speaking rate experiments that have previously been taken as evidence for separate control processes for the two sound types. An account of anticipatory coarticulation is posited wherein the target for a speech sound is reduced in size based on context to provide a more efficient sequence of articulator movements. This explanation generalizes the well-known look-ahead model of coarticulation to incorporate convex region targets. Computer simulations verify the model's properties, including linear velocity/distance relationships, motor equivalence, speaking rate effects, and carryover and anticipatory coarticulation.
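To make the convex-region target idea concrete, the sketch below illustrates (in Python) a speech sound target represented as a simple axis-aligned box in orosensory coordinates, with the target shrunk for slower, more accurate speech in the spirit of the Fitts' law speed-accuracy trade-off mentioned in the abstract. This is a minimal illustration, not the model's implementation; the class name, the two example dimensions, and all numeric values are hypothetical.

```python
# Illustrative sketch only (not the paper's implementation): a speech sound target
# as a convex region -- here an axis-aligned box in orosensory coordinates -- that
# is reduced in size at slower speaking rates for greater accuracy.
import numpy as np


class ConvexRegionTarget:
    """Axis-aligned box target: one (lower, upper) interval per orosensory dimension."""

    def __init__(self, lower, upper):
        self.lower = np.asarray(lower, dtype=float)
        self.upper = np.asarray(upper, dtype=float)

    def shrink(self, factor):
        """Return a smaller target with the same center (slower, more accurate speech).

        factor = 1.0 keeps the full region; factor -> 0 collapses it toward a point.
        """
        center = (self.lower + self.upper) / 2.0
        half_width = (self.upper - self.lower) / 2.0 * factor
        return ConvexRegionTarget(center - half_width, center + half_width)

    def contains(self, config):
        """True if an orosensory configuration lies inside the convex region."""
        config = np.asarray(config, dtype=float)
        return bool(np.all(config >= self.lower) and np.all(config <= self.upper))


# Example with two hypothetical orosensory dimensions. At a fast rate the full
# region is acceptable; at a slow rate the shrunken region excludes the same
# configuration, so the articulators must move closer to the region's center.
target = ConvexRegionTarget(lower=[0.2, 0.1], upper=[0.8, 0.5])
config = [0.75, 0.45]

print(target.contains(config))              # True: inside the full (fast-rate) target
print(target.shrink(0.5).contains(config))  # False: outside the reduced (slow-rate) target
```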
License
Copyright 1994 Boston University. Permission to copy without fee all or part of this material is granted provided that: 1. the copies are not made or distributed for direct commercial advantage; 2. the report title, author, document number, and release date appear, and notice is given that copying is by permission of BOSTON UNIVERSITY TRUSTEES. To copy otherwise, or to republish, requires a fee and/or special permission.