Pitch-based Streaming in Auditory Perception
This chapter summarizes a neural model of how humans use pitch-based information to separate and attentively track multiple voices or instruments in distinct auditory streams, as in the cocktail party problem. The model incorporates concepts of top-down matching, attention, and resonance that have been used to analyse how humans can autonomously learn and stably remember large amounts of information in response to a rapidly changing environment. These Adaptive Resonance Theory, or ART, concepts are joined to a Spatial Pitch NETwork, or SPINET, model to form an ARTSTREAM model for pitch-based streaming. The ARTSTREAM model suggests that a resonance between spectral and pitch representations is necessary for a conscious auditory percept to occur. Examples from auditory perception in noise and context-sensitive speech perception are discussed, such as the auditory continuity illusion and phonemic restoration. The Gjerdingen analysis of apparent motion in music is shown to have a natural embedding within the ARTSTREAM model.