Development and assessment of a real-time brain-inspired spatial sound processing algorithm
Embargo Date
2028-01-29
Abstract
Listening in acoustically complex scenes remains a difficult task for both hearing-impaired (HI) individuals and automatic speech recognition (ASR) technologies. Normal-hearing (NH) listeners cope with these environments by segregating a scene into its constituent sound sources, then selecting and attending to a target source. By mimicking the biological mechanisms that enable this behavior, a brain-inspired assistive listening device may provide an effective means of improving speech comprehension in acoustically cluttered environments (e.g., a cocktail party). Although numerous techniques have been proposed to address the so-called “cocktail party problem” (CPP), existing methods fall short of the remarkably fast and accurate capabilities of NH listeners. This observation suggests that sound processing algorithms designed to solve the CPP may benefit from integrating principles derived from the auditory system. Here we propose a binaural sound segregation algorithm based on a hierarchical network model of the auditory system. This neural-spiking-based sound segregation algorithm was evaluated in speech-on-speech acoustic scenes with listeners with normal hearing and with hearing loss, and it yielded robust improvements in word recognition performance for both groups. The algorithm was then modified to run in real time so that it can be deployed on devices such as hearing aids and assistive audio wearables. The real-time algorithm was also evaluated as an audio preprocessor for publicly available ASR systems tasked with transcribing a speaker of interest in noisy multi-talker scenes, where it significantly improved performance across multiple state-of-the-art ASR systems. By taking inspiration from biology, this work aims to advance the development of hearing assistive devices and machine hearing technologies for the challenging acoustic environments we experience every day.
Description
2026