Boston University Libraries OpenBU
    JavaScript is disabled for your browser. Some features of this site may not work without it.
    View Item 
    •   OpenBU
    • Theses & Dissertations
    • Boston University Theses & Dissertations
    • View Item
    •   OpenBU
    • Theses & Dissertations
    • Boston University Theses & Dissertations
    • View Item

    A biologically inspired approach to the cocktail party problem

    Thumbnail
    License
    Attribution-NonCommercial-ShareAlike 4.0 International
    Date Issued
    2020
    Author(s)
    Chou, Kenny
    Share to FacebookShare to TwitterShare by Email
    Export Citation
    Download to BibTex
    Download to EndNote/RefMan (RIS)
    Metadata
    Show full item record
    Permanent Link
    https://hdl.handle.net/2144/41043
    Abstract
    At a cocktail party, one can choose to scan the room for conversations of interest, attend to a specific conversation partner, switch between conversation partners, or not attend to anything at all. The ability of the normal-functioning auditory system to flexibly listen in complex acoustic scenes plays a central role in solving the cocktail party problem (CPP). In contrast, certain demographics (e.g., individuals with hearing impairment or older adults) are unable to solve the CPP, leading to psychological ailments and reduced quality of life. Since the normal auditory system still outperforms machines in solving the CPP, an effective solution may be found by mimicking the normal-functioning auditory system. Spatial hearing likely plays an important role in CPP-processing in the auditory system. This thesis details the development of a biologically based approach to the CPP by modeling specific neural mechanisms underlying spatial tuning in the auditory cortex. First, we modeled bottom-up, stimulus-driven mechanisms using a multi-layer network model of the auditory system. To convert spike trains from the model output into audible waveforms, we designed a novel reconstruction method based on the estimation of time-frequency masks. We showed that our reconstruction method produced sounds with significantly higher intelligibility and quality than previous reconstruction methods. We also evaluated the algorithm's performance using a psychoacoustic study, and found that it provided the same amount of benefit to normal-hearing listeners as a current state-of-the-art acoustic beamforming algorithm. Finally, we modeled top-down, attention driven mechanisms that allowed the network to flexibly operate in different regimes, e.g., monitor the acoustic scene, attend to a specific target, and switch between attended targets. The model explains previous experimental observations, and proposes candidate neural mechanisms underlying flexible listening in cocktail-party scenarios. The strategies proposed here would benefit hearing-assistive devices for CPP processing (e.g., hearing aids), where users would benefit from switching between various modes of listening in different social situations.
    Rights
    Attribution-NonCommercial-ShareAlike 4.0 International
    Collections
    • Boston University Theses & Dissertations [7858]


    Boston University
    Contact Us | Send Feedback | Help
     

     

    Browse

    All of OpenBUCommunities & CollectionsIssue DateAuthorsTitlesSubjectsThis CollectionIssue DateAuthorsTitlesSubjects

    Deposit Materials

    LoginNon-BU Registration

    Statistics

    Most Popular ItemsStatistics by CountryMost Popular Authors

    Boston University
    Contact Us | Send Feedback | Help