
    Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady State Vowel Categorization

    Download/View
    08.005.pdf (762.0Kb)
    Date Issued
    2007-11-24
    Author
    Ames, Heather
    Grossberg, Stephen
    Permanent Link
    https://hdl.handle.net/2144/2216
    Abstract
    Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. The transformation from speaker-dependent to speaker-independent language representations enables speech to be learned and understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models.
    Rights
    Copyright 2008 Boston University. Permission to copy without fee all or part of this material is granted provided that: 1. The copies are not made or distributed for direct commercial advantage; 2. the report title, author, document number, and release date appear, and notice is given that copying is by permission of BOSTON UNIVERSITY TRUSTEES. To copy otherwise, or to republish, requires a fee and / or special permission.
    Collections
    • CAS/CNS Technical Reports [485]
