
Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

Copyright Year: 2006
Author(s): Wang, D.; Brown, G.
Publisher: Wiley-IEEE Press
Content Type: Books & eBooks
Topics: Communication, Networking & Broadcasting; Components, Circuits, Devices & Systems; Computing & Processing (Hardware/Software); Signal Processing & Analysis

Abstract

How can we engineer systems capable of "cocktail party" listening?

Human listeners are able to perceptually segregate one sound source from an acoustic mixture, such as a single voice from a mixture of other voices and music at a busy cocktail party. How can we engineer "machine listening" systems that achieve this perceptual feat?

Albert Bregman's book Auditory Scene Analysis, published in 1990, drew an analogy between the perception of auditory scenes and visual scenes, and described a coherent framework for understanding the perceptual organization of sound. His account has stimulated much interest in computational studies of hearing. Such studies are motivated in part by the demand for practical sound separation systems, which have many applications including noise-robust automatic speech recognition, hearing prostheses, and automatic music transcription. This emerging field has become known as computational auditory scene analysis (CASA).

Computational Auditory Scene Analysis: Principles, Algorithms, and Applications provides a comprehensive and coherent account of the state of the art in CASA, in terms of the underlying principles, the algorithms and system architectures that are employed, and the potential applications of this exciting new technology. With a Foreword by Bregman, its chapters are written by leading researchers and cover a wide range of topics including:
Estimation of multiple fundamental frequencies
Feature-based and model-based approaches to CASA
Sound separation based on spatial location
Processing for reverberant environments
Segregation of speech and musical signals
Automatic speech recognition in noisy environments
Neural and perceptual modeling of auditory organization

The text is written at a level that will be accessible to graduate students and researchers from related science and engineering disciplines. The extensive bibliography accompanying each chapter will also make this book a valuable reference source. A web site accompanying the text (www.casabook.org) features software tools and sound demonstrations.

  •   Table of Contents


      Frontmatter

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.fmatter
      Page(s): i - xviii
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters

      The prelims comprise:
      Half Title
      IEEE Press Editorial Board Page
      Title
      Copyright
      Contents
      Foreword
      Preface


      Contributors

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.contrib
      Page(s): xix - xx
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters



      Acronyms

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.acron
      Page(s): xxi - xxiii
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters



      Fundamentals of Computational Auditory Scene Analysis

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.ch1
      Page(s): 1 - 44
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters

      This chapter contains sections titled:
      Human Auditory Scene Analysis
      Computational Auditory Scene Analysis (CASA)
      Basics of CASA Systems
      CASA Evaluation
      Other Sound Separation Approaches
      A Brief History of CASA (Prior to 2000)
      Conclusions
      Acknowledgments
      References


      Multiple F0 Estimation

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.ch2
      Page(s): 45 - 79
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters

      This chapter contains sections titled:
      Introduction
      Signal Models
      Single-Voice F0 Estimation
      Multiple-Voice F0 Estimation
      Issues
      Other Sources of Information
      Estimating the Number of Sources
      Evaluation
      Application Scenarios
      Conclusion
      Acknowledgments
      References


      Feature-Based Speech Segregation

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.ch3
      Page(s): 81 - 114
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters

      This chapter contains sections titled:
      Introduction
      Feature Extraction
      Auditory Segmentation
      Simultaneous Grouping
      Sequential Grouping
      Discussion
      Acknowledgments
      References


      Model-Based Scene Analysis

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.ch4
      Page(s): 115 - 146
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters

      This chapter contains sections titled:
      Introduction
      Source Separation as Inference
      Hidden Markov Models
      Aspects of Model-Based Systems
      Discussion
      Conclusions
      References


      Binaural Sound Localization

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.ch5
      Page(s): 147 - 185
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters

      This chapter contains sections titled:
      Introduction
      Physical and Physiological Mechanisms Underlying Auditory Localization
      Spatial Perception of Single Sources
      Spatial Perception of Multiple Sources
      Models of Binaural Perception
      Multisource Sound Localization
      General Discussion
      Acknowledgments
      References


      Localization-Based Grouping

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.ch6
      Page(s): 187 - 207
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters

      This chapter contains sections titled:
      Introduction
      Classical Beamforming Techniques
      Location-Based Grouping Using Interaural Time Difference Cue
      Location-Based Grouping Using Interaural Intensity Difference Cue
      Location-Based Grouping Using Multiple Binaural Cues
      Discussion and Conclusions
      Acknowledgments
      References


      Reverberation

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.ch7
      Page(s): 209 - 250
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters

      This chapter contains sections titled:
      Introduction
      Effects of Reverberation on Listeners
      Effects of Reverberation on Machines
      Mechanisms Underlying Robustness to Reverberation in Human Listeners
      Reverberation-Robust Acoustic Processing
      CASA and Reverberation
      Discussion and Conclusions
      Acknowledgments
      References


      Analysis of Musical Audio Signals

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.ch8
      Page(s): 251 - 295
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters

      This chapter contains sections titled:
      Introduction
      Music Scene Description
      Estimating Melody and Bass Lines
      Estimating Beat Structure
      Estimating Chorus Sections and Repeated Sections
      Discussion and Conclusions
      References


      Robust Automatic Speech Recognition

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.ch9
      Page(s): 297 - 350
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters

      This chapter contains sections titled:
      Introduction
      ASA and Speech Perception in Humans
      Speech Recognition by Machine
      Primitive CASA and ASR
      Model-Based CASA and ASR
      Discussion and Conclusions
      Concluding Remarks
      References


      Neural and Perceptual Modeling

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.ch10
      Page(s): 351 - 387
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters

      This chapter contains sections titled:
      Introduction
      The Neural Basis of Auditory Grouping
      Models of Individual Neurons
      Models of Specific Perceptual Phenomena
      The Oscillatory Correlation Framework for CASA
      Schema-Driven Grouping
      Discussion
      Acknowledgments
      References


      Index

      Wang, D. ; Brown, G.
      Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

      DOI: 10.1109/9780470043387.index
      Page(s): 389 - 395
      Copyright Year: 2006

      Wiley-IEEE Press eBook Chapters
