IEEE/ACM Transactions on Audio, Speech, and Language Processing

Issue 4 • Date April 2014

  • [Front cover]

    Publication Year: 2014, Page(s): C1
    Freely Available from IEEE
  • IEEE/ACM Transactions on Audio, Speech, and Language Processing publication information

    Publication Year: 2014, Page(s): C2
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2014, Page(s): 741 - 742
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2014, Page(s): 743 - 744
    Freely Available from IEEE
  • An Overview of Noise-Robust Automatic Speech Recognition

    Publication Year: 2014, Page(s): 745 - 777
    Cited by: Papers (25)

    New waves of consumer-centric applications, such as voice search and voice interaction with mobile devices and home entertainment systems, increasingly require automatic speech recognition (ASR) to be robust to the full range of real-world noise and other acoustic distorting conditions. Despite its practical importance, however, the inherent links between and distinctions among the myriad of metho...
  • Application of Deep Belief Networks for Natural Language Understanding

    Publication Year: 2014, Page(s): 778 - 784
    Cited by: Papers (9)

    Applications of Deep Belief Nets (DBN) to various problems have been the subject of a number of recent studies ranging from image classification and speech recognition to audio classification. In this study we apply DBNs to a natural language understanding problem. The recent surge of activity in this area was largely spurred by the development of a greedy layer-wise pretraining method that uses a...
  • Low-rank Approximation Based Multichannel Wiener Filter Algorithms for Noise Reduction with Application in Cochlear Implants

    Publication Year: 2014, Page(s): 785 - 799
    Cited by: Papers (7)

    This paper presents low-rank approximation based multichannel Wiener filter algorithms for noise reduction in speech plus noise scenarios, with application in cochlear implants. In a single speech source scenario, the frequency-domain autocorrelation matrix of the speech signal is often assumed to be a rank-1 matrix, which then allows to derive different rank-1 approximation based noise reduction ...
  • Design of Superdirective Planar Arrays With Sparse Aperiodic Layouts for Processing Broadband Signals via 3-D Beamforming

    Publication Year: 2014, Page(s): 800 - 815
    Cited by: Papers (3)

    Planar arrays are used jointly with filter-and-sum beamforming to achieve 3-D spatial discrimination in processing broadband signals. In these systems, the beams are steered in various directions to investigate a given portion of space. The band can be so wide as to require both superdirective performance (to increase directivity at low frequencies) and sparse aperiodic layouts (to avoid grating l...
  • Multi-Feature Beat Tracking

    Publication Year: 2014, Page(s): 816 - 825

    A recent trend in the field of beat tracking for musical audio signals has been to explore techniques for measuring the level of agreement and disagreement between a committee of beat tracking algorithms. By using beat tracking evaluation methods to compare all pairwise combinations of beat tracker outputs, it has been shown that selecting the beat tracker which most agrees with the remainder of t...
  • Investigation of Speech Separation as a Front-End for Noise Robust Speech Recognition

    Publication Year: 2014, Page(s): 826 - 835
    Cited by: Papers (12)

    Recently, supervised classification has been shown to work well for the task of speech separation. We perform an in-depth evaluation of such techniques as a front-end for noise-robust automatic speech recognition (ASR). The proposed separation front-end consists of two stages. The first stage removes additive noise via time-frequency masking. The second stage addresses channel mismatch and the dis...
  • Robust Speaker Identification in Noisy and Reverberant Conditions

    Publication Year: 2014, Page(s): 836 - 845
    Cited by: Papers (9)

    Robustness of speaker recognition systems is crucial for real-world applications, which typically contain both additive noise and room reverberation. However, the combined effects of additive noise and convolutive reverberation have been rarely studied in speaker identification (SID). This paper addresses this issue in two phases. We first remove background noise through binary masking using a dee...
  • On the use of i-vector posterior distributions in Probabilistic Linear Discriminant Analysis

    Publication Year: 2014, Page(s): 846 - 857
    Cited by: Papers (3)

    The i-vector extraction process is affected by several factors such as the noise level, the acoustic content of the observed features, the channel mismatch between the training conditions and the test data, and the duration of the analyzed speech segment. These factors influence both the i-vector estimate and its uncertainty, represented by the i-vector posterior covariance. This paper presents a ...
  • Chinese-English Phone Set Construction for Code-Switching ASR Using Acoustic and DNN-Extracted Articulatory Features

    Publication Year: 2014, Page(s): 858 - 862
    Cited by: Papers (1)

    This study proposes a data-driven approach to phone set construction for code-switching automatic speech recognition (ASR). Acoustic and context-dependent cross-lingual articulatory features (AFs) are incorporated into the estimation of the distance between triphone units for constructing a Chinese-English phone set. The acoustic features of each triphone in the training corpus are extracted for c...
  • IEEE/ACM Transactions on Audio, Speech, and Language Processing Edics

    Publication Year: 2014, Page(s): 863 - 864
    Freely Available from IEEE
  • IEEE/ACM Transactions on Audio, Speech, and Language Processing Information for Authors

    Publication Year: 2014, Page(s): 865 - 866
    Freely Available from IEEE
  • Open Access

    Publication Year: 2014, Page(s): 867
    Freely Available from IEEE
  • Publish your article in IEEE Access

    Publication Year: 2014, Page(s): 868
    Freely Available from IEEE
  • IEEE Signal Processing Society Information

    Publication Year: 2014, Page(s): C3
    Freely Available from IEEE
  • [Blank page - back cover]

    Publication Year: 2014, Page(s): C4
    Freely Available from IEEE

Aims & Scope

The IEEE/ACM Transactions on Audio, Speech, and Language Processing is dedicated to innovative theory and methods for processing signals representing audio, speech and language, and their applications. This includes analysis, synthesis, enhancement, transformation, classification and interpretation of such signals as well as the design, development, and evaluation of associated signal processing systems.

Meet Our Editors

Editor-in-Chief

Haizhou Li
Institute for Infocomm Research, A*STAR
Singapore 138632

hli@i2r.a-star.edu.sg