Signal Processing Letters, IEEE

Issue 9 • Date Sept. 2014

  • Front Cover

    Page(s): C1
    Freely Available from IEEE
  • IEEE Signal Processing Letters publication information

    Page(s): C2
    Freely Available from IEEE
  • Table of Contents

    Page(s): 1027 - 1028
    Freely Available from IEEE
  • Table of Contents

    Page(s): 1029 - 1030
    Freely Available from IEEE
  • Online Visual Tracking via Two View Sparse Representation

    Page(s): 1031 - 1034

    In this letter, we present a novel online tracking method based on sparse representation. In contrast to existing sparse-representation-based tracking algorithms, this work uses sparse representation to construct both the object and state models. The tracked object can be sparsely represented by a series of object templates, and can also be sparsely represented by candidate samples in the current frame. Furthermore, we propose a unified objective function that integrates the object and state models, and cast tracking as an optimization problem that can be solved iteratively. Finally, we compare the proposed tracker with nine state-of-the-art tracking methods on challenging image sequences. Both qualitative and quantitative evaluations demonstrate that our tracker achieves favorable performance in terms of both accuracy and speed.
  • Saliency Detection with Multi-Scale Superpixels

    Page(s): 1035 - 1039

    We propose a salient object detection algorithm via multi-scale analysis on superpixels. First, multi-scale segmentations of an input image are computed and represented by superpixels. In contrast to prior work, we utilize various Gaussian smoothing parameters to generate coarse or fine results, thereby facilitating the analysis of salient regions. At each scale, three essential cues from local contrast, integrity and center bias are considered within the Bayesian framework. Next, we compute saliency maps by weighted summation and normalization. The final saliency map is optimized by a guided filter, which further improves the detection results. Extensive experiments on two large benchmark datasets demonstrate that the proposed algorithm performs favorably against state-of-the-art methods. The proposed method achieves the highest precision value of 97.39% when evaluated on one of the most popular datasets, the ASD dataset.
  • Combination of Cepstral and Phonetically Discriminative Features for Speaker Verification

    Page(s): 1040 - 1044

    Most speaker recognition systems rely on short-term acoustic cepstral features to extract speaker-relevant information from the signal. However, phonetic discriminant features, extracted by a bottleneck multi-layer perceptron (MLP) over longer stretches of time, can provide complementary information and have been adopted in speech transcription systems. We compare the speaker verification performance using cepstral features, discriminant features, and a concatenation of both followed by a dimension reduction. We consider two speaker recognition systems, one based on maximum likelihood linear regression (MLLR) super-vectors and the other on a state-of-the-art i-vector system with two session variability compensation schemes. Experiments are reported on a standard configuration of the NIST SRE 2008 and 2010 databases. The results show that the phonetically discriminative MLP features retain speaker-specific information that is complementary to the short-term cepstral features. The performance improvement is obtained with both score-domain and feature-domain fusion, and the speaker verification equal error rate (EER) is reduced by up to 50% relative, compared to the best i-vector system using only cepstral features.
  • Ghost-Free High Dynamic Range Imaging via Rank Minimization

    Page(s): 1045 - 1049

    We propose a ghost-free high dynamic range (HDR) image synthesis algorithm using a low-rank matrix completion framework, which we call RM-HDR. Based on the assumption that irradiance maps are linearly related to low dynamic range (LDR) image exposures, we formulate ghost region detection as a rank minimization problem. We incorporate constraints on moving objects, i.e., sparsity, connectivity, and priors on under- and over-exposed regions into the framework. Experiments on real image collections show that the RM-HDR can often provide significant gains in synthesized HDR image quality over state-of-the-art approaches. Additionally, a complexity analysis is performed which reveals computational merits of RM-HDR over recent advances in deghosting for HDR.
  • Exploiting Spectral Regrowth for Channel Identification

    Page(s): 1050 - 1053

    In modern communication systems, power amplifiers (PAs) are important components and inherently nonlinear. The nonlinearity of the PA causes bandwidth expansion of the communication signal, often referred to as spectral regrowth, at the PA output. Conventionally, spectral regrowth is treated as a distortion, and a range of compensation and filtering techniques have been considered to mitigate its effect. In this paper, we propose to exploit spectral regrowth to enhance channel identification accuracy. Our approach is motivated by the fact that the nonlinearly amplified communication signal carries more bandwidth and allows better probing of the channel. We introduce an iterative algorithm which jointly estimates the PA characteristics and the channel impulse response. The effectiveness of the proposed algorithm is illustrated by computer simulation.
  • A Contraction Mapping Approach for Robust Estimation of Lagged Autocorrelation

    Page(s): 1054 - 1058

    We consider the zero-crossing rate (ZCR) of a Gaussian process and establish a property relating the lagged ZCR (LZCR) to the corresponding normalized autocorrelation function. This is a generalization of Kedem's result for the lag-one case. For the specific case of a sinusoid in white Gaussian noise, we use the higher-order property between lagged ZCR and higher-lag autocorrelation to develop an iterative higher-order autoregressive filtering scheme, which stabilizes the ZCR and consequently provides robust estimates of the lagged autocorrelation. Simulation results show that the autocorrelation estimates converge in about 20 to 40 iterations, even at low signal-to-noise ratios.
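As a rough illustration of the link between the lagged ZCR and the normalized autocorrelation mentioned in the abstract above, the sketch below uses the classical arcsine-law identity for zero-mean stationary Gaussian sequences, rho_k = cos(pi * P[sign(x[n]) != sign(x[n+k])]), which reduces to Kedem's cosine formula at lag one. It is an assumption that the letter's lagged property takes this form, and the letter's iterative autoregressive filtering scheme is not reproduced here. The function names and the AR(1) test signal are illustrative choices, not taken from the letter.

```python
import math
import random


def lagged_zcr(x, lag):
    """Fraction of sample pairs (x[n], x[n + lag]) whose signs differ."""
    pairs = len(x) - lag
    changes = sum(1 for n in range(pairs) if (x[n] >= 0) != (x[n + lag] >= 0))
    return changes / pairs


def autocorr_from_zcr(x, lag):
    """Normalized autocorrelation implied by the lagged ZCR (Gaussian case)."""
    return math.cos(math.pi * lagged_zcr(x, lag))


def ar1_gaussian(n, a, seed=0):
    """Hypothetical test signal: stationary Gaussian AR(1), so rho_k = a**k."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) / math.sqrt(1.0 - a * a)]  # stationary start
    for _ in range(n - 1):
        x.append(a * x[-1] + rng.gauss(0.0, 1.0))
    return x


if __name__ == "__main__":
    signal = ar1_gaussian(200_000, 0.8)
    for k in (1, 2, 3):
        print(k, round(autocorr_from_zcr(signal, k), 3), "target", round(0.8 ** k, 3))
```

Because only the signs of the samples are used, the estimate is insensitive to the signal's amplitude scale, which is one reason sign-based estimators of this kind are attractive.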
  • Iterative Recovery of Dense Signals from Incomplete Measurements

    Page(s): 1059 - 1063

    Within the framework of compressed sensing, we consider dense signals, which contain both discrete and continuous-amplitude components. We demonstrate by a comprehensive numerical study (to the best of our knowledge, the first of its kind in the literature) that dense signals can be recovered from noisy, incomplete linear measurements by simple iterative algorithms that are inspired by, or are implementations of, approximate message passing. These iterative algorithms are shown to significantly outperform all other algorithms presented so far when they use a novel noise-adaptive thresholding function that is proposed in this contribution.
  • The Finite Fractional Zak Transform

    Page(s): 1064 - 1067

    We give a matrix form of the Zak transform in the finite setting and show that it can be diagonalized with eigenvectors of the two-variable time-independent Schrödinger difference equation. Using this diagonalization, we produce a fractional Zak transform and illustrate the effect that it has on constant frequencies.
  • Autoencoder-based Unsupervised Domain Adaptation for Speech Emotion Recognition

    Page(s): 1068 - 1072

    With the availability of speech data obtained from different devices and under varied acquisition conditions, we are often faced with scenarios where the intrinsic discrepancy between the training and test data has an adverse impact on affective speech analysis. To address this issue, this letter introduces an Adaptive Denoising Autoencoder based on an unsupervised domain adaptation method, where prior knowledge learned from a target set is used to regularize the training on a source set. Our goal is to achieve a matched feature space representation for the target and source sets while ensuring target domain knowledge transfer. The method has been successfully evaluated on the 2009 INTERSPEECH Emotion Challenge's FAU Aibo Emotion Corpus as the target corpus and two other publicly available speech emotion corpora as sources. The experimental results show that our method significantly improves over the baseline performance and outperforms related feature domain adaptation methods.
  • On the Projection of PLLRs for Unbounded Feature Distributions in Spoken Language Recognition

    Page(s): 1073 - 1077

    The so-called Phone Log-Likelihood Ratio (PLLR) features have recently been introduced as a novel and effective way of retrieving acoustic-phonetic information in spoken language and speaker recognition systems. In this letter, an in-depth insight into the PLLR feature space is provided, and the multidimensional distribution of these features is analyzed in a language recognition system. The study reveals that PLLR features are confined to a subspace that strongly bounds PLLR distributions. To enhance the information retrieved by the system, PLLR features are projected onto a hyperplane that provides a more suitable representation of the subspace where the features lie. After applying the projection method, PCA is used to decorrelate the features. Gains attained at each step of the proposed approach are outlined and compared to simple PCA projection. Experiments carried out on the NIST 2007, 2009 and 2011 LRE datasets demonstrate the effectiveness of the proposed method, which yields up to a 27% relative improvement with regard to the system based on the original features.
  • Asynchronous Transmitter Position and Velocity Estimation Using A Dual Linear Chirp

    Page(s): 1078 - 1082

    We present a closed-form least squares algorithm for estimating the position and velocity of a source that asynchronously transmits known dual linear chirp signals, given time-of-arrival measurements obtained by spatially distributed sensors. The estimates involve less computational load than the optimal maximum likelihood estimates. Simulations show that the proposed estimates have mean square errors similar to those of the optimal estimates, and are close to the lower bound for small and moderate measurement noise.
  • Cramér–Rao Bounds for Broadband Dispersion Extraction of Borehole Acoustic Modes

    Page(s): 1083 - 1087

    The estimation of slowness (the reciprocal of velocity) and attenuation dispersion of borehole acoustic and surface seismic data is key to a variety of applications. The emerging broadband approach for dispersion extraction of multiple modes has shown significant advantages over traditional narrowband approaches. In this letter, Cramér-Rao bounds (CRBs) are established to characterize the best achievable performance of any unbiased broadband estimator of the dispersion. One noteworthy observation from the derived CRBs is that the same estimation accuracy is achievable for the angular wavenumber and the attenuation, and likewise for the group slowness and the attenuation rate. The broadband CRB is also shown to include the narrowband CRB as a special case, and it quantifies the performance benefit of the broadband approach.
  • Frequency-Domain Volterra Filter Based on Data-Driven Soft Decision for Nonlinear Acoustic Echo Suppression

    Page(s): 1088 - 1092

    In this letter, we propose a novel frequency-domain second-order Volterra filter based on soft decision for nonlinear acoustic echo suppression (AES). The letter offers an efficient algorithm for estimating the nonlinear echo power with the second-order Volterra filter, together with an AES algorithm that uses the estimated nonlinear echo power spectrum within a soft decision framework. The soft decision incorporates the ratio of the a priori probabilities of near-end speech presence and absence, which is obtained with a data-driven training method.
  • An Explicit Solution for Target Localization in Noncoherent Distributed MIMO Radar Systems

    Page(s): 1093 - 1097

    This work focuses on the moving target localization problem in a noncoherent multiple-input multiple-output radar system with widely separated antennas. We assume that the time delay and Doppler shift between each transmit/receive element pair have already been measured by a preprocessing algorithm. Utilizing these measurements, an explicit method for jointly estimating the target position and velocity is proposed. It first divides the measurements into several groups based on the different transmit or receive elements, and then employs two best linear unbiased estimators successively for each group to independently produce an estimate of the target position and velocity. Finally, the results from the different groups are combined to form a composite estimate. Simulation results show that the estimation accuracy of the proposed method attains the Cramér-Rao lower bound when the noise is sufficiently small.
  • Blind Modulation Classification Algorithm for Single and Multiple-Antenna Systems Over Frequency-Selective Channels

    Page(s): 1098 - 1102

    This letter proposes a blind modulation classification (MC) algorithm applicable to single and multiple-antenna systems operating over frequency-selective channels. We show that the correlation functions of the received signals for certain modulation formats exhibit peaks at a particular set of time lags, a result which can be exploited as a discriminating feature. We also develop a new hypothesis test in order to detect the correlation-induced peaks. The proposed algorithm is general in the sense that it accommodates any number of transmit- and receive-antennas, without prior information about channel statistics. The classification performance of the proposed algorithm is assessed through Monte Carlo simulations.
  • Asymptotic Error Bounds on Prediction of Narrowband MIMO Wireless Channels

    Page(s): 1103 - 1107

    In this letter, we derive simple expressions for the lower bound on the prediction error variance for a narrowband MIMO channel with uniform linear arrays at both ends of the link. The derived bounds show the relationship between the achievable prediction performance and the design parameters of the prediction algorithm, thereby providing useful insights into the development of fading channel prediction algorithms.
  • Continuous Mixed p-Norm Adaptive Algorithm for System Identification

    Page(s): 1108 - 1110

    We propose a new adaptive filtering algorithm for system identification applications that is based on a continuous mixed p-norm. It enjoys the advantages of various error norms since it combines p-norms for 1 ≤ p ≤ 2. The mixture is controlled by a continuous probability density-like function of p, which is assumed to be uniform in the derivations in this letter. Two versions of the suggested algorithm are developed. The robustness of the proposed algorithms against impulsive noise is demonstrated in a system identification simulation.
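To make the continuous mixed p-norm idea above more concrete, here is a minimal sketch derived only from the description in the abstract, and it may differ from either version developed in the letter. It assumes the cost J(e) = ∫ |e|^p dp over 1 ≤ p ≤ 2 with a uniform weighting, whose stochastic-gradient update is w ← w + μ·γ(e)·sign(e)·x with γ(e) = ∫ p|e|^(p-1) dp = ((2|e| - 1)·ln|e| - |e| + 1) / ln²|e|. The function names, the 3-tap system, the step size, and the noise-free data are all hypothetical choices for illustration.

```python
import math
import random


def cmpn_gain(e):
    """gamma(e): integral over p in [1, 2] of p * |e|**(p - 1), in closed form."""
    a = abs(e)
    if a < 1e-12:
        return 0.0  # the gain tends to 0 as |e| -> 0
    log_a = math.log(a)
    if abs(log_a) < 1e-4:
        return 1.5  # limit at |e| = 1: integral of p over [1, 2]
    return ((2.0 * a - 1.0) * log_a - a + 1.0) / (log_a * log_a)


def cmpn_identify(x, d, num_taps, mu):
    """Adapt an FIR model so that its output tracks d, using the gain above."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent input sample first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        e = dn - sum(wi * bi for wi, bi in zip(w, buf))
        step = mu * cmpn_gain(e) * math.copysign(1.0, e)
        w = [wi + step * bi for wi, bi in zip(w, buf)]
    return w


def demo(seed=1):
    """Identify a hypothetical 3-tap system from noise-free data."""
    rng = random.Random(seed)
    true_w = [0.5, -0.3, 0.2]
    x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    d, buf = [], [0.0] * len(true_w)
    for xn in x:
        buf = [xn] + buf[:-1]
        d.append(sum(t * b for t, b in zip(true_w, buf)))
    return true_w, cmpn_identify(x, d, num_taps=len(true_w), mu=0.005)


if __name__ == "__main__":
    target, estimate = demo()
    print("true taps:     ", target)
    print("estimated taps:", [round(v, 3) for v in estimate])
```

For small errors the gain behaves roughly like 1/|ln|e||, so the update sits between a sign-error update (p = 1) and an LMS update (p = 2), which is consistent with the abstract's claim of combining the advantages of several error norms.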
  • Nested Array Processing for Distributed Sources

    Page(s): 1111 - 1114

    We consider the problem of using linear nested arrays to estimate the directions of arrival (DOAs) of distributed sources and to detect the source number, where we have more sources than actual physical sensors. Angular spread, caused by the multipath nature of the distributed sources, makes the commonly used point-source assumption challenging. We establish the signal model for distributed sources, using a nested array. Due to the characteristics of distributed sources, the regular spatial smoothing technique, which is used to exploit the increased degrees of freedom provided by the co-array, no longer works. We thus propose a novel spatial smoothing approach to circumvent this problem. Based on the analytical results, we construct the corresponding DOA estimation and source number detection methods. The effectiveness of the proposed methods is verified through numerical examples.
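The "increased degrees of freedom provided by the co-array" in the abstract above can be seen with a small sketch. It assumes the standard two-level nested geometry (an inner uniform array with unit spacing and an outer array with spacing n1 + 1), which is not spelled out in the abstract, and it only computes the difference co-array. The distributed-source model and the letter's modified spatial smoothing are not reproduced. The function names are illustrative.

```python
def nested_array_positions(n1, n2):
    """Sensor positions, in units of the base spacing, of a two-level nested array."""
    inner = list(range(1, n1 + 1))                        # 1, 2, ..., n1
    outer = [(n1 + 1) * m for m in range(1, n2 + 1)]      # (n1+1), 2(n1+1), ...
    return inner + outer


def difference_coarray(positions):
    """Sorted set of all pairwise position differences (the virtual array)."""
    return sorted({p - q for p in positions for q in positions})


if __name__ == "__main__":
    positions = nested_array_positions(3, 3)
    lags = difference_coarray(positions)
    print("physical sensors:", positions)
    print("co-array lags:", lags[0], "to", lags[-1], "count", len(lags))
```

With n1 = n2 = 3, six physical sensors give a hole-free co-array of 2·n2·(n1 + 1) - 1 = 23 consecutive lags, which is why such arrays can, in principle, resolve more sources than sensors.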
  • Language-Independent Text-Line Extraction Algorithm for Handwritten Documents

    Page(s): 1115 - 1119

    Text-line extraction in handwritten documents is an important step for document image understanding, and a number of algorithms have been proposed to address this problem. However, most of them exploit features of specific languages and work only for a given language. To overcome this limitation, we develop a language-independent text-line extraction algorithm. Our method is based on connected components (CCs); however, unlike conventional methods, we analyze strokes and partition under-segmented CCs into normalized ones. Due to this normalization, the proposed method is able to estimate the states of CCs for a range of different languages and writing styles. From the estimated states, we build a cost function whose minimization yields text-lines. Experimental results show that the proposed method yields state-of-the-art performance on Latin-based and Chinese script databases. Further, we submitted the proposed algorithm to the ICDAR 2013 handwriting segmentation competition, and our method showed the best text-line extraction performance among 10 participating methods.
  • Convolutional Neural Networks for Distant Speech Recognition

    Page(s): 1120 - 1124

    We investigate convolutional neural networks (CNNs) for large vocabulary distant speech recognition, trained using speech recorded from a single distant microphone (SDM) and multiple distant microphones (MDM). In the MDM case we explore a beamformed signal input representation compared with the direct use of multiple acoustic channels as a parallel input to the CNN. We have explored different weight sharing approaches, and propose a channel-wise convolution with two-way pooling. Our experiments, using the AMI meeting corpus, found that CNNs improve the word error rate (WER) by 6.5% relative compared to conventional deep neural network (DNN) models and 15.7% over a discriminatively trained Gaussian mixture model (GMM) baseline. For cross-channel CNN training, the WER improves by 3.5% relative over the comparable DNN structure. Compared with the best beamformed GMM system, cross-channel convolution reduces the WER by 9.7% relative, and matches the accuracy of a beamformed DNN.
  • Group Sparsity via SURE Based on Regression Parameter Mean Squared Error

    Page(s): 1125 - 1129

    Any regularization method requires the selection of a penalty parameter, and many model selection criteria have been developed based on various discrepancy measures. Most of the attention has been focused on prediction mean squared error. In this paper, we develop a model selection criterion based on regression parameter mean squared error via SURE (Stein's unbiased risk estimator). We then apply this criterion to the ℓ1-penalized least squares problem with grouped variables on over-determined systems. Simulation results based on topology identification of a sparse network are presented to illustrate the criterion and compare it with alternative model selection criteria.

Aims & Scope

The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing.

Meet Our Editors

Editor-in-Chief
Peter Willett
University of Connecticut
Storrs, CT 06269
peter.willett@uconn.edu