IEEE Journal of Selected Topics in Signal Processing

Issue 5 • Oct. 2013

  • [Front cover]

    Publication Year: 2013 , Page(s): C1
  • IEEE Journal of Selected Topics in Signal Processing publication information

    Publication Year: 2013 , Page(s): C2
  • Table of contents

    Publication Year: 2013 , Page(s): 741 - 742
  • Introduction to the Issue on Learning-Based Decision Making in Dynamic Systems Under Uncertainty

    Publication Year: 2013 , Page(s): 743 - 745
  • Feature Search in the Grassmannian in Online Reinforcement Learning

    Publication Year: 2013 , Page(s): 746 - 758
    Cited by:  Papers (1)

    We consider the problem of finding the best features for value function approximation in reinforcement learning and develop an online algorithm to optimize the mean square Bellman error objective. For any given feature value, our algorithm performs gradient search in the parameter space via a residual gradient scheme and, on a slower timescale, also performs gradient search in the Grassmann manifold of features. We present a proof of convergence of our algorithm. We show empirical results using our algorithm as well as a similar algorithm that uses temporal difference learning in place of the residual gradient scheme for the faster timescale updates.

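For concreteness, the faster-timescale update in this abstract can be sketched for a linear value approximation. This is a minimal illustration of a residual-gradient step on the Bellman error, with hypothetical step sizes, and it omits the Grassmann-manifold feature search entirely:

```python
import numpy as np

def residual_gradient_step(w, phi_s, phi_s_next, reward, gamma=0.9, alpha=0.1):
    """One residual-gradient step on the mean-square Bellman error for a
    linear value function V(s) = phi(s) @ w (sketch)."""
    delta = reward + gamma * phi_s_next @ w - phi_s @ w  # Bellman residual
    # Unlike temporal-difference learning, the residual-gradient scheme
    # differentiates through the next-state value term as well:
    grad = delta * (gamma * phi_s_next - phi_s)
    return w - alpha * grad

# Toy check: a self-looping state with reward 1 has value 1 / (1 - 0.9) = 10
w = np.zeros(2)
phi = np.array([1.0, 0.0])
for _ in range(2000):
    w = residual_gradient_step(w, phi, phi, 1.0, alpha=1.0)
print(round(float(phi @ w), 2))  # converges toward 10.0
```

Swapping `grad` for `-delta * phi_s` would give the temporal-difference variant the abstract compares against.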
  • Deterministic Sequencing of Exploration and Exploitation for Multi-Armed Bandit Problems

    Publication Year: 2013 , Page(s): 759 - 767
    Cited by:  Papers (4)

    In the Multi-Armed Bandit (MAB) problem, there is a given set of arms with unknown reward models. At each time, a player selects one arm to play, aiming to maximize the total expected reward over a horizon of length T. An approach based on a Deterministic Sequencing of Exploration and Exploitation (DSEE) is developed for constructing sequential arm selection policies. It is shown that for all light-tailed reward distributions, DSEE achieves the optimal logarithmic order of the regret, where regret is defined as the total expected reward loss against the ideal case with known reward models. For heavy-tailed reward distributions, DSEE achieves O(T^{1/p}) regret when the moments of the reward distributions exist up to the p-th order for 1 < p ≤ 2, and O(T^{1/(1+p/2)}) regret for p > 2. With the knowledge of an upper bound on a finite moment of the heavy-tailed reward distributions, DSEE offers the optimal logarithmic regret order. The proposed DSEE approach complements existing work on MAB by providing corresponding results for general reward distributions. Furthermore, with a clearly defined tunable parameter (the cardinality of the exploration sequence), the DSEE approach is easily extendable to variations of MAB, including MAB with various objectives, decentralized MAB with multiple players and incomplete reward observations under collisions, restless MAB with unknown dynamics, and combinatorial MAB with dependent arms that often arise in network optimization problems such as the shortest path, the minimum spanning tree, and the dominating set problems under unknown random weights.

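A minimal sketch of the deterministic sequencing idea, assuming Bernoulli arms and an exploration sequence of logarithmic cardinality with an arbitrary constant c. This illustrates the general scheme, not the paper's exact construction:

```python
import numpy as np

def dsee_policy(num_arms, horizon, c=3.0):
    """Yield ("explore", arm) or ("exploit", None) for each time step.
    Exploration slots form a deterministic sequence whose cardinality
    grows like c * num_arms * log(t); exploration cycles over the arms
    round-robin, and all other slots exploit."""
    explored = 0
    for t in range(1, horizon + 1):
        if explored < c * num_arms * np.log(t + 1):
            yield ("explore", explored % num_arms)
            explored += 1
        else:
            yield ("exploit", None)

# Toy run on two Bernoulli arms; exploit plays the empirical best arm so far
rng = np.random.default_rng(0)
means = np.array([0.3, 0.7])
counts, sums = np.zeros(2), np.zeros(2)
reward_total = 0.0
for kind, arm in dsee_policy(2, 5000):
    if kind == "exploit":
        arm = int(np.argmax(sums / np.maximum(counts, 1)))
    r = float(rng.random() < means[arm])
    counts[arm] += 1
    sums[arm] += r
    reward_total += r
print(reward_total / 5000)
```

Because exploration times are fixed in advance rather than reward-dependent, the same skeleton extends to the decentralized and combinatorial variants the abstract lists.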
  • Sequentiality and Adaptivity Gains in Active Hypothesis Testing

    Publication Year: 2013 , Page(s): 768 - 782
    Cited by:  Papers (4)

    Consider a decision maker who is responsible for collecting observations so as to enhance his information, in a speedy manner, about an underlying phenomenon of interest. The policies under which the decision maker selects sensing actions can be categorized based on the following two factors: i) sequential versus non-sequential; ii) adaptive versus non-adaptive. Non-sequential policies collect a fixed number of observation samples and make the final decision afterwards, while under sequential policies the sample size is not known initially and is determined by the observation outcomes. Under adaptive policies, the decision maker relies on the previously collected samples to select the next sensing action, while under non-adaptive policies the actions are selected independently of the past observation outcomes. In this paper, performance bounds are provided for the policies in each category. Using these bounds, the sequentiality gain and the adaptivity gain, i.e., the gains of sequential and adaptive selection of actions, are characterized.

  • Multistage Adaptive Estimation of Sparse Signals

    Publication Year: 2013 , Page(s): 783 - 796
    Cited by:  Papers (7)

    This paper considers sequential adaptive estimation of sparse signals under a constraint on the total sensing effort. The advantage of adaptivity in this context is the ability to focus more resources on regions of space where signal components exist, thereby improving performance. A dynamic programming formulation is derived for the allocation of sensing effort to minimize the expected estimation loss. Based on the method of open-loop feedback control, allocation policies are then developed for a variety of loss functions. The policies are optimal in the two-stage case, generalizing an optimal two-stage policy proposed by Bashan, and improve monotonically thereafter with the number of stages. Numerical simulations show gains of up to several dB compared to recently proposed adaptive methods, and dramatic gains compared to non-adaptive estimation. An application to radar imaging is also presented.

  • Hypothesis Testing in Feedforward Networks With Broadcast Failures

    Publication Year: 2013 , Page(s): 797 - 810

    Consider a large number of nodes, which sequentially make decisions between two given hypotheses. Each node takes a measurement of the underlying truth, observes the decisions from some immediate predecessors, and makes a decision between the given hypotheses. We consider two classes of broadcast failures: 1) each node broadcasts a decision to the other nodes, subject to random erasure in the form of a binary erasure channel; 2) each node broadcasts a randomly flipped decision to the other nodes in the form of a binary symmetric channel. We are interested in conditions under which there does (or does not) exist a decision strategy consisting of a sequence of likelihood ratio tests such that the node decisions converge in probability to the underlying truth, as the number of nodes goes to infinity. In both cases, we show that if each node only learns from a bounded number of immediate predecessors, then there does not exist a decision strategy such that the decisions converge in probability to the underlying truth. However, in case 1, we show that if each node learns from an unboundedly growing number of predecessors, then there exists a decision strategy such that the decisions converge in probability to the underlying truth, even when the erasure probabilities converge to 1. We show that a locally optimal strategy, consisting of a sequence of Bayesian likelihood ratio tests, is such a strategy, and we derive the convergence rate of the error probability for this strategy. In case 2, we show that if each node learns from all of its previous predecessors, then there exists a decision strategy such that the decisions converge in probability to the underlying truth when the flipping probabilities of the binary symmetric channels are bounded away from 1/2. Again, we show that a locally optimal strategy achieves this, and we derive the convergence rate of the error probability for it. In the case where the flipping probabilities converge to 1/2, we derive a necessary condition on the convergence rate of the flipping probabilities such that the decisions based on the locally optimal strategy still converge to the underlying truth. We also explicitly characterize the relationship between the convergence rate of the error probability and the convergence rate of the flipping probabilities.

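As a toy illustration of case 1 above, a node can combine the log-likelihood ratio of its own Gaussian measurement with whichever predecessor votes survive a binary erasure channel. The fixed ±1 weight per surviving vote is a hypothetical simplification, not the paper's Bayesian weighting:

```python
import numpy as np

def node_decision(own_sample, predecessor_bits, erase_prob, rng, mu=1.0):
    """One node's likelihood ratio test (sketch). H1: sample ~ N(mu, 1);
    H0: sample ~ N(0, 1). Predecessor decisions arrive through a binary
    erasure channel with erasure probability erase_prob."""
    llr = mu * own_sample - mu ** 2 / 2  # log-likelihood ratio of own sample
    for bit in predecessor_bits:
        if rng.random() < erase_prob:
            continue  # vote erased: contributes no evidence
        llr += 1.0 if bit else -1.0  # hypothetical fixed weight per vote
    return llr > 0  # decide H1 iff the combined evidence favors it

rng = np.random.default_rng(0)
# A strong private sample decides H1 even with no predecessors:
print(node_decision(2.0, [], erase_prob=0.0, rng=rng))
# Three surviving H1 votes outweigh a weakly negative private sample:
print(node_decision(0.0, [True, True, True], erase_prob=0.0, rng=rng))
```

With a bounded predecessor window this voting evidence stays bounded, which is the intuition behind the paper's negative result for bounded memory.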
  • Learning-Based Constraint Satisfaction With Sensing Restrictions

    Publication Year: 2013 , Page(s): 811 - 820

    In this paper we consider graph-coloring problems, an important subset of general constraint satisfaction problems that arise in wireless resource allocation. We constructively establish the existence of fully decentralized learning-based algorithms that are able to find a proper coloring even in the presence of strong sensing restrictions, in particular sensing asymmetry of the type encountered when hidden terminals are present. Our main analytic contribution is to establish sufficient conditions on the sensing behavior to ensure that the solvers find satisfying assignments with probability one. These conditions take the form of connectivity requirements on the induced sensing graph. These requirements are mild, and we demonstrate that they are commonly satisfied in wireless allocation tasks. We argue that our results are of considerable practical importance in view of the prevalence of both communication and sensing restrictions in wireless resource allocation problems. The class of algorithms analyzed here requires no message-passing whatsoever between wireless devices, and we show that they continue to perform well even when devices are only able to carry out constrained sensing of the surrounding radio environment.

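One way to picture this class of message-free solvers is the following sketch, which assumes idealized symmetric collision sensing (unlike the restricted sensing the paper analyzes) and an illustrative update rule:

```python
import numpy as np

def decentralized_coloring(adj, num_colors, rng, max_rounds=2000):
    """Message-free learning of a proper graph coloring (sketch).
    Each node keeps a distribution over colors, plays one, senses only
    whether it collides with a neighbor, and updates locally."""
    n = len(adj)
    p = np.full((n, num_colors), 1.0 / num_colors)
    colors = np.zeros(n, dtype=int)
    for _ in range(max_rounds):
        for k in range(n):
            colors[k] = rng.choice(num_colors, p=p[k])
        clash = [any(colors[j] == colors[k] for j in adj[k]) for k in range(n)]
        if not any(clash):
            return colors  # proper coloring found
        for k in range(n):
            if clash[k]:
                p[k] = np.full(num_colors, 1.0 / num_colors)  # resample
            else:
                p[k] = np.eye(num_colors)[colors[k]]  # lock in what worked
    return None

# A 4-cycle is 2-colorable; each node sees only its own collision outcomes
adj = [[1, 3], [0, 2], [1, 3], [0, 2]]
res = decentralized_coloring(adj, 2, np.random.default_rng(3))
print(res)
```

No node ever exchanges a message: the only feedback is each node's own collision indicator, which is the property the abstract emphasizes.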
  • Distributed Energy-Aware Diffusion Least Mean Squares: Game-Theoretic Learning

    Publication Year: 2013 , Page(s): 821 - 836
    Cited by:  Papers (4)

    This paper presents a game-theoretic approach to node activation control in parameter estimation via diffusion least mean squares (LMS). Nodes cooperate by exchanging estimates over links characterized by the connectivity graph of the network. The energy-aware activation control is formulated as a noncooperative repeated game in which nodes autonomously decide when to activate based on a utility function that captures the trade-off between an individual node's contribution and its energy expenditure. The diffusion LMS stochastic approximation is combined with a game-theoretic learning algorithm such that the overall energy-aware diffusion LMS has two timescales: the fast timescale corresponds to the game-theoretic activation mechanism, whereby nodes distributively learn their optimal activation strategies, whereas the slow timescale corresponds to the diffusion LMS. The convergence analysis shows that the parameter estimates weakly converge to the true parameter across the network, while the global activation behavior along the way tracks the set of correlated equilibria of the underlying activation control game.

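The diffusion LMS component, without the game-theoretic activation layer, can be sketched in its standard adapt-then-combine form; the network topology, step size, and uniform combination weights below are illustrative choices:

```python
import numpy as np

def diffusion_lms(data, neighbors, mu=0.05, dim=2):
    """Adapt-then-combine diffusion LMS over a network (sketch).
    data[k] is a list of (regressor, measurement) pairs for node k;
    neighbors[k] lists node k's neighborhood, including k itself."""
    n = len(data)
    w = np.zeros((n, dim))
    for t in range(len(data[0])):
        psi = np.empty_like(w)
        for k in range(n):  # adapt: local LMS step at each node
            u, d = data[k][t]
            psi[k] = w[k] + mu * (d - u @ w[k]) * u
        for k in range(n):  # combine: average estimates over neighborhoods
            w[k] = psi[neighbors[k]].mean(axis=0)
    return w

# Toy network of three nodes estimating w0 = [1, -1] from noisy data
rng = np.random.default_rng(2)
w0 = np.array([1.0, -1.0])

def stream(n_samples):
    out = []
    for _ in range(n_samples):
        u = rng.normal(size=2)
        out.append((u, u @ w0 + 0.01 * rng.normal()))
    return out

data = [stream(2000) for _ in range(3)]
neighbors = [[0, 1], [0, 1, 2], [1, 2]]  # line graph with self-loops
w = diffusion_lms(data, neighbors)
print(np.round(w[1], 2))
```

In the paper's scheme an activation game additionally decides, per slot, which nodes take the adapt step at all; here every node is always active.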
  • Distributed Learning and Multiaccess of On-Off Channels

    Publication Year: 2013 , Page(s): 837 - 845

    The problem of distributed access of a set of N on-off channels by K ≤ N users is considered. The channels are slotted and modeled as independent but not necessarily identical alternating renewal processes. Each user decides to either observe or transmit at the beginning of every slot. A transmission is successful only if the channel is in the on state and only one user is transmitting. When a user observes, it identifies whether a transmission would have been successful had it decided to transmit. A distributed learning and access policy referred to as alternating sensing and access (ASA) is proposed. It is shown that ASA has finite expected regret when compared with the optimal centralized scheme with fixed channel allocation.

  • Winning the Lottery: Learning Perfect Coordination With Minimal Feedback

    Publication Year: 2013 , Page(s): 846 - 857

    Coordination is a central problem whenever stations (or nodes or users) share resources across a network. In the absence of coordination, there will be collisions, congestion or interference, with concomitant loss of performance. This paper proposes new protocols, which we call perfect coordination (PC) protocols, that solve the coordination problem. PC protocols are completely distributed (requiring neither central control nor the exchange of any control messages), fast (with speeds comparable to those of any existing protocols), fully efficient (achieving perfect coordination, with no collisions and no gaps) and require minimal feedback. PC protocols rely heavily on learning, exploiting the possibility of using both actions and silence as messages and the ability of stations to learn from their own histories while simultaneously enabling the learning of other stations. PC protocols can be formulated as finite automata and implemented using currently existing technology (e.g., wireless cards). Simulations show that, in a variety of deployment scenarios, PC protocols outperform existing state-of-the-art protocols, despite requiring much less feedback.

  • Multiagent Reinforcement Learning Based Spectrum Sensing Policies for Cognitive Radio Networks

    Publication Year: 2013 , Page(s): 858 - 868

    This paper proposes distributed multiuser multiband spectrum sensing policies for cognitive radio networks based on multiagent reinforcement learning. The spectrum sensing problem is formulated as a partially observable stochastic game, and multiagent reinforcement learning is employed to find a solution. In the proposed reinforcement learning based sensing policies, the secondary users (SUs) collaborate to improve the sensing reliability and to distribute the sensing tasks among the network nodes. The SU collaboration is carried out through local interactions in which the SUs share with their neighbors their local test statistics or decisions, as well as information on the frequency bands they have sensed. As a result, a map of spectrum occupancy in a local neighborhood is created. The goal of the proposed sensing policies is to maximize the amount of free spectrum found given a constraint on the probability of missed detection. This is addressed by striking a balance between sensing more spectrum and the reliability of the sensing results. Simulation results show that the proposed sensing policies provide an efficient way to find available spectrum in multiuser multiband cognitive radio scenarios.

  • Opportunistic Spectrum Access by Exploiting Primary User Feedbacks in Underlay Cognitive Radio Systems: An Optimality Analysis

    Publication Year: 2013 , Page(s): 869 - 882

    We consider an underlay cognitive radio (CR) communication system in which a cognitive secondary user (SU) can access multiple primary spectrum channels only when its interference to the primary users (PUs) is limited. To identify and exploit instantaneous transmission opportunities, the SU probes a subset of primary channels by overhearing the primary feedback signals, so as to learn the primary receiver's channel condition and interference tolerance level, and then chooses an appropriate power at which to transmit its data. In this context, the SU cannot probe all the channels because of its limited number of receiving antennas, so a crucial optimization problem faced by the SU is which channel(s) to probe in order to maximize the long-term throughput given the past probing history. In this paper, we tackle this optimization problem by casting it as a restless multi-armed bandit (RMAB) problem, a problem of fundamental importance in decision theory. Given the specific and practical constraints posed by the problem, we analyze the myopic probing policy, which consists of probing the best channels based on past observations. We perform an analytical study of the optimality of this myopic probing policy. Specifically, for a family of generic and practically important utility functions, we establish closed-form conditions that guarantee the optimality of the myopic probing policy, and we illustrate our analytical results via simulations on several typical network scenarios.

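In its simplest form, a myopic probing policy reduces to sorting channels by the current belief of finding a transmission opportunity and probing the top few. The belief values below are hypothetical placeholders, and this sketch omits the belief updates and optimality conditions the paper actually studies:

```python
import numpy as np

def myopic_probe(belief, num_probe):
    """Myopic probing (sketch): probe the channels whose current belief
    of being in a favorable state is largest. belief[i] is the assumed
    probability that channel i is favorable; num_probe is the antenna
    budget limiting how many channels can be probed per slot."""
    return np.argsort(belief)[::-1][:num_probe]

print(myopic_probe(np.array([0.2, 0.9, 0.5, 0.7]), 2))  # probes channels 1 and 3
```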
  • Maximizing Quality of Information From Multiple Sensor Devices: The Exploration vs Exploitation Tradeoff

    Publication Year: 2013 , Page(s): 883 - 894
    Cited by:  Papers (1)

    This paper investigates Quality of Information (QoI) aware adaptive sampling in a system where two sensor devices report information to an end user. The system carries out a sequence of tasks, where each task relates to a random event that must be observed. The accumulated information obtained from the sensor devices is reported once per task to a higher layer application at the end user. The utility of each report depends on the timeliness of the report and also on the quality of the observations. Quality can be improved by accumulating more observations for the same task, at the expense of delay. We assume new tasks arrive randomly, and the qualities of each new observation are also random. The goal is to maximize the time average quality of information subject to cost constraints. We solve the problem by leveraging dynamic programming and Lyapunov optimization. Our algorithms involve solving a 2-dimensional optimal stopping problem and result in a 2-dimensional threshold rule. When task arrivals are i.i.d., the optimal solution to the stopping problem can be closely approximated with a small number of simplified value iterations. When task arrivals are periodic, we derive a structured-form, approximately optimal stopping policy. We also introduce hybrid policies, applied over the proposed adaptive sampling algorithms, to further improve performance. Numerical results demonstrate that our policies perform near-optimally. Overall, this work provides new insights into network operation based on QoI attributes.

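The flavor of a threshold stopping rule can be shown in one dimension (the paper's rule is two-dimensional, jointly over accumulated quality and delay; the constants and linear decay here are purely hypothetical):

```python
def should_report(quality, elapsed, q_target=5.0, decay=0.5):
    """Illustrative threshold stopping rule: send the report once the
    accumulated quality clears a bar that relaxes as the report ages,
    trading observation quality against timeliness."""
    return quality >= max(q_target - decay * elapsed, 0.0)

print(should_report(2.0, elapsed=0))  # low quality, fresh task: keep sampling
print(should_report(2.0, elapsed=6))  # same quality, older task: report now
```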
  • Transmit Power Control Policies for Energy Harvesting Sensors With Retransmissions

    Publication Year: 2013 , Page(s): 895 - 906
    Cited by:  Papers (2)

    This paper addresses the problem of finding outage-optimal power control policies for wireless energy harvesting sensor (EHS) nodes with automatic repeat request (ARQ)-based packet transmissions. The power control policy of the EHS specifies the transmission power for each packet transmission attempt, based on all the information available at the EHS. In particular, the acknowledgement (ACK) or negative acknowledgement (NACK) messages received provide the EHS with partial information about the channel state. We solve the problem of finding an optimal power control policy by casting it as a partially observable Markov decision process (POMDP). We study the structure of the optimal power policy in two ways. First, for the special case of binary power levels at the EHS, we show that the optimal policy for the underlying Markov decision process (MDP), when the channel state is observable, is a threshold policy in the battery state. Second, we benchmark the performance of the EHS by rigorously analyzing the outage probability of a general fixed-power transmission scheme, where the EHS uses a predetermined power level at each slot within the frame. Monte Carlo simulation results illustrate the performance of the POMDP approach and verify the accuracy of the analysis. They also show that the POMDP solutions can significantly outperform conventional ad hoc approaches.

  • Robust Reputation Protocol Design for Online Communities: A Stochastic Stability Analysis

    Publication Year: 2013 , Page(s): 907 - 920
    Cited by:  Papers (1)

    This paper proposes a new class of incentive mechanisms aimed at compelling self-interested users in online communities to cooperate with each other by exchanging resources or services. Examples of such communities are social multimedia platforms, social networks, online labor markets, crowdsourcing platforms, etc. To optimize their individual long-term performance, users adapt their strategies by solving individual stochastic control problems. The users' adaptation catalyzes a stochastic dynamic process in which the strategies of users in the community evolve over time. We first characterize the structural properties of the users' best response strategies. Subsequently, using these structural results, we design incentive mechanisms based on reputation protocols for governing the online communities, which can “manage” the long-run evolution of the community. We prove that by appropriately penalizing and rewarding users based on their behavior in the community, such incentive mechanisms can eliminate free-riding and ensure that the community converges to a desirable equilibrium, selected by the community designer, in which social welfare is maximized and users find it in their self-interest to cooperate with each other.

  • IEEE Journal of Selected Topics in Signal Processing information for authors

    Publication Year: 2013 , Page(s): 921 - 922
  • J-STSP call for special issue proposals

    Publication Year: 2013 , Page(s): 923
  • Special issue on signal processing in smart electric power grid

    Publication Year: 2013 , Page(s): 924
  • Special issue on visual signal processing for wireless networks

    Publication Year: 2013 , Page(s): 925
  • Special Issue on Signal Processing for Situational Awareness from Networked Sensors and Social Media

    Publication Year: 2013 , Page(s): 926
  • Open Access

    Publication Year: 2013 , Page(s): 927
  • MyIEEE

    Publication Year: 2013 , Page(s): 928

Aims & Scope

The Journal of Selected Topics in Signal Processing (J-STSP) solicits special issues on topics that cover the entire scope of the IEEE Signal Processing Society including the theory and application of filtering, coding, transmitting, estimating, detecting, analyzing, recognizing, synthesizing, recording, and reproducing signals by digital or analog devices or techniques.


Meet Our Editors

Editor-in-Chief
Fernando Pereira
Instituto Superior Técnico