IEEE Transactions on Information Theory - New Table of Contents
http://ieeexplore.ieee.org
TOC Alert for Publication #18, February 19, 2018 (Volume 64, Issue 3)

Table of contents
pp. C1-C4

IEEE Transactions on Information Theory publication information
p. C2

A One-Shot Achievability Result for Quantum State Redistribution
… et al. (2016).
pp. 1425-1435

A Generalized Quantum Slepian-Wolf
… et al., who studied a special case of this problem. As another special case, wherein Bob holds trivial registers, we recover the result of Devetak and Yard regarding quantum state redistribution.
pp. 1436-1453

Separation Between Quantum Lovász Number and Entanglement-Assisted Zero-Error Classical Capacity
pp. 1454-1460

Non-Binary Quantum Synchronizable Codes From Repeated-Root Cyclic Codes
… $p^{s}$ and $lp^{s}$ over $\mathbb{F}_{q}$, where $s \geq 1$ and $l \geq 2$ are integers and $p \geq 3$ is the odd characteristic. Within some loose limitations, these synchronizable codes can possess the best possible capability in synchronization recovery and therefore enrich the variety of good quantum synchronizable codes. Furthermore, by using known techniques in classical coding theory that convert the computation of the minimum distance of a repeated-root cyclic code to that of a shorter simple-root cyclic code, we prove that the repeated-root cyclic codes of lengths $p^{s}$ and $lp^{s}$ are in general better than narrow-sense BCH codes of close lengths in terms of minimum distance, thereby enabling the obtained synchronizable codes to correct more Pauli errors.
pp. 1461-1470

Maximum Weight Matching Using Odd-Sized Cycles: Max-Product Belief Propagation and Half-Integrality
… a posteriori (MAP) assignment in a joint probability distribution represented by a graphical model (GM), and the respective LPs can be considered as continuous relaxations of the discrete MAP problem.
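To ground the maximum-weight matching (MWM) problem this abstract studies, here is a toy brute-force solver on a tiny invented graph containing an odd cycle; the paper's BP/LP machinery targets the same discrete optimum on far larger instances (this sketch is for illustration only and is not the paper's algorithm):

```python
from itertools import combinations

def max_weight_matching(edges):
    """Brute-force MWM: enumerate all edge subsets and keep the
    heaviest one whose edges share no vertex (exponential; toy only)."""
    best_weight, best_match = 0.0, set()
    for r in range(1, len(edges) + 1):
        for subset in combinations(edges, r):
            endpoints = [p for (u, v, _) in subset for p in (u, v)]
            if len(endpoints) == len(set(endpoints)):  # vertex-disjoint
                weight = sum(w for (_, _, w) in subset)
                if weight > best_weight:
                    best_weight, best_match = weight, set(subset)
    return best_weight, best_match

# Invented instance: a triangle 0-1-2 (an odd cycle) plus a pendant edge (2,3).
edges = [(0, 1, 2.0), (1, 2, 2.0), (0, 2, 2.0), (2, 3, 1.5)]
weight, matching = max_weight_matching(edges)
print(weight)  # -> 3.5, attained by the matching {(0, 1), (2, 3)}
```

On this instance no two triangle edges can coexist in a matching, which is exactly the kind of odd-cycle structure that motivates the C-LP constraints discussed in the abstract.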
It was recently shown that a BP algorithm converges to the correct MAP/MWM assignment under a simple GM formulation of MWM, as long as the corresponding LP relaxation is tight. First, motivated by forcing the tightness condition, we consider a new GM formulation of MWM, say C-GM, using non-intersecting odd-sized cycles in the graph; the new corresponding LP relaxation, say C-LP, becomes tight for more MWM instances. However, the tightness of C-LP no longer guarantees such convergence and correctness of the new BP on C-GM. To address the issue, we introduce a novel graph transformation applied to C-GM, which results in another GM formulation of MWM, and prove that the respective BP on it converges to the correct MAP/MWM assignment, as long as C-LP is tight. Finally, we also show that C-LP always has half-integral solutions, which leads to an efficient BP-based MWM heuristic consisting of making sequential, "cutting plane", modifications to the underlying GM. Our experiments show that this BP-based cutting-plane heuristic performs as well as that based on traditional LP solvers.
pp. 1471-1480

Lattice Codes Achieve the Capacity of Common Message Gaussian Broadcast Channels With Coded Side Information
… coded side information, i.e., prior information in the form of linear combinations of the messages. This channel model is motivated by applications to multi-terminal networks, where the nodes may have access to coded versions of the messages from previous signal hops or through orthogonal channels. The capacity of this channel is known and follows from the work of Tuncel (2006), which is based on random coding arguments. In this paper, following the approach of Erez and Zamir, we design lattice codes for this family of channels when the source messages are symbols from a finite field $\mathbb{F}_{p}$ of prime size.
Our coding scheme utilizes Construction A lattices designed over the same prime field $\mathbb{F}_{p}$ and uses algebraic binning at the decoders to expurgate the channel code and obtain good lattice subcodes, for every possible set of linear combinations available as side information. The achievable rate of our coding scheme is a function of the size $p$ of the underlying prime field, and approaches the capacity as $p$ tends to infinity.
pp. 1481-1496

Extended Product and Integrated Interleaved Codes
pp. 1497-1513

Speeding Up Distributed Machine Learning Using Codes
… robustness against noise. In large-scale systems, there are several types of noise that can affect the performance of distributed machine learning algorithms (straggler nodes, system failures, or communication bottlenecks), but there has been little interaction cutting across codes, machine learning, and distributed systems. In this paper, we provide theoretical insights on how coded solutions can achieve significant gains compared with uncoded ones. We focus on two of the most basic building blocks of distributed learning algorithms: matrix multiplication and data shuffling. For matrix multiplication, we use codes to alleviate the effect of stragglers and show that if the number of homogeneous workers is $n$ and the runtime of each subtask has an exponential tail, coded computation can speed up distributed matrix multiplication by a factor of $\log n$. For data shuffling, we use codes to reduce communication bottlenecks, exploiting the excess in storage. We show that when a constant fraction $\alpha$ of the data matrix can be cached at each worker, and $n$ is the number of workers, coded shuffling reduces the communication cost by a factor of $\left(\alpha + \frac{1}{n}\right)\gamma(n)$ compared with uncoded shuffling, where $\gamma(n)$ is the ratio of the cost of unicasting $n$ messages to $n$ users to that of multicasting a common message (of the same size) to $n$ users.
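The straggler-mitigation idea in "Speeding Up Distributed Machine Learning Using Codes" can be sketched with the simplest possible instance, an invented (n, k) = (3, 2) single-parity code for computing A·x: any 2 of the 3 workers' results suffice, so one straggler can be ignored (the paper analyzes general MDS codes and proves the $\log n$ speedup; this is only a toy):

```python
def matvec(M, x):
    """Plain matrix-vector product over nested lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# Invented data: split A row-wise into A1, A2 and add a parity block A1 + A2.
A1 = [[1, 2], [3, 4]]
A2 = [[5, 6], [7, 8]]
A3 = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
x = [1, 1]

# Each of three workers computes one coded subtask.
results = {1: matvec(A1, x), 2: matvec(A2, x), 3: matvec(A3, x)}

# Suppose worker 2 straggles: its block is recovered as (A1 + A2)x - A1 x,
# so the full product A x is available after only 2 of 3 workers finish.
recovered = [p - t for p, t in zip(results[3], results[1])]
print(recovered == results[2])  # -> True
```

With subtask runtimes having exponential tails, waiting for the fastest k of n coded workers rather than all n uncoded workers is what yields the logarithmic speedup cited in the abstract.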
For instance, $\gamma(n) \simeq n$ if multicasting a message to $n$ users is as cheap as unicasting a message to one user. We also provide experimental results corroborating the theoretical gains of our coded algorithms.
pp. 1514-1529

Variable Packet-Error Coding
… $N$ packets, an unknown number of which are subject to adversarial errors en route to the decoder. We seek code designs for which the decoder is guaranteed to be able to reproduce the source subject to a certain distortion constraint when there are no packet errors, subject to a less stringent distortion constraint when there is one error, and so on. Focusing on the special case of the erasure distortion measure, we introduce a code design based on the polytope codes of Kosut et al. The resulting designs are also applied to a separate problem in distributed storage.
pp. 1530-1547

Caching and Delivery via Interference Elimination
pp. 1548-1560

LDA Lattices Without Dithering Achieve Capacity on the Gaussian Channel
pp. 1561-1594

Lattice Codes for Deletion and Repetition Channels
… $\mathbb{Z}_{4}$-codes. A lower bound on the size of our codes for the Manhattan distance is obtained through generalized theta series of the corresponding lattices. For any fixed number of deletions, our method supplies a correction technique provided the number of runs is large enough. For a fixed number of runs and large binary sequence length, our lattice construction is shown to be tight up to constants.
pp. 1595-1603

Systematic Block Markov Superposition Transmission of Repetition Codes
… a posteriori decoding. The derived lower bound reveals connections among BER, encoding memory, and code rate, which provides a way to design good systematic BMST-R codes and also allows us to make trade-offs among efficiency, performance, and complexity.
Numerical results show that: 1) the proposed bounds are tight in the high signal-to-noise-ratio region; 2) systematic BMST-R codes perform well over a wide range of code rates; and 3) rate-1/2 systematic BMST-R codes outperform the considered (3,6)- and (4,8)-regular spatially coupled low-density parity-check codes under an equal decoding latency constraint.
pp. 1604-1620

Information-Theoretically Secure Erasure Codes for Distributed Storage
pp. 1621-1646

Achieving Secrecy Capacity of the Gaussian Wiretap Channel With Polar Lattices
… $\Lambda_{s}$ Gaussian wiretap channel (GWC). Then, we propose an explicit shaping scheme to remove this mod-$\Lambda_{s}$ front end and extend polar lattices to the genuine GWC. The shaping technique is based on the lattice Gaussian distribution, which leads to a binary asymmetric channel at each level for the multilevel lattice codes. By employing the asymmetric polar coding technique, we construct an AWGN-good lattice and a secrecy-good lattice with optimal shaping simultaneously. As a result, the encoding complexity for the sender and the decoding complexity for the legitimate receiver are both $O(N \log N \log(\log N))$. The proposed scheme is proven to be semantically secure.
pp. 1647-1665

Near-Optimal Compressed Sensing of a Class of Sparse Low-Rank Matrices Via Sparse Power Factorization
… $\ell_{1}$-norm achieves near-optimal performance, for compressed sensing of sparse low-rank matrices it has been shown recently that convex programs using the nuclear norm and the mixed norm are highly suboptimal, even in the noise-free scenario. We propose an alternating minimization algorithm, called sparse power factorization (SPF), for compressed sensing of sparse rank-one matrices.
For a class of signals whose sparse representation coefficients are fast-decaying, SPF achieves stable recovery of the rank-one matrix formed by their outer product and requires a number of measurements within a logarithmic factor of the information-theoretic fundamental limit. For the recovery of general sparse low-rank matrices, we propose subspace-concatenated SPF (SCSPF), which has near-optimal performance guarantees analogous to those of SPF in the rank-one case. Numerical results show that SPF and SCSPF empirically outperform convex programs using the best known combinations of mixed norm and nuclear norm.
pp. 1666-1698

A Proof of Conjecture on Restricted Isometry Property Constants $\delta_{tk} \left(0<t<\frac{4}{3}\right)$
… $\delta_{tk}$ $\left(0<t<\frac{4}{3}\right)$, which was proposed by T. Cai and A. Zhang. We show that when $0<t<\frac{4}{3}$, the condition $\delta_{tk}<\frac{t}{4-t}$ is sufficient to guarantee the exact recovery of all $k$-sparse signals in the noiseless case via constrained $\ell_{1}$-norm minimization. These bounds are sharp in the sense that for any $\epsilon>0$, $\delta_{tk}<\frac{t}{4-t}+\epsilon$ cannot guarantee the exact recovery of some $k$-sparse signals. Furthermore, similar characterizations also hold for low-rank matrix recovery. Thus, combined with T. Cai and A. Zhang's work, a complete characterization of the sharp RIP constants $\delta_{tk}$ for all $t>0$ is obtained that guarantees the exact recovery of all $k$-sparse signals, and of all matrices with rank at most $k$, by $\ell_{1}$-norm minimization and nuclear norm minimization, respectively. Noisy cases and approximately sparse cases are also considered. To solve the conjecture, we construct a few identities so that the RIP of order $tk$, which is the target of our main results, can be applied to them directly.
pp. 1699-1705

Sampling and Distortion Tradeoffs for Bandlimited Periodic Signals
pp. 1706-1724

Model Consistency of Partly Smooth Regularizers
pp. 1725-1737

Convex and Nonconvex Formulations for Mixed Regression With Two Components: Minimax Optimal Rates
pp. 1738-1766

Denoising Flows on Trees
pp. 1767-1783

Approximate Asymptotic Distribution of Locally Most Powerful Invariant Test for Independence: Complex Case
… exact distribution for a test statistic. In this paper, asymptotic distributions of the locally most powerful invariant test for independence of complex Gaussian vectors are developed. In particular, its cumulative distribution function (CDF) under the null hypothesis is approximated by a function of chi-squared CDFs. Moreover, the CDF corresponding to the non-null distribution is expressed in terms of non-central chi-squared CDFs for close hypotheses, and in terms of the Gaussian CDF and its derivatives for far hypotheses. The results turn out to be very accurate in terms of fitting their empirical counterparts. A closed-form expression for the detection threshold is also provided. Numerical results are presented to validate our theoretical findings.
pp. 1784-1799

A Multivariate Hawkes Process With Gaps in Observations
… a large number of missing events by introducing a small number of unknown boundary conditions. In the case where our observations are sparse (e.g., from 10% to 30%), we show through numerical simulations that robust recovery with MHPG is still possible even if the lengths of the observed intervals are small, provided they are chosen accordingly.
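For context on the model in "A Multivariate Hawkes Process With Gaps in Observations": a univariate, fully observed Hawkes process with an exponential kernel can be simulated by Ogata's thinning method. The sketch below uses invented parameters and ignores the paper's gap/boundary-condition machinery entirely:

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Ogata thinning for a univariate Hawkes process with intensity
    lambda(t) = mu + sum over past events t_i of alpha*exp(-beta*(t - t_i)).
    With an exponential kernel the intensity only decays between events,
    so the intensity at the current time upper-bounds it until the next one."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while True:
        lam_bar = mu + sum(alpha * math.exp(-beta * (t - s)) for s in events)
        t += rng.expovariate(lam_bar)          # candidate next event time
        if t >= horizon:
            return events
        lam_t = mu + sum(alpha * math.exp(-beta * (t - s)) for s in events)
        if rng.random() * lam_bar <= lam_t:    # accept w.p. lambda(t)/lam_bar
            events.append(t)

# Invented, stable parameters (branching ratio alpha/beta = 0.4 < 1).
events = simulate_hawkes(mu=0.5, alpha=0.8, beta=2.0, horizon=100.0)
print(len(events))  # roughly mu*T/(1 - alpha/beta), about 83 on average
```

The self-exciting sums over past events are exactly the terms that become unknown when stretches of the history are unobserved, which is the difficulty the paper's boundary conditions address.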
The numerical results also show that knowledge of the gaps and imposing the right boundary conditions are crucial for discovering the underlying patterns and hidden relationships.
pp. 1800-1811

Information Geometry of Generalized Bayesian Prediction Using $\alpha$-Divergences as Loss Functions
… $\alpha$-divergences as the loss functions, the optimality and asymptotic properties of the generalized Bayesian predictive densities are considered. We show that the Bayesian predictive densities minimize a generalized Bayes risk. We also find that the asymptotic expansions of the densities are related to the coefficients of the $\alpha$-connections of a statistical manifold. In addition, we discuss the difference between the two risk functions of the generalized Bayesian predictions based on different priors. Finally, using non-informative priors (i.e., Jeffreys and reference priors), a uniform prior, and a conjugate prior, two examples are presented to illustrate the main results.
pp. 1812-1824

Information Measures, Inequalities and Performance Bounds for Parameter Estimation in Impulsive Noise Environments
pp. 1825-1844

A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction
pp. 1845-1866

Sampling Constrained Asynchronous Communication: How to Sleep Efficiently
… $\rho \in (0,1]$ of the channel outputs, there is no capacity penalty. That is, for any strictly positive sampling rate $\rho$, the asynchronous capacity per unit cost is the same as under full sampling, i.e., when $\rho=1$. Moreover, there is no penalty in terms of decoding delay. These results are asymptotic in nature, considering the limit as the number $B$ of bits to be transmitted tends to infinity while the sampling rate $\rho$ remains fixed. A natural question is then whether the sampling rate $\rho(B)$ can drop to zero without introducing a capacity (or delay) penalty compared with full sampling.
We answer this question affirmatively. The main result of this paper is an essentially tight characterization of the minimum sampling rate: any sampling rate that is $\omega(1/B)$ is achievable, while any sampling rate that is $o(1/B)$ yields unreliable communication. The key ingredient in our improved achievability result is a new, multi-phase adaptive sampling scheme for locating transient changes, which we believe may be of independent interest for certain change-point detection problems.
pp. 1867-1878

Strong Data Processing Inequalities for Input Constrained Additive Noise Channels
… $W \to X \to Y$ forming a Markov chain, where $Y = X + Z$ with $X$ and $Z$ real-valued and independent, and $X$ bounded in $L_{p}$-norm. It is shown that $I(W;Y) \le F_{I}(I(W;X))$ with $F_{I}(t) < t$ whenever $t > 0$, if and only if $Z$ has a density whose support is not disjoint from any translate of itself. A related question is to characterize for which couplings $(W,X)$ the mutual information $I(W;Y)$ is close to the maximum possible. To that end, we show that in order to saturate the channel, i.e., for $I(W;Y)$ to approach capacity, it is mandatory that $I(W;X) \to \infty$ (under suitable conditions on the channel). A key ingredient for this result is a deconvolution lemma, which shows that the postconvolution total variation distance bounds the preconvolution Kolmogorov-Smirnov distance. Explicit bounds are provided for the special case of the additive Gaussian noise channel with a quadratic cost constraint, and these bounds are shown to be order optimal. For this case, simplified proofs are provided leveraging Gaussian-specific tools such as the connection between information and estimation (I-MMSE) and Talagrand's information-transportation inequality.
pp. 1879-1892

The Rate-and-State Capacity with Feedback
pp. 1893-1918

Determining Optimal Rates for Communication for Omniscience
pp. 1919-1944

The Capacity of Private Information Retrieval From Coded Databases
… $N$ non-colluding databases, each storing an MDS-coded version of $M$ messages. In the PIR problem, the user wishes to retrieve one of the available messages without revealing the message identity to any individual database. We derive the information-theoretic capacity of this problem, defined as the maximum number of bits of the desired message that can be privately retrieved per bit of downloaded information. We show that the PIR capacity in this case is $C=\left(1+\frac{K}{N}+\frac{K^{2}}{N^{2}}+\cdots+\frac{K^{M-1}}{N^{M-1}}\right)^{-1}=\left(1+R_{c}+R_{c}^{2}+\cdots+R_{c}^{M-1}\right)^{-1}=\frac{1-R_{c}}{1-R_{c}^{M}}$, where $R_{c}$ is the rate of the $(N,K)$ MDS code used. The capacity is a function of the code rate and the number of messages only, regardless of the explicit structure of the storage code. The result implies a fundamental tradeoff between the optimal retrieval cost and the storage cost when the storage code is restricted to the class of MDS codes. The result generalizes the achievability and converse results for classical PIR with replicated databases to the case of MDS-coded databases.
pp. 1945-1956

A Rate-Distortion Approach to Caching
… $\mathsf{f}$-separable distortion functions recently introduced by Shkel and Verdú.
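The two closed forms of the PIR capacity in "The Capacity of Private Information Retrieval From Coded Databases" above are the same geometric series; a quick numerical check confirms this (the (N, K, M) values below are hypothetical):

```python
def pir_capacity_series(N, K, M):
    """C = (1 + Rc + Rc^2 + ... + Rc^(M-1))^(-1) with code rate Rc = K/N."""
    Rc = K / N
    return 1.0 / sum(Rc ** m for m in range(M))

def pir_capacity_closed(N, K, M):
    """Equivalent closed form C = (1 - Rc) / (1 - Rc^M)."""
    Rc = K / N
    return (1.0 - Rc) / (1.0 - Rc ** M)

# Hypothetical (4, 2) MDS storage with M = 3 messages: Rc = 1/2,
# so C = 1 / (1 + 1/2 + 1/4) = 4/7.
a = pir_capacity_series(4, 2, 3)
b = pir_capacity_closed(4, 2, 3)
print(abs(a - b) < 1e-12)  # -> True
print(a)                   # 4/7, about 0.5714
```

As the abstract notes, only the code rate $R_c$ and the number of messages $M$ enter the formula, not the structure of the MDS code itself.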
The class of $\mathsf{f}$-separable distortion functions includes separable distortion functions as a special case, and our analysis covers both the expected- and excess-distortion settings in detail. We also determine what "common information" should be placed in the cache and what information should be transmitted during the delivery phase. To this end, two new common-information measures are introduced for caching, and their relationship to the common-information measures of Wyner, Gács, and Körner is discussed in detail.
pp. 1957-1976

Capacity of Continuous-Space Electromagnetic Channels With Lossy Transceivers
pp. 1977-1991

Uplink-Downlink Duality for Integer-Forcing
… $\mathbf{H}$ and a Gaussian MIMO broadcast channel (BC) with channel matrix $\mathbf{H}^{\mathsf{T}}$. For the MIMO MAC, the integer-forcing architecture consists of first decoding integer-linear combinations of the transmitted codewords, which are then solved for the original messages. For the MIMO BC, the integer-forcing architecture consists of pre-inverting the integer-linear combinations at the transmitter so that each receiver can obtain its desired codeword by decoding an integer-linear combination. In both cases, integer-forcing offers higher achievable rates than zero-forcing while maintaining a similar implementation complexity. This paper establishes an uplink-downlink duality relationship for integer-forcing: any sum rate that is achievable via integer-forcing on the MIMO MAC can be achieved via integer-forcing on the MIMO BC with the same sum power, and vice versa. Using this duality relationship, it is shown that integer-forcing can operate within a constant gap of the MIMO BC sum capacity.
Finally, the paper proposes a duality-based iterative algorithm for the non-convex problem of selecting optimal beamforming and equalization vectors, and establishes that it converges to a local optimum.
pp. 1992-2011

On the Minimum Mean $p$th Error in Gaussian Noise Channels and Its Applications
… $p$th error (MMPE), is considered. The classical minimum mean square error (MMSE) is a special case of the MMPE. Several bounds, properties, and applications of the MMPE are derived and discussed. The optimal MMPE estimator is found for Gaussian and binary input distributions. Properties of the MMPE as a function of the input distribution, the signal-to-noise ratio (SNR), and the order $p$ are derived. The "single-crossing-point property" (SCPP), which provides an upper bound on the MMSE and which, together with the mutual information-MMSE relationship, is a powerful tool in deriving converse proofs in multi-user information theory, is extended to the MMPE. Moreover, a complementary bound to the SCPP is derived. As a first application of the MMPE, a bound on the conditional differential entropy in terms of the MMPE is provided, which then yields a generalization of the Ozarow-Wyner lower bound on the mutual information achieved by a discrete input on a Gaussian noise channel. As a second application, the MMPE is shown to improve on previous characterizations of the phase transition phenomenon that manifests, in the limit as the length of the capacity-achieving code goes to infinity, as a discontinuity of the MMSE as a function of SNR. As a final application, the MMPE is used to show new bounds on the second derivative of mutual information, or equivalently on the first derivative of the MMSE.
pp. 2012-2037

On Achievable Rates of AWGN Energy-Harvesting Channels With Block Energy Arrival and Non-Vanishing Error Probabilities
… $L$, which can be interpreted as the coherence time of the energy-arrival process.
If $L$ is a constant or grows sublinearly in the blocklength $n$, we fully characterize the first-order term in the asymptotic expansion of the maximum transmission rate subject to a fixed tolerable error probability $\varepsilon$; the first-order term is known as the $\varepsilon$-capacity. In addition, we obtain lower and upper bounds on the second-order term in the asymptotic expansion, which reveal that the second-order term is proportional to $-(L/n)^{1/2}$ for any $\varepsilon$ less than 1/2. The lower bound is obtained by analyzing the save-and-transmit strategy. If $L$ grows linearly in $n$, we obtain lower and upper bounds on the $\varepsilon$-capacity, which coincide whenever the cumulative distribution function of the EH random variable is continuous and strictly increasing. To achieve the lower bound, we propose a novel adaptive save-and-transmit strategy, which chooses different save-and-transmit codes across different blocks according to the energy variation across the blocks.
pp. 2038-2064

TDMA is Optimal for All-Unicast DoF Region of TIM if and only if Topology is Chordal Bipartite
pp. 2065-2076

A New Wiretap Channel Model and Its Strong Secrecy Capacity
… doubly-exponential convergence rate for the probability that, for a fixed choice of the subset, the key is uniform and independent of the public discussion and the wiretapping source's observation. The converse is derived by using Sanov's theorem to upper-bound the secrecy capacity of the generalized wiretap channel by the secrecy capacity when the tapped subset is randomly chosen by nature.
pp. 2077-2092

Secure Degrees of Freedom of the Multiple Access Wiretap Channel With Multiple Antennas
… $N$ antennas at each transmitter, $N$ antennas at the legitimate receiver, and $K$ antennas at the eavesdropper. We determine the optimal sum secure degrees of freedom (s.d.o.f.) for this model for all values of $N$ and $K$. We subdivide the problem into several regimes based on the values of $N$ and $K$, and provide achievable schemes based on vector-space alignment and real alignment techniques for fixed and fading channel gains. To prove the optimality of the achievable schemes, we provide matching converses for each regime. Our results show how the number of eavesdropper antennas affects the optimal sum s.d.o.f. of the multiple access wiretap channel.
pp. 2093-2103

Degraded Broadcast Channel With Secrecy Outside a Bounded Range
… $K$-receiver degraded broadcast channel with secrecy outside a bounded range is studied, in which a transmitter sends $K$ messages to $K$ receivers, and the channel quality gradually degrades from receiver $K$ to receiver 1. Each receiver $k$ is required to decode messages $W_{1},\ldots,W_{k}$, for $1 \leq k \leq K$, and to be kept ignorant of $W_{k+2},\ldots,W_{K}$, for $k=1,\ldots,K-2$. Thus, each message $W_{k}$ is kept secure from receivers with channel quality at least two levels worse, i.e., receivers $1,\ldots,k-2$. The secrecy capacity region is fully characterized. The achievable scheme designates one superposition layer to each message, with binning employed for each layer. Joint embedded coding and binning are employed to protect all upper-layer messages from lower-layer receivers. Furthermore, the scheme allows adjacent layers to share rates, so that part of the rate of each message can be shared with its immediate upper-layer message to enlarge the rate region. More importantly, an induction approach is developed to perform Fourier-Motzkin elimination of $2K$ variables from on the order of $K^{2}$ bounds to obtain a closed-form achievable rate region. An outer bound is developed that matches the achievable rate region, whose proof involves a recursive construction of the rate bounds and exploits intuition gained from the achievable scheme.
pp. 2104-2120

Shared Rate Process for Mobile Users in Poisson Networks and Applications
pp. 2121-2141

When Can Intelligent Helper Node Selection Improve the Performance of Distributed Storage Networks?
… $k$ out of $n$ surviving nodes should be able to reconstruct the protected file; and if one node fails, the replacement node can access $d$ helper nodes to repair its content either functionally or exactly.
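The baseline "any $k$ of $n$ nodes reconstruct the file" property that this abstract's DSN model starts from can be illustrated by a toy $(n,k)=(4,2)$ code over GF(7), where any 2 of the 4 nodes determine the file. All values (field size, coefficients, file contents) are invented, and repair bandwidth and helper selection, the paper's actual subject, are not modeled here:

```python
P = 7  # a small prime field GF(7), chosen arbitrarily for the toy

def encode(a, b):
    """File is the pair (a, b); node i stores the share a + i*b (mod P).
    The coefficient vectors (1, i) are pairwise linearly independent, so any
    two distinct nodes give an invertible 2x2 system (an MDS-style property)."""
    return [(a + i * b) % P for i in range(4)]

def decode(i, share_i, j, share_j):
    """Solve a + i*b = share_i, a + j*b = share_j over GF(P), i != j."""
    inv = pow((i - j) % P, P - 2, P)          # modular inverse via Fermat
    b = ((share_i - share_j) * inv) % P
    a = (share_i - i * b) % P
    return a, b

shares = encode(3, 5)
ok = all(decode(i, shares[i], j, shares[j]) == (3, 5)
         for i in range(4) for j in range(4) if i != j)
print(shares, ok)  # -> [3, 1, 6, 4] True
```

The paper's question is what happens one layer above this reconstruction guarantee: when a node fails, does choosing which $d$ helpers it contacts ever beat contacting arbitrary ones?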
Two major existing approaches for DSNs are the so-called regenerating codes (RCs) and locally repairable codes (LRCs), which have different design philosophies and focus on distinct applications. Instead of being limited by the framework of either RCs or LRCs, this work answers a fundamental question for general DSNs: for an arbitrarily given $(n,k,d)$ value, does there exist an intelligent helper node selection design that can strictly improve the storage-bandwidth tradeoff compared with naive blind helper selection? Surprisingly, the answer is negative for a large set of $(n,k,d)$ values; for those values, even the best helper selection design offers no gain over a blind solution. We call such $(n,k,d)$ values indifferent-to-helper-selection (ITHS). The main contribution of this work is a necessary and sufficient condition that characterizes whether an $(n,k,d)$ value is ITHS. As a fundamental study, this work assumes functional repair with unlimited computing power for encoding/decoding and focuses on the fundamental performance limits of intelligent helper selection. A new helper selection scheme, termed family helper selection, is proposed and used in the achievability analysis. For some scenarios, the proposed scheme is indeed optimal (as good as any helper selection one can design).
pp. 2142-2171

[Blank page]
p. 2172

IEEE Transactions on Information Theory information for authors
p. C3