I. Introduction
Sparse linear regression is the recovery of a sparse vector $x \in \mathbb{R}^n$ from linear measurements $y = Ax \in \mathbb{R}^m$, with $A \in \mathbb{R}^{m \times n}$. A vector is sparse if it has few nonzero components; more precisely, we call $k$-sparse a vector with $k$ non-zero components. The interest in sparse solutions has several motivations. In machine learning and system identification, one goal is to build models that are as simple as possible from large datasets. Indeed, in many cases the true number of parameters of a system is much smaller than the global dimensionality of the problem, and sparsity supports the removal of redundant parameters. In the recent literature, the identification of linear systems under sparsity constraints is considered in [1], [2], [3], [4]. Furthermore, in the last decade, sparsity has attracted considerable attention due to the theory of compressed sensing (CS, [5], [6]), which states that a sparse vector can be recovered from compressed linear measurements, that is, when $m < n$, under suitable conditions on the structure of $A$.
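To fix ideas, the following sketch (not part of the original development) constructs a $k$-sparse vector and the corresponding compressed measurements $y = Ax$ with $m < n$; the dimensions $n$, $m$, $k$ and the Gaussian choice of $A$ are illustrative assumptions only.

```python
# Minimal sketch of the sparse linear regression setup:
# a k-sparse vector x in R^n observed through m < n linear measurements y = A x.
import numpy as np

rng = np.random.default_rng(0)

n, m, k = 100, 40, 5                 # ambient dimension, measurements, sparsity (illustrative)
A = rng.standard_normal((m, n))      # measurement matrix A in R^{m x n}

# Build a k-sparse vector: k nonzero entries, the rest exactly zero.
x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = rng.standard_normal(k)

y = A @ x                            # compressed measurements (m < n)

print(f"x is {np.count_nonzero(x)}-sparse; y has {y.size} < {x.size} entries")
```

Under the conditions studied in compressed sensing, such a $k$-sparse $x$ can be recovered from $y$ even though the linear system is underdetermined.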