Skip to Main Content
Variable selection for regression is a classical statistical problem, motivated by concerns that too large a number of covariates may bring about overfitting and unnecessarily high measurement costs. Novel difficulties arise in streaming contexts, where the correlation structure of the process may be drifting, in which case it must be constantly tracked so that selections may be revised accordingly. A particularly interesting phenomenon is that non-selected covariates become missing variables, inducing bias on subsequent decisions. This raises an intricate exploration-exploitation tradeoff, whose dependence on the covariance tracking algorithm and the choice of variable selection scheme is too complex to be dealt with analytically. We hence capitalise on the strength of simulations to explore this problem, taking the opportunity to tackle the difficult task of simulating dynamic correlation structures.