Skip to Main Content
We consider the task of dimensionality reduction informed by real-valued multivariate labels. The problem is often treated as Dimensionality Reduction for Regression (DRR), whose goal is to find a low-dimensional representation, the central subspace, of the input data that preserves the statistical correlation with the targets. A class of DRR methods exploits the notion of inverse regression (IR) to discover central subspaces. Whereas most existing IR techniques rely on explicit output space slicing, we propose a novel method called the Covariance Operator Inverse Regression (COIR) that generalizes IR to nonlinear input/output spaces without explicit target slicing. COIR's unique properties make DRR applicable to problem domains with high-dimensional output data corrupted by potentially significant amounts of noise. Unlike recent kernel dimensionality reduction methods that employ iterative nonconvex optimization, COIR yields a closed-form solution. We also establish the link between COIR, other DRR techniques, and popular supervised dimensionality reduction methods, including canonical correlation analysis and linear discriminant analysis. We then extend COIR to semi-supervised settings where many of the input points lack their labels. We demonstrate the benefits of COIR on several important regression problems in both fully supervised and semi-supervised settings.