Abstract:
Principal component analysis (PCA) is one of the most fundamental techniques for Big Data analytics in, e.g., smart manufacturing and biostatistics, and is capable of extracting the most essential information from a dataset. However, when applied to multiple datasets, PCA cannot reveal the inherent data structure that is specific to one dataset relative to the other(s), a task we term contrastive data analytics in this paper. Although a number of proposals such as contrastive or discriminative PCA have been suggested, they require fine-tuning of hyper-parameters or cannot effectively deal with nonlinear data. In this context, we advocate deep contrastive (Dc) PCA for nonlinear contrastive data analytics, which leverages the power of deep neural networks to explore the hidden nonlinear relationships in the datasets and extract the desired contrastive features. An alternating minimization algorithm is developed that simultaneously seeks the best nonlinear transformations for the data and the associated contrastive projections; its updates amount to an eigenvalue decomposition and a back-propagation step. Substantial numerical tests on synthetic data and on real datasets from protein expression, electroencephalography, wearable human motion, and smart manufacturing corroborate the superior adaptivity of DcPCA in dealing with nonlinear data relative to state-of-the-art alternatives.
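To make the contrastive setting concrete, the following is a minimal sketch of the linear contrastive PCA baseline mentioned above, not of DcPCA itself. It assumes the common formulation in which the contrastive directions are the leading eigenvectors of the target covariance minus alpha times the background covariance, where alpha is the hyper-parameter that needs tuning. The function name, argument names, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def contrastive_pca(target, background, alpha=1.0, n_components=2):
    """Linear contrastive PCA sketch (illustrative baseline, not DcPCA).

    Returns the target data projected onto the top eigenvectors of
    C_target - alpha * C_background, together with those eigenvectors.
    """
    # Center each dataset with its own mean.
    t = target - target.mean(axis=0)
    b = background - background.mean(axis=0)
    # Sample covariance matrices (features x features).
    c_t = t.T @ t / (t.shape[0] - 1)
    c_b = b.T @ b / (b.shape[0] - 1)
    # The contrast matrix is symmetric; eigh returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(c_t - alpha * c_b)
    order = np.argsort(eigvals)[::-1][:n_components]
    directions = eigvecs[:, order]
    return t @ directions, directions

# Synthetic example (assumed data): both datasets share a dominant direction
# (feature 0); only the target has extra variance along feature 1.
rng = np.random.default_rng(0)
background = rng.normal(size=(2000, 3)) * np.array([3.0, 0.5, 0.5])
target = rng.normal(size=(2000, 3)) * np.array([3.0, 2.0, 0.5])

# alpha = 0 reduces to ordinary PCA on the target data.
_, pca_dirs = contrastive_pca(target, background, alpha=0.0, n_components=1)
projected, cpca_dirs = contrastive_pca(target, background, alpha=1.0, n_components=1)
```

In this toy setup, ordinary PCA on the target is dominated by the shared direction, whereas subtracting the background covariance shifts the leading direction toward the feature whose variance is specific to the target. The outcome depends on the choice of alpha, which is the tuning burden the abstract refers to.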
Published in: IEEE Transactions on Signal Processing (Volume: 70)