This paper considers which kinds of learning models are suitable for feature extraction in data representation, and suggests two evaluation criteria for nonlinear feature extractors: reconstruction error minimization and similarity preservation. Based on these criteria, a new type of principal curve, the similarity preserving principal curve (SPPC), is proposed. An SPPC minimizes the reconstruction error under the constraint that the similarity between similar samples is preserved in the extracted features, giving researchers effective and reliable insight into the inner structure of data sets. The existence and properties of SPPCs are analyzed, a practical learning algorithm is proposed, and high-dimensional extensions of SPPCs are discussed. Experimental results demonstrate the strengths of SPPCs in preserving the inner structure of data sets and in discovering manifolds with high nonlinearity.
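To make the two evaluation criteria concrete, the sketch below shows one plausible way to measure them for an arbitrary feature extractor. This is an illustration, not the paper's method: the reconstruction error is taken as the mean squared distance between samples and their reconstructions, and similarity preservation is approximated by neighborhood overlap (the fraction of each sample's nearest neighbors in the original space that remain nearest neighbors in feature space). A linear PCA projection is used as a stand-in extractor purely for the demo; the function names and the `k` parameter are assumptions of this sketch.

```python
import numpy as np

def reconstruction_error(X, encode, decode):
    """Mean squared distance between samples and their reconstructions."""
    X_hat = decode(encode(X))
    return float(np.mean(np.sum((X - X_hat) ** 2, axis=1)))

def similarity_preservation(X, Z, k=5):
    """Average fraction of each sample's k nearest neighbours (original
    space) that remain among its k nearest neighbours in feature space.
    1.0 means local similarity structure is fully preserved."""
    def knn(A):
        d = np.sum((A[:, None, :] - A[None, :, :]) ** 2, axis=-1)
        np.fill_diagonal(d, np.inf)       # exclude each point itself
        return np.argsort(d, axis=1)[:, :k]
    nx, nz = knn(X), knn(Z)
    overlap = [len(set(nx[i]) & set(nz[i])) / k for i in range(len(X))]
    return float(np.mean(overlap))

# Demo: near-planar 3-D data, projected with PCA as a stand-in extractor.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 2] = 0.1 * X[:, 0]                   # third coordinate is redundant
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
W = Vt[:2].T                              # top-2 principal directions
encode = lambda A: (A - mu) @ W
decode = lambda Z: Z @ W.T + mu

Z = encode(X)
print("reconstruction error:", reconstruction_error(X, encode, decode))
print("similarity preservation:", similarity_preservation(X, Z))
```

For data lying on a curved manifold, a linear projection would score poorly on one or both criteria, which is the gap a nonlinear extractor such as an SPPC is meant to close.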