Abstract:
Fuzzy-rough sets (FRS) provide a powerful theoretical tool for feature selection (FS). Although promising, the FRS model is sensitive to noisy information and performs poorly on data with large differences in class density, and existing FRS-based FS methods tackle only one of these challenges. To overcome both issues, this article presents, for the first time, a robust FS algorithm based on a linear reconstruction measure. First, a pseudo FRS model is proposed. Its fuzzy similarity relation is a distribution-aware linear reconstruction relation, constructed from meaningful information (i.e., the distribution information of samples and the density information of classes) to enhance robustness. The pseudo fuzzy-rough approximations are then redefined on k-nearest-neighbor (kNN) granules determined by the linear reconstruction coefficients, which strengthens the antinoise ability. Next, the pseudo FRS model is employed to guide the robust FS algorithm from the perspectives of redundancy filtering, strongly relevant priority, and discriminative selection to determine the final feature subset. Experimental results on 31 datasets and in practical applications (i.e., cancer diagnosis and face recognition) demonstrate that the reduct obtained by the proposed approach generally outperforms those obtained by alternative FRS-based FS implementations and state-of-the-art FS techniques.
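The abstract does not give the formulas behind the linear reconstruction relation, so the following is only a minimal sketch of the general idea of choosing kNN granules by reconstruction coefficients. It assumes a standard sum-to-one least-squares reconstruction (as used in locally linear embedding) with a small ridge term; it is not the authors' distribution-aware relation, and the function names `reconstruction_weights` and `knn_granule` and the parameter `reg` are hypothetical.

```python
import numpy as np

def reconstruction_weights(X, i, reg=1e-3):
    """Coefficients that reconstruct X[i] from all other rows (assumed LLE-style)."""
    others = np.delete(np.arange(X.shape[0]), i)
    Z = X[others] - X[i]                 # shift so that X[i] sits at the origin
    G = Z @ Z.T                          # Gram matrix of the shifted samples
    # Ridge term keeps G well conditioned (an assumption, not from the paper).
    G = G + reg * (np.trace(G) + 1e-12) * np.eye(len(others))
    w = np.linalg.solve(G, np.ones(len(others)))
    return others, w / w.sum()           # normalise so the coefficients sum to one

def knn_granule(X, i, k, reg=1e-3):
    """Indices and coefficients of the k samples with the largest coefficients."""
    others, w = reconstruction_weights(X, i, reg)
    top = np.argsort(-w)[:k]             # k largest reconstruction coefficients
    return others[top], w[top]

# Toy data: two loose groups of points in the plane.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]])
granule_idx, granule_w = knn_granule(X, i=0, k=2)
```

In this sketch the granule of a sample is defined by how much each other sample contributes to reconstructing it, rather than by raw distance alone; how the paper turns such coefficients into fuzzy approximations is not stated in the abstract.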
Published in: IEEE Transactions on Fuzzy Systems (Volume: 32, Issue: 10, October 2024)