Selecting features for classification, clustering, and approximation is an important task in pattern recognition, data mining, and soft computing. For real-valued features, this contribution shows how feature selection among a large number of features can be implemented using mutual information. In particular, it addresses a common problem in computing mutual information, namely estimating joint probabilities in many dimensions from only a few samples, by using the Rényi mutual information of order two as the computational basis. The convexity property is proved for ranked target samples. For real-world applications such as process modelling, the treatment of missing values is included. An example shows how the relevant features and their time lags are determined in time series, even when the output depends nonlinearly on the features. Owing to its computationally efficient implementation, mutual information becomes an attractive tool for feature selection, even for a large number of real-valued features.
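To make the idea of an order-two dependence score concrete, the following is a minimal sketch and not the estimator developed in this contribution. It assumes a swapped-in technique: the Cauchy-Schwarz quadratic mutual information with Gaussian Parzen windows, which, like the order-two Rényi entropy, reduces to sums over pairwise sample distances and so needs no explicit high-dimensional histogram. The function names (`standardize`, `gram`, `cs_qmi`, `best_lag`), the kernel width `sigma=0.3`, and the synthetic series are all illustrative assumptions. Plain Python is used for readability; a vectorized version would be preferable for large samples, and missing values are not handled here.

```python
import math
import random


def standardize(values):
    """Rescale a sample to zero mean and unit variance (assumes it is not constant)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def gram(values, sigma):
    """Pairwise Gaussian kernel matrix.

    Two Parzen windows of width sigma convolve to a Gaussian of variance
    2 * sigma**2, hence the 4 * sigma**2 in the exponent. The normalizing
    constant is omitted because it cancels in the ratio used by cs_qmi.
    """
    scale = 4.0 * sigma ** 2
    return [[math.exp(-((a - b) ** 2) / scale) for b in values] for a in values]


def cs_qmi(x, y, sigma=0.3):
    """Cauchy-Schwarz quadratic mutual information of two 1-D samples.

    Illustrative stand-in for an order-two dependence measure; the result is
    non-negative up to rounding and grows with statistical dependence.
    """
    n = len(x)
    kx = gram(standardize(x), sigma)
    ky = gram(standardize(y), sigma)
    row_x = [sum(row) / n for row in kx]  # Parzen density of x at each sample
    row_y = [sum(row) / n for row in ky]  # Parzen density of y at each sample
    v_joint = sum(kx[i][j] * ky[i][j] for i in range(n) for j in range(n)) / n ** 2
    v_marg = (sum(row_x) / n) * (sum(row_y) / n)
    v_cross = sum(a * b for a, b in zip(row_x, row_y)) / n
    return math.log(v_joint * v_marg / v_cross ** 2)


def best_lag(x, y, max_lag, sigma=0.3):
    """Score y[t] against x[t - lag] for lag = 0..max_lag.

    Returns the lag with the highest score and the list of all scores.
    """
    scores = [
        cs_qmi(x[: len(x) - lag], y[lag:], sigma) for lag in range(max_lag + 1)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    rng = random.Random(0)
    n = 200
    history = [rng.uniform(-1.0, 1.0) for _ in range(n + 2)]
    x = history[2:]  # observed input series
    # Synthetic target: y[t] depends nonlinearly on x two steps earlier.
    y = [v ** 2 + rng.gauss(0.0, 0.05) for v in history[:n]]
    lag, scores = best_lag(x, y, max_lag=4)
    print("selected lag:", lag)
    print("scores per lag:", [round(s, 3) for s in scores])
```

In this toy setting the target is a squared, delayed copy of the input, so a linear correlation would be close to zero at every lag, whereas a dependence score of this kind can still single out the relevant lag. Ranking several candidate features would repeat the same scoring for each feature and lag.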