A data stream is unbounded and arrives at high speed, so traditional clustering algorithms cannot be applied to data stream clustering directly. As an efficient tool for data analysis, the Gaussian mixture model (GMM) has been widely applied in signal and information processing, and it can approximate clusters of arbitrary shape. Two critical problems in cluster analysis are selecting an appropriate number of clusters and partitioning overlapping clusters. Based on an extension of Gaussian mixture modeling, this paper proposes a new feature mining method named Gaussian Mixture Model with Genetic Algorithms. The method performs probability-density-based data stream clustering that requires only the newly arrived data, not the entire historical data, and it can also estimate the optimal number of clusters. The algorithm determines the number of Gaussian clusters and the parameters of each Gaussian through the random split and merge operations of a genetic algorithm. This yields an accurate description of the characteristics of each attribute and thereby enables effective data stream mining.
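The abstract does not spell out the split and merge operators, so the following is only a minimal sketch of what such operators might look like for one-dimensional Gaussian components. Everything in it is an assumption made for illustration: the `Component` type, the `merge` and `split` helpers, the equal-weight split, and the moment-preserving rule (weight, mean, and variance of the pair match those of the single component) are not taken from the paper.

```python
import math
import random
from typing import NamedTuple, Tuple


class Component(NamedTuple):
    """Hypothetical 1-D Gaussian mixture component (illustrative only)."""
    weight: float
    mean: float
    var: float


def merge(a: Component, b: Component) -> Component:
    """Merge two components, preserving total weight, mean, and variance."""
    w = a.weight + b.weight
    mean = (a.weight * a.mean + b.weight * b.mean) / w
    # Weighted second moment E[x^2] of the two-component sub-mixture.
    second = (a.weight * (a.var + a.mean ** 2)
              + b.weight * (b.var + b.mean ** 2)) / w
    return Component(w, mean, second - mean ** 2)


def split(c: Component, rng: random.Random) -> Tuple[Component, Component]:
    """Randomly split one component into two equal-weight components.

    The means are placed at mean +/- d with d = u * sqrt(var), where u is
    drawn uniformly from [0.1, 0.9]. Each child gets variance
    var * (1 - u^2), so merging the children recovers the parent.
    """
    u = rng.uniform(0.1, 0.9)
    d = u * math.sqrt(c.var)
    child_var = c.var * (1.0 - u * u)
    half = c.weight / 2.0
    return (Component(half, c.mean - d, child_var),
            Component(half, c.mean + d, child_var))


if __name__ == "__main__":
    parent = Component(weight=0.6, mean=2.0, var=4.0)
    left, right = split(parent, random.Random(0))
    print(left, right)
    print(merge(left, right))  # should match the parent up to rounding
```

In a genetic search of the kind the abstract describes, operators like these would act as mutations that raise or lower the number of clusters, with a fitness function (not shown, and not specified in the abstract) deciding which candidate mixtures survive.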