Abstract:
Although considerable research has gone into improving the accuracy of traditional data mining algorithms, most of the improved algorithms are complex and difficult to scale as the number of PCs increases. How to balance accuracy against computing speed, and how to adapt traditional data mining algorithms to the cloud computing environment, has therefore become an important research topic. This paper studies the parallelization of data mining algorithms for big data and cloud computing: it reviews the theory of data mining algorithms from the literature and then designs a parallel implementation of these algorithms for big data and cloud computing. In testing, the proposed algorithm reduced the misclassification rate by 1.3% and the average error by 0.06 compared with K-means++. The improvement is attributed to merging cluster centers that lie too close together, which makes the clustering more accurate.
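
The abstract credits the accuracy gain to merging cluster centers that are too close, but it does not give the distance measure, the threshold, or the merge rule. The following is a minimal sketch of one way such a step could look, not the authors' implementation. It assumes Euclidean distance, a hypothetical threshold parameter `min_dist`, and a merge rule that replaces the closest pair with its weighted mean; the function name `merge_close_centers` is likewise illustrative.

```python
import math
from typing import List, Sequence


def merge_close_centers(
    centers: Sequence[Sequence[float]], min_dist: float
) -> List[List[float]]:
    """Repeatedly merge the closest pair of centers until every pair
    is at least `min_dist` apart (Euclidean distance).

    Assumption: the threshold and the weighted-mean merge rule are
    illustrative choices; the abstract does not specify them.
    """
    # Each entry is (coordinates, number of original centers absorbed).
    merged = [(list(c), 1) for c in centers]

    while len(merged) > 1:
        # Find the closest pair among the current centers.
        best = None
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                d = math.dist(merged[i][0], merged[j][0])
                if best is None or d < best[0]:
                    best = (d, i, j)

        d, i, j = best
        if d >= min_dist:
            break  # All remaining centers are far enough apart.

        # Replace the pair with its weighted mean.
        (ci, wi), (cj, wj) = merged[i], merged[j]
        w = wi + wj
        merged[i] = ([(a * wi + b * wj) / w for a, b in zip(ci, cj)], w)
        del merged[j]

    return [c for c, _ in merged]


if __name__ == "__main__":
    initial = [[0.0, 0.0], [0.2, 0.0], [5.0, 5.0]]
    print(merge_close_centers(initial, min_dist=1.0))
```

In this sketch the merge would run on the initial centers (for example, those chosen by K-means++) before the usual assignment and update iterations, so the effective number of clusters can end up smaller than the number requested. The pairwise search is quadratic in the number of centers, which is normally small compared with the number of data points.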
Date of Conference: 03-05 October 2022
Date Added to IEEE Xplore: 13 February 2023
Electronic ISBN: 978-1-83953-817-9