Efficient Training Support Vector Clustering With Appropriate Boundary Information


In ETSVC, noise that confuses the edge data is first rejected. The shrunk edges then directly construct a redefined dual problem, which can be effectively solved by the R...

Abstract:

Owing to its remarkable capability in handling arbitrary cluster shapes, support vector clustering (SVC) benefits data analysis in terms of data description. However, large-scale data such as network traffic frequently burdens it with costly computation for solving the dual problem and heavy storage for holding the kernel matrix. Fortunately, the support vectors that describe the clusters are, in a sense, expected to lie on the cluster boundaries. To tackle this issue, we propose an efficient training SVC with appropriate boundary information (ETSVC), which features excellent flexibility and scalability. In ETSVC, we first give a shrinkable boundary selection (SBS) method that collects appropriate boundaries while reducing redundancy and noise. Based on the boundary information, a redefined dual problem is then designed without sacrificing the principle of SVC. Finally, we design a reformative solver (RSolver) to reformulate the training phase, which estimates the support vector function by solving the dual problem. Since only a subset of boundaries is employed for model training, theoretical analysis suggests that ETSVC improves efficiency, and consumes much less memory when some efficiency is traded for lower storage consumption. In grouping P2P flows and large-scale intrusion traffic, as well as other non-traffic data, experimental results confirm that ETSVC can be applied to resource-constrained platforms while achieving accuracy comparable to state-of-the-art methods.
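The storage argument in the abstract can be illustrated with a minimal sketch. It is not the ETSVC implementation: the abstract does not specify SBS or RSolver, so the code only shows why a dense kernel matrix over N points costs N x N entries and how training on a smaller subset shrinks that cost. The function names, the Gaussian kernel width `q`, and the 8-byte entry size are assumptions made for illustration.

```python
import math


def gaussian_kernel_matrix(points, q=1.0):
    """Dense Gaussian kernel matrix K[i][j] = exp(-q * ||x_i - x_j||^2).

    `q` is an assumed kernel width parameter, not a value from the paper.
    """
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            # The kernel is symmetric, so fill both halves at once.
            K[i][j] = K[j][i] = math.exp(-q * sq_dist)
    return K


def kernel_matrix_bytes(n, bytes_per_entry=8):
    """Memory for a dense n x n kernel matrix (8 bytes = one float64 entry)."""
    return n * n * bytes_per_entry


if __name__ == "__main__":
    # Hypothetical sizes: 100,000 samples versus a 5,000-point subset.
    print(kernel_matrix_bytes(100_000))  # 80000000000 bytes (80 GB)
    print(kernel_matrix_bytes(5_000))    # 200000000 bytes (200 MB)
```

Because the cost grows quadratically, keeping 5% of the points in this hypothetical case cuts the kernel matrix to 0.25% of its original size, which is the kind of saving a boundary-only training set aims for.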
Published in: IEEE Access (Volume: 7)
Page(s): 146964 - 146978
Date of Publication: 07 October 2019
Electronic ISSN: 2169-3536