
Fast LDP-MST: An Efficient Density-Peak-Based Clustering Method for Large-Size Datasets



Abstract:

Recently, a new density-peak-based clustering method, called clustering with local density peaks-based minimum spanning tree (LDP-MST), was proposed. It has several attractive merits, e.g., being able to detect arbitrarily shaped clusters and being not very sensitive to noise and parameters. Nevertheless, we also found a limitation of LDP-MST in efficiency. Specifically, LDP-MST has $O(N\log N+M^{2})$ time complexity, where $N$ denotes the dataset size and $M$ is an intermediate variable denoting the number of local density peaks. As our experimental results reveal, when processing large-size datasets, the value of $M$ could be very large, and consequently those steps of LDP-MST involving the $O(M^{2})$ time term would be time-consuming. In the worst case, the value of $M$ could be very close to that of $N$, which means that the time complexity of LDP-MST could be $O(N^{2})$. In this study, we use more efficient algorithms to implement those steps of LDP-MST that involve the $O(M^{2})$ time term, such that the proposed method, Fast LDP-MST, has $O(N\log N)$ time complexity even if $M\approx N$. Our experiments demonstrate that Fast LDP-MST is overall more efficient than LDP-MST on large-size datasets, without sacrificing the merits of LDP-MST in effectiveness, robustness, and user-friendliness.
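
To give a rough feel for why the $O(M^{2})$ term matters, the following minimal sketch compares the two dominant-term cost expressions from the abstract with constant factors ignored. The function names and the sample values of $N$ and $M$ are hypothetical and chosen only for illustration; they are not taken from the paper or its code, and the printed ratios are not running-time measurements.

```python
import math


def ldp_mst_cost(n: int, m: int) -> float:
    """Dominant-term count for O(N log N + M^2); constants are ignored."""
    return n * math.log2(n) + m ** 2


def fast_ldp_mst_cost(n: int) -> float:
    """Dominant-term count for O(N log N); constants are ignored."""
    return n * math.log2(n)


n = 1_000_000  # hypothetical dataset size N
# Hypothetical numbers of local density peaks M, up to the worst case M = N.
for m in (1_000, 100_000, n):
    ratio = ldp_mst_cost(n, m) / fast_ldp_mst_cost(n)
    print(f"M = {m:>9,}: cost ratio ~ {ratio:,.1f}")
```

Under these assumptions the ratio stays close to 1 while $M^{2}$ is small relative to $N\log N$, and it grows rapidly as $M$ approaches $N$, which is the regime the abstract identifies as the worst case.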

Published in: IEEE Transactions on Knowledge and Data Engineering ( Volume: 35, Issue: 5, 01 May 2023)

Page(s): 4767 - 4780

Date of Publication: 11 February 2022


DOI: 10.1109/TKDE.2022.3150403


This article includes code hosted on Code Ocean, a computational reproducibility platform that allows users to view, modify, run, and download code included with IEEE Xplore articles. NOTE: A Code Ocean user account is required to access functionality in the capsule below.

Code (MATLAB): Code and Data for: Fast LDP-MST: an efficient density-based clustering method for large-size datasets
