Using Warshall to Solve the Density-linked Density Clustering Algorithm
American Journal of Applied Mathematics
Volume 8, Issue 1, February 2020, Pages: 11-16
Received: Jan. 2, 2020; Accepted: Jan. 13, 2020; Published: Jan. 23, 2020
Views 337      Downloads 181
Authors
Mengying Huang, College of Data Science, Huashang College Guangdong University of Finance & Economics, Guangzhou, China
Yuanhai Yan, College of Data Science, Huashang College Guangdong University of Finance & Economics, Guangzhou, China
Lijuan Xu, College of Data Science, Huashang College Guangdong University of Finance & Economics, Guangzhou, China
Lihong Ye, College of Data Science, Huashang College Guangdong University of Finance & Economics, Guangzhou, China
Article Tools
Follow on us
Abstract
Clustering algorithm has a wide range of applications in data mining, pattern recognition and machine learning. It is an important part of data mining technology. The emergence of massive data makes the application of data mining technology endless. Cluster analysis is the basic operation of big data processing. The clustering algorithm is to divide similar elements into one class, and to divide elements with large differences into different classes. Aiming at the computational complexity of the density clustering algorithm, this paper proposes an improved algorithm W-DBSCAN which uses Warshall algorithm to reduce its complexity. In the density clustering algorithm, the data with high similarity are densely connected. In this paper, aiming at the complexity of the density clustering algorithm, an improved algorithm W-DBSCAN using the Warshall algorithm to mitigate its complexity is proposed. In the density clustering algorithm, the data with high similarity is density-connected. This paper constructs a matrix n×n where the element (x, y) is marked as 1 means that the data x and data y are directly reachable, and then the reachability matrix of the matrix is calculated using the Warshall algorithm. The solution density connection problem is transformed into the solution reachability matrix problem, thus reducing the complexity of the algorithm.
Keywords
Warshall Algorithm, DBSCAN, Clustering
To cite this article
Mengying Huang, Yuanhai Yan, Lijuan Xu, Lihong Ye, Using Warshall to Solve the Density-linked Density Clustering Algorithm, American Journal of Applied Mathematics. Vol. 8, No. 1, 2020, pp. 11-16. doi: 10.11648/j.ajam.20200801.12
Copyright
Copyright © 2020 Authors retain the copyright of this article.
This article is an open access article distributed under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
References
[1]
TU Q, LU J F, YUAN B, et al. Density-based hierarchical clustering for streaming data [J]. Pattern Recognition Letters, 2012, 33 (5): 641-645.
[2]
LIU Q, DENG M, SHI Y, et al. A density-based spatial clustering algorithm considering both spatial proximity and attribute similarity [J]. Computers & Geosciences, 2012, 46 (3): 296-309.
[3]
J G Sun, J Liu, L Y Zhao. Research on Clustering Algorithm [J]. Journal of Software, 2008, 19 (01): 48-61.
[4]
S G Zhou, A Y Zhou. A fast clustering algorithm based on density [J]. Computer research and development, 2000, 37 (11): 1287-1292.
[5]
ESTER M, KRIEGEL H P, XU X. A density-based algorithm for discovering clusters a density-based algorithm for discovering clusters in large spatial databases with noise [C]// International Conference on Knowledge Discovery and Data Mining. AAAI Press, 1996: 226-231.
[6]
BIRABT D, KUT A. ST-DBSCAN: An Algorithm for Clustering Spatial-temporal Data [J]. Data and Knowledge Engineering, 2007, 60 (1): 208-221.
[7]
Z H Feng, X Z QIAN, N N Zhao. Greedy DBSCAN: An Improved DBSCAN Algorithm for Multi-Density Clustering [J]. Application Research of Computers, 2106 (9): 2693-2696.
[8]
Z F Wang, G L Shan. Method for dynamically selecting parameters of DBSCAN algorithm based on k-means [J]. Computer Engineering and Applications, 2017, 53 (03): 80-86.
[9]
Y Cai, J S Yuan. Text clustering based on improved DBSCAN algorithm [J]. Computer Engineering, 2011, 37 (12): 50-52.
[10]
S G Zhou, A Y Zhou, J Cao. DBSCAN algorithm based on data partition [J]. Computer research and development, 2000, 37 (10): 1153-1159.
[11]
G Li, H B Liu, Y Feng. Clustering method based on λ-Warshall algorithm [J]. Computer Engineering and Design, 2008, 29 (8): 1903-1904.
[12]
MACQUEEN J. Some Methods for Classification and Analysis of MultiVariate Observations [C]// Proc. of, Berkeley Symposium on Mathematical Statistics and Probability. 1967: 281-297.
[13]
WARSHALL S. A Theorem on Boolean Matrices [J]. Journal of the Acm, 1962, 9 (1): 11-12.
[14]
BREUNIG M, KRIEGEL H P, NG R, etal. LOF: Identifying Density- based Local Outliers [C]//Proc. of ACM SIGMOD International Conference on Management of Data. [S. l.]: ACM Press, 2000.
[15]
F Z Sun, Y H Li, S M Cheng etal. Application of Warshall algorithm in discriminating transitivity and seeking transitive closure [J]. Journal of Changchun University, 2007, 17 (6): 13-16.
[16]
YANG Z, OJA E. Linear and Nonlinear Projective Nonnegative Matrix Factorization [J]. IEEE Trans Neural Netw, 2010, 21 (5): 734-749.
ADDRESS
Science Publishing Group
1 Rockefeller Plaza,
10th and 11th Floors,
New York, NY 10020
U.S.A.
Tel: (001)347-983-5186