Improved outlier detection and interpretation method for DPC clustering algorithm
CSTR:
Author:
Affiliation:

(School of Electrical Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450045, China)

Clc Number:

TP181

Fund Project:

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    To address the limitatios of global outlier detection methods in detecting local outliers and the performance degradation of local anomaly factors in the presence of a large number of local outliers, this paper proposes an outlier detection and interpretation method based on an improved fast search and discovery density peak clustering algorithm (KDPC), utilizing k-nearest neighbor (KNN) and kernel density estimation (KDE) methods. This method enables simultaneous analysis of both global and local data points. Firstly, the local density of data points is calculated using the k-nearest neighbor and kernel density estimation methods instead of the local density based on the truncation distance in the traditional DPC algorithm. Secondly, the sum of the k-nearest neighbor distances of the data points is used as the global outlier and the cluster density as well as the local outliers of the data points are calculated by the KDPC clustering algorithm. Finally, the global and local outliers of the data points are multiplied as the final anomaly score. The Top-n data points with the highest anomaly score is selected as the outlier, and the global and local outliers are interpreted by constructing a global-local outlier decision diagram. Experiments were conducted using both artificial and UCI datasets and our method was compared with 10 commonly used outlier detection methods. The results show that our method achieves high detection accuracy and performance for both global and local outliers. Moreover, the AUC performance is minimally affected by the k-value. Additionally, our method is also used to analyze NBA player data, further demonstrating its practicality and effectiveness.

    Reference
    Related
    Cited by
Get Citation
Related Videos

Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:May 23,2023
  • Revised:
  • Adopted:
  • Online: August 08,2024
  • Published:
Article QR Code