Sample selection algorithm for SVM with minimum uncertainty in local density

doi:10.11918/202407085

Home > Archive>Volume 57, Issue 8, 2025 >45-56. DOI:10.11918/202407085

Sample selection algorithm for SVM with minimum uncertainty in local density
DOI:
                        10.11918/202407085
                    
CSTR:
                        
Author:
                        
Affiliation:(1.School of Electrical Engineering, North China University of Water Resource and Electric Power, Zhengzhou 450011, China; 2.Hebei Water Conservancy Engineering Bureau Group Limited, Shijiazhuang 050021, China)
Clc Number:TP181
Fund Project:

Article

Figures

Metrics

Reference

Cited by

Materials

Comments

Abstract:

To address the issue that support vector machines (SVM) frequently encompass a considerable number of redundant samples during classification, which restricts the computational complexity of SVM when confronted with large-scale datasets, a SVM sample selection algorithm based on local density minimum uncertainty is put forward. This approach efficiently identifies influential boundary data points that significantly affect the decision boundary, subsequently reducing computational costs by isolating potential support vectors from the training set, thereby bolstering SVM’s overall effectiveness. Firstly, the number of K nearest neighbors and Gaussian kernel density estimation of the training samples are computed; Secondly, the sum of the number of K nearest neighbors and Gaussian kernel density estimation is derived for each sample point to acquire the K local density and obtain the density matrix; Subsequently, employing the local density uncertainty balancing optimization method, the density matrix undergoes a triple-mapping process to minimize uncertainty changes, yielding the optimal threshold. This threshold then partitions the density matrix into center data and boundary data. Finally, the boundary data are extracted and utilized as training samples for the SVM, enabling the establishment of an effective classification model. To experimentally evaluate the efficacy of our method, we compared it with six commonly utilized sample selection techniques on UCI datasets, employing accuracy and preservation rate as key performance metrics. The findings indicate that the method introduced in this paper significantly reduces the number of redundant training samples, thereby effectively alleviating the training burden on SVM and enhancing its classification performance.

Reference

Cited by

Get Citation

Copy

Article Metrics

Abstract:
PDF:
HTML:
Cited by:

History

Received:July 30,2024
Revised:
Adopted:
Online: August 11,2025
Published:

Publication Statement

Journal Subscription

Get Citation

Related Videos

Share

Article Metrics

History

Article QR Code