Improved classification for imbalanced data using ensemble clustering
Sharanjit Kaur, Manju Bhardwaj, Adi Maqsood, Aditya Maurya, Mayank Kumar, Nishant Pratap Singh
Abstract
Imbalanced datasets occur frequently in fields such as fraud detection and medical diagnosis, where instances of the majority class vastly outnumber those of the minority class. Traditional classification algorithms often become biased towards the majority class in these scenarios. To address this challenge, this paper introduces a novel method for imbalanced datasets called improved classification using ensemble clustering (ICEC). ICEC combines classification with the strengths of consensus clustering to improve the classifier’s generalization ability. The approach uses a cluster ensemble to capture the structural characteristics of both the majority and minority classes, and the resulting stable clustering scheme is used to generate new auxiliary features. These features augment the existing feature set, helping classifiers build a more robust predictive model. Extensive testing on fifteen imbalanced datasets from the knowledge extraction based on evolutionary learning (KEEL) repository demonstrates the effectiveness of the proposed method. The approach was evaluated with random forest (RF) and linear support vector machine (SVM) classifiers on these datasets. Results indicate that ICEC is effective for both classifiers, with an observed F1-score improvement of more than 10% for SVM and 3% for RF.
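The idea of turning a consensus clustering into auxiliary features can be illustrated with a minimal sketch. The abstract does not specify the base clusterer, the consensus function, or the feature encoding, so everything below is an assumption made for illustration: plain k-means runs with different seeds as the ensemble, a co-association matrix thresholded at 0.5 as the consensus step, and a one-hot encoding of the consensus label as the auxiliary features. All function names are hypothetical and this is not the authors' implementation.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, seed, iters=20):
    """Plain Lloyd's k-means (assumed base clusterer); one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def co_association(partitions, n):
    """Fraction of partitions in which points i and j share a cluster."""
    co = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            together = sum(1 for labels in partitions if labels[i] == labels[j])
            co[i][j] = together / len(partitions)
    return co

def consensus_labels(co, threshold=0.5):
    """Assumed consensus step: connected components of the graph that links
    points co-clustered in at least `threshold` of the partitions."""
    n = len(co)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] >= threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

def add_auxiliary_features(points, labels):
    """Append a one-hot encoding of the consensus label to each point."""
    n_clusters = max(labels) + 1
    return [list(p) + [1.0 if l == c else 0.0 for c in range(n_clusters)]
            for p, l in zip(points, labels)]

# Toy imbalanced data (made up for this sketch): 8 majority, 2 minority points.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5),
     (0.2, 0.8), (0.8, 0.2), (0.4, 0.6), (10.0, 10.0), (10.0, 11.0)]

partitions = [kmeans(X, k=2, seed=s) for s in range(5)]  # cluster ensemble
consensus = consensus_labels(co_association(partitions, len(X)))
augmented = add_auxiliary_features(X, consensus)
print(consensus)
print(augmented[0], augmented[-1])
```

In a full pipeline the augmented rows would then be passed to a classifier such as RF or linear SVM in place of the original feature set. The class labels are never used in the clustering step of this sketch.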
Keywords
auxiliary features; classification; ensemble clustering; imbalanced data; minority class
DOI:
http://doi.org/10.12928/telkomnika.v23i5.26897
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
TELKOMNIKA Telecommunication, Computing, Electronics and Control. ISSN: 1693-6930, e-ISSN: 2302-9293. Universitas Ahmad Dahlan, 4th Campus, Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191. Phone: +62 (274) 563515, 511830, 379418, 371120. Fax: +62 274 564604.