Improved classification for imbalanced data using ensemble clustering

Sharanjit Kaur, Manju Bhardwaj, Adi Maqsood, Aditya Maurya, Mayank Kumar, Nishant Pratap Singh

Abstract


Imbalanced datasets frequently occur in fields like fraud detection and medical diagnosis, where the number of instances in the majority class far exceeds the number in the minority class. Traditional classification algorithms often become biased towards the majority class in these scenarios. To address this challenge, this paper introduces a novel method for imbalanced datasets called improved classification using ensemble clustering (ICEC). ICEC combines classification with the strengths of consensus clustering to improve the classifier's generalization ability. The approach uses a cluster ensemble to capture the structural characteristics of both the majority and minority classes, and the resulting stable clustering scheme is used to generate new auxiliary features. These features augment the existing feature set, helping classifiers build a more robust predictive model. Extensive testing on fifteen imbalanced datasets from the knowledge extraction based on evolutionary learning (KEEL) repository demonstrates the effectiveness of the proposed method. The approach was evaluated with random forest (RF) and linear support vector machine (SVM) classifiers on these datasets. Results indicate that ICEC is effective for both classifiers, with an observed F1-score improvement of more than 10% for SVM and 3% for RF.
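The pipeline outlined in the abstract (cluster ensemble, consensus clustering, auxiliary features appended to the original ones) can be illustrated with a minimal sketch. The abstract does not specify the base clusterer, the consensus function, or how the auxiliary features are encoded, so everything below is an assumption made for illustration: plain k-means as the base clusterer, a co-association matrix cut at a hypothetical threshold of 0.5 as the consensus step, and one-hot consensus-cluster membership as the auxiliary features. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd's k-means (assumed base clusterer); one label per row."""
    # Fancy indexing copies the rows, so updating centers leaves X untouched.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center: shape (n, k).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def co_association(X, n_runs=10, k=3, seed=0):
    """Fraction of base clusterings in which each pair of points co-clusters."""
    rng = np.random.default_rng(seed)
    n = len(X)
    co = np.zeros((n, n))
    for _ in range(n_runs):
        labels = kmeans_labels(X, k, rng)
        co += labels[:, None] == labels[None, :]
    return co / n_runs


def consensus_labels(co, threshold=0.5):
    """Connected components of the graph linking pairs with co > threshold."""
    n = len(co)
    labels = np.full(n, -1)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to an earlier component
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            # Unvisited points that co-cluster with i in most base runs.
            for j in np.flatnonzero((co[i] > threshold) & (labels == -1)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels


def augment_features(X, **kwargs):
    """Append one-hot consensus-cluster membership to the original features."""
    X = np.asarray(X, dtype=float)
    labels = consensus_labels(co_association(X, **kwargs))
    one_hot = np.eye(labels.max() + 1)[labels]
    return np.hstack([X, one_hot]), labels
```

Under these assumptions, `augment_features(X, n_runs=10, k=2)` returns the augmented matrix together with the consensus labels, and the augmented matrix would then be passed to a classifier such as RF or a linear SVM in place of `X`. The sketch clusters only the data it is given and does not cover how unseen test instances would be mapped to consensus clusters.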


Keywords


auxiliary features; classification; ensemble clustering; imbalanced data; minority class


DOI: http://doi.org/10.12928/telkomnika.v23i5.26897


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
Universitas Ahmad Dahlan, 4th Campus
Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191
Phone: +62 (274) 563515, 511830, 379418, 371120
Fax: +62 274 564604
