Optimized multi correlation-based feature selection in software defect prediction

Muhammad Nabil Muyassar Rahman, Radityo Adi Nugroho, Mohammad Reza Faisal, Friska Abadi, Rudy Herteno

Abstract


In software defect prediction, noisy attributes and high-dimensional data remain critical challenges. This paper introduces a novel approach, multi correlation-based feature selection (MCFS), to address them. MCFS integrates two feature selection techniques, correlation-based feature selection (CFS) and correlation matrix-based feature selection (CMFS), to reduce data dimensionality and eliminate noisy attributes. CFS and CMFS are applied independently to filter the datasets, and a weighted average of their outcomes is computed to determine the optimal feature selection. This approach both reduces data dimensionality and mitigates the impact of noisy attributes. To further enhance predictive performance, the paper uses the particle swarm optimization (PSO) algorithm as a feature selection mechanism, specifically targeting improvements in the area under the curve (AUC). The proposed method is evaluated on 12 benchmark datasets from the NASA metrics data program (MDP) corpus, which are known for their noisy attributes, high dimensionality, and imbalanced class records. The findings show that MCFS outperforms CFS and CMFS, yielding an average AUC of 0.891 with k-nearest neighbors (KNN) classification, which underscores its efficacy in improving classification performance for software defect prediction.
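The idea of combining two correlation-based rankings through a weighted average can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two scoring functions are simplified stand-ins for CFS and CMFS, and the names `relevance_scores`, `redundancy_adjusted_scores`, `combined_selection`, the `weight` parameter, the top-`k` cutoff, and the toy data are all assumptions made for illustration. The PSO and KNN stages mentioned in the abstract are not covered.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def relevance_scores(columns, labels):
    """Assumed stand-in for a CFS-style score: |corr(feature, class)|."""
    return [abs(pearson(col, labels)) for col in columns]


def redundancy_adjusted_scores(columns, labels):
    """Assumed stand-in for a CMFS-style score: relevance minus the
    mean absolute correlation with the other features."""
    rel = relevance_scores(columns, labels)
    scores = []
    for i, col in enumerate(columns):
        others = [abs(pearson(col, c)) for j, c in enumerate(columns) if j != i]
        mean_redundancy = sum(others) / len(others) if others else 0.0
        scores.append(rel[i] - mean_redundancy)
    return scores


def min_max(values):
    """Rescale scores to [0, 1] so the two rankings are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def combined_selection(columns, labels, weight=0.5, k=1):
    """Weighted average of the two normalized scores; keep the top-k features."""
    a = min_max(relevance_scores(columns, labels))
    b = min_max(redundancy_adjusted_scores(columns, labels))
    combined = [weight * x + (1 - weight) * y for x, y in zip(a, b)]
    ranked = sorted(range(len(columns)), key=lambda i: combined[i], reverse=True)
    return sorted(ranked[:k]), combined


# Toy data (hypothetical): one informative metric, one noisy metric, one constant metric.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [1, 2, 1, 2, 5, 6, 5, 6],  # tracks the defect label
    [1, 2, 3, 4, 4, 3, 2, 1],  # unrelated to the label
    [7, 7, 7, 7, 7, 7, 7, 7],  # carries no information
]
selected, combined = combined_selection(columns, labels, weight=0.5, k=1)
```

Normalizing both score lists before averaging keeps one ranking from dominating merely because of its scale; how the paper weights and thresholds the two outcomes may differ from this sketch.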

Keywords


correlation-based; feature selection; high dimensional; noisy attribute; software defect

DOI: http://doi.org/10.12928/telkomnika.v22i3.25793

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
Universitas Ahmad Dahlan, 4th Campus
Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191
Phone: +62 (274) 563515, 511830, 379418, 371120
Fax: +62 274 564604
