Enhancing spam detection using Harris Hawks optimization algorithm

Mosleh M. Abualhaj, Sumaya Nabil Alkhatib, Ahmad Adel Abu-Shareha, Adeeb M. Alsaaidah, Mohammed Anbar

Abstract


This paper applies machine learning (ML) algorithms to identify and classify spam emails. The Harris Hawks optimization (HHO) algorithm is used to select the features that best distinguish spam from ham (legitimate) emails, reducing the number of features in the ISCX-URL2016 spam dataset from 72 to 10. This reduction is intended to improve the efficiency and learning performance of the ML algorithms. Three classifiers, decision tree (DT), Naive Bayes (NB), and AdaBoost, are evaluated and compared on the spam detection task. Random search is used to tune the key hyperparameters of each classifier for this task. All three classifiers detected spam with high accuracy in testing: DT achieved the highest accuracy at 99.75%, followed by AdaBoost at 99.67% and NB at 96.30%. The results indicate that the HHO algorithm is promising for identifying the crucial features of spam emails.
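
The random search step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `random_search`, the grid `SEARCH_SPACE`, and the scoring function `toy_score` are hypothetical names introduced here, and `toy_score` is only a stand-in for the accuracy that would come from training and validating a classifier such as DT on the selected features.

```python
import random

# Hypothetical search space for a decision tree; the parameter names and
# candidate values are assumptions, not taken from the paper.
SEARCH_SPACE = {
    "max_depth": [2, 4, 8, 16],
    "min_samples_leaf": [1, 2, 5],
}


def toy_score(params):
    """Stand-in for validation accuracy (higher is better).

    A real pipeline would train a classifier with `params` and return
    its accuracy on held-out data instead of this synthetic formula.
    """
    return -abs(params["max_depth"] - 8) - 0.1 * abs(params["min_samples_leaf"] - 2)


def random_search(score_fn, space, n_iter=20, seed=0):
    """Sample `n_iter` random configurations and keep the best-scoring one."""
    rng = random.Random(seed)  # local generator keeps runs reproducible
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Draw one candidate value per hyperparameter, uniformly at random.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


if __name__ == "__main__":
    params, score = random_search(toy_score, SEARCH_SPACE, n_iter=50)
    print(params, score)
```

Unlike an exhaustive grid search, random search evaluates a fixed budget of sampled configurations, so its cost is set by `n_iter` rather than by the size of the search space.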

Keywords


feature selection; Harris Hawks algorithm; ISCX-URL2016 dataset; machine learning; spam



DOI: http://doi.org/10.12928/telkomnika.v23i2.26615



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
