Comparison analysis of Bangla news articles classification using support vector machine and logistic regression

Md Gulzar Hussain, Babe Sultana, Mahmuda Rahman, Md Rashidul Hasan

Abstract


In the information age, the number of Bangla news articles on the internet is growing fast. To organize its content, every news site has its own structure and categorization. News article classification assigns a document to one of several predefined categories. This research discusses the classification of Bangla news articles on online platforms and presents a constructive comparison of classification algorithms. For Bangla news article classification, term frequency-inverse document frequency (TF-IDF) weighting and a count vectorizer were used for feature extraction, and two common classifiers, support vector machine (SVM) and logistic regression (LR), were employed to classify the documents. In the experiments, SVM achieved an accuracy of 84.0% and LR 81.0% on twelve categories of news articles. In this comparison of two well-known classification algorithms applied to Bangla news articles, SVM outperformed LR.
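The two feature extraction steps named above can be illustrated with a minimal sketch. It assumes whitespace tokenization, the textbook weighting tf = count / document length and idf = log(N / df), and a hypothetical three-document toy corpus of transliterated tokens; the abstract does not state which TF-IDF variant or tokenizer was used, and the function names `count_vectors` and `tfidf_vectors` are illustrative only.

```python
import math
from collections import Counter


def count_vectors(docs):
    """Return (vocabulary, raw term-count vectors), one vector per document."""
    # Assumption: tokens are separated by whitespace.
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.split())
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * log(N / df)."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for vec in counts:
        total = sum(vec)  # document length in tokens
        weighted.append(
            [(c / total) * w if total else 0.0 for c, w in zip(vec, idf)]
        )
    return vocab, weighted


# Hypothetical toy corpus, not data from the paper.
docs = [
    "khela dol joy",
    "nirbachon vote joy",
    "khela dol match",
]
vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)
```

Either matrix (`counts` or `tfidf`) could then serve as input to a classifier such as SVM or LR. Library implementations often apply smoothing and normalization on top of this basic formula, so their values will generally differ from this sketch.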


Keywords


Bangla news; Bangla text classification; logistic regression; natural language processing; news classification; support vector machine



DOI: http://doi.org/10.12928/telkomnika.v21i3.23416



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
Universitas Ahmad Dahlan, 4th Campus
Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191
Phone: +62 (274) 563515, 511830, 379418, 371120
Fax: +62 274 564604
