Neural network technique with deep structure for improving author homonym and synonym classification in digital libraries

Firdaus Firdaus, Siti Nurmaini, Varindo Ockta Keneddi Putra, Annisa Darmawahyuni, Reza Firsandaya Malik, Muhammad Naufal Rachmatullah, Andre Herviant Juliano, Tio Artha Nugraha

Abstract


Author name disambiguation (AND), also recognized as name-identification, has long been seen as a challenging issue in bibliographic data. In other words, the same author may appear under separate names, synonyms, or distinct authors may have similar to those referred to as homonyms. Some previous research has proposed AND problem. To the best of our knowledge, no study discussed specifically synonym and homonym, whereas such cases are the core in AND topic. This paper presents the classification of non-homonym-synonym, homonym-synonym, synonym, and homonym cases by using the DBLP computer science bibliography dataset. Based on the DBLP raw data, the classification process is proposed by using deep neural networks (DNNs). In the classification process, the DBLP raw data divided into five features, including name, author, title, venue, and year. Twelve scenarios are designed with a different structure to validate and select the best model of DNNs. Furthermore, this paper is also compared DNNs with other classifiers, such as support vector machine (SVM) and decision tree. The results show DNNs outperform SVM and decision tree methods in all performance metrics. The DNNs performances with three hidden layers as the best model, achieve accuracy, sensitivity, specificity, precision, and F1-score are 98.85%, 95.95%, 99.26%, 94.80%, and 95.36%, respectively. In the future, DNNs are more performing with the automated feature representation in AND processing.

Keywords


author name disambiguation; bibliographic data; deep neural networks; homonym; symonym;

Full Text:

PDF


DOI: http://doi.org/10.12928/telkomnika.v19i4.18878

Refbacks

  • There are currently no refbacks.


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
Universitas Ahmad Dahlan, 4th Campus
Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191
Phone: +62 (274) 563515, 511830, 379418, 371120
Fax: +62 274 564604

View TELKOMNIKA Stats