Indonesian continuous speech recognition optimization with convolution bidirectional long short-term memory architecture

Sukmawati Nur Endah, Rismiyati Rismiyati, Priyo Sidik Sasongko, Anwar Petrus F Noiborhu

Abstract


Speech recognition converts voice signals into text, or sequences of words, using algorithms implemented in computer programs. There are several types of speech recognition, including recognition of isolated words, continuous speech, spontaneous speech, and conversational speech. Research on continuous speech recognition, particularly for Indonesian, has used both stochastic methods such as the hidden Markov model (HMM) and deep learning methods. Currently, deep learning approaches are more widely used in speech recognition applications. This research optimizes Indonesian speech recognition by adding convolution layers to the bidirectional long short-term memory (Bi-LSTM) architecture. The goal is to find the architecture that gives the best Indonesian continuous speech recognition results. The dataset used in this research was created by the intelligent systems research group in the Department of Informatics at Universitas Diponegoro. The speakers in this dataset come from five ethnic groups in Indonesia and represent the dialects of their respective groups. The results show that adding a convolution layer to the Bi-LSTM architecture improves speech recognition performance significantly, with an average word error rate (WER) reduction of 15.56% compared to using the Bi-LSTM architecture alone.
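The abstract reports its result as word error rate (WER) but does not say how the authors computed it. As a minimal sketch, assuming the standard definition WER = (S + D + I) / N, where S, D, and I are the word substitutions, deletions, and insertions needed to turn the reference transcript into the recognized one and N is the number of reference words, the metric could be computed as below. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the paper or its dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: whitespace tokenization and exact word matching; the
    paper may normalize text differently before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one deleted word out of four reference words.
print(word_error_rate("saya pergi ke pasar", "saya pergi pasar"))
```

A lower WER means better recognition, so the reported reduction corresponds to fewer word-level edits per reference word for the convolution Bi-LSTM model than for the plain Bi-LSTM model.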

Keywords


bidirectional long short-term memory; continuous speech; convolution bidirectional long short-term memory; Indonesian speech recognition; speech recognition



DOI: http://doi.org/10.12928/telkomnika.v23i3.24994



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
