Indonesian continuous speech recognition optimization with convolution bidirectional long short-term memory architecture
Sukmawati Nur Endah, Rismiyati Rismiyati, Priyo Sidik Sasongko, Anwar Petrus F Noiborhu
Abstract
Speech recognition converts voice signals into text, that is, sequences of words, using algorithms implemented in computer programs. It comes in several types, including isolated-word, continuous, spontaneous, and conversational speech recognition. Research on continuous speech recognition, particularly for Indonesian, has used both stochastic methods such as the hidden Markov model (HMM) and deep learning methods, and deep learning approaches are now the more widely used in speech recognition applications. This research optimizes Indonesian speech recognition by adding convolution layers to the bidirectional long short-term memory (Bi-LSTM) architecture, with the goal of finding the architecture that gives the best Indonesian continuous speech recognition results. The dataset was created by the intelligent systems research group in the Department of Informatics at Universitas Diponegoro. Its speakers come from five ethnic groups in Indonesia and represent the dialects of their respective groups. The results show that adding a convolution layer to the Bi-LSTM architecture improves recognition performance significantly, with an average word error rate (WER) reduction of 15.56% compared with the Bi-LSTM architecture alone.
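The abstract reports its results as word error rate (WER). As a reminder of how that metric is commonly computed, the sketch below uses the standard definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. The function name `word_error_rate` and the Indonesian example sentences are illustrative assumptions; they come neither from the paper nor from its dataset, and the authors' scoring tool may differ in details such as text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; not the authors' scoring code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one of four reference words is dropped.
print(word_error_rate("saya pergi ke pasar", "saya pergi pasar"))  # 0.25
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.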
Keywords
bidirectional long short-term memory; continuous speech; convolution bidirectional long short-term memory; Indonesian speech recognition; speech recognition
DOI:
http://doi.org/10.12928/telkomnika.v23i3.24994
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
Universitas Ahmad Dahlan, 4th Campus, Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191
Phone: +62 (274) 563515, 511830, 379418, 371120 Fax: +62 274 564604