A Novel Part-of-Speech Set Developing Method for Statistical Machine Translation
Herry Sujaini, Kuspriyanto Kuspriyanto, Arry Akhmad Arman, Ayu Purwarianti
Abstract
Part of speech (PoS) is one of the features that can be used to improve the quality of statistical machine translation (SMT). Typically, the PoS set of a language is determined from the grammar of that language or adopted from the PoS set of another language. This work aims to formulate a model for developing a PoS set automatically, as a linguistic factor to improve the quality of machine translation. The research method uses a word similarity approach, in which we cluster the words contained in a corpus. The resulting classes are then defined as the PoS set for the given language. We evaluated the computationally defined PoS set using the Moses machine translation system, comparing its results against those of an SMT system that uses manually generated PoS sets; translation quality was assessed with the BLEU metric. The languages used for evaluation are English as the source language and Indonesian as the target language.
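The abstract does not say which similarity measure or clustering algorithm is used, so the following is only a minimal sketch of the general idea of grouping words by similarity and treating each group as a PoS class. It assumes cosine similarity over immediate left and right neighbour counts and a greedy single-pass clustering. The function names, the threshold value, and the `C0`, `C1`, ... labels are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences):
    """Map each word to counts of its immediate left and right neighbours."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            vectors[tokens[i]][("L", tokens[i - 1])] += 1
            vectors[tokens[i]][("R", tokens[i + 1])] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_words(vectors, threshold=0.5):
    """Greedy clustering: a word joins the first cluster whose seed word
    is at least `threshold` similar, otherwise it starts a new cluster."""
    clusters = []
    for word in sorted(vectors):
        for cluster in clusters:
            if cosine(vectors[word], vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


def induce_tagset(sentences, threshold=0.5):
    """Return a word -> class label mapping (labels are illustrative only)."""
    clusters = cluster_words(context_vectors(sentences), threshold)
    return {word: f"C{i}" for i, cluster in enumerate(clusters) for word in cluster}


if __name__ == "__main__":
    toy_corpus = ["the cat sleeps", "the dog sleeps", "a cat runs", "a dog runs"]
    for word, tag in sorted(induce_tagset(toy_corpus).items()):
        print(word, tag)
```

On this toy corpus, words that occur in the same contexts end up in the same class. A real corpus would need a richer similarity measure and a more robust clustering method than this sketch uses.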
Keywords
method; part-of-speech; statistical machine translation; moses; word similarity
DOI:
http://doi.org/10.12928/telkomnika.v12i3.79
This work is licensed under a
Creative Commons Attribution-ShareAlike 4.0 International License.
TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
Universitas Ahmad Dahlan, 4th Campus, Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191
Phone: +62 (274) 563515, 511830, 379418, 371120 Fax: +62 274 564604