Process Improvement of LSA for Semantic Relatedness Computing
Wujian Yang, Lianyue Lin
Abstract
Computing semantic relatedness in Tang poetry is critical to many applications, such as search, clustering, and automatic poetry generation. To increase the efficiency and accuracy of semantic relatedness computation, we improved the process of latent semantic analysis (LSA). In this paper, we adopt a “representation of words semantic” in place of the “words-by-poems” matrix to represent word semantics, based on the finding that words with similar distributions across poetry categories are almost always semantically related. We designed an experiment that obtained segmented words from more than 40,000 poems and computed relatedness as the cosine value calculated from the co-occurrence matrix decomposed by Singular Value Decomposition (SVD). The experimental results show that this method works well for analyzing the semantic and emotional relatedness of words in Tang poetry. Associated words and the relevance of poetry categories can also be found by matrix manipulation of the decomposed matrices.
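The pipeline outlined in the abstract (co-occurrence matrix, SVD, cosine relatedness) can be illustrated with a minimal sketch. It assumes the word representation is a word-by-category count matrix, which is one reading of the abstract rather than the authors' confirmed construction. The words, categories, counts, and the function names `lsa_word_vectors` and `cosine` are all hypothetical.

```python
import numpy as np

def lsa_word_vectors(counts, k):
    """Return k-dimensional word vectors from a truncated SVD of `counts`.

    `counts` is assumed to be a (words x categories) co-occurrence matrix.
    """
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    # Scale the left singular vectors by the top-k singular values.
    return U[:, :k] * s[:k]

def cosine(a, b):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical toy data: rows are words, columns are poetry categories
# (for instance: landscape, farewell, frontier). Not taken from the paper.
words = ["moon", "frost", "sword", "horse"]
counts = np.array([
    [8.0, 1.0, 0.0],
    [6.0, 2.0, 0.0],
    [0.0, 1.0, 7.0],
    [1.0, 0.0, 9.0],
])

vecs = lsa_word_vectors(counts, k=2)
sim_close = cosine(vecs[0], vecs[1])  # "moon" vs "frost"
sim_far = cosine(vecs[0], vecs[2])    # "moon" vs "sword"
print(f"moon~frost: {sim_close:.3f}, moon~sword: {sim_far:.3f}")
```

In this toy setting, words that concentrate in the same category end up with a higher cosine value than words that concentrate in different categories, which is the behaviour the abstract's finding relies on.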
DOI: http://doi.org/10.12928/telkomnika.v12i4.811
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
TELKOMNIKA Telecommunication, Computing, Electronics and Control. ISSN: 1693-6930, e-ISSN: 2302-9293. Universitas Ahmad Dahlan, 4th Campus, Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191. Phone: +62 (274) 563515, 511830, 379418, 371120. Fax: +62 274 564604.