Rewriting Turkish texts written in English alphabet using Turkish alphabet

Okur B. C., TAKCI H., Akgul Y. S.

21st Signal Processing and Communications Applications Conference (SIU), Cyprus, 24-26 April 2013


People easily comprehend Turkish texts written with English characters, but performing the same conversion by machine remains one of the unsolved Word Sense Disambiguation problems. Rewriting such texts with Turkish characters is a natural language processing problem specific to Turkish. Choosing the right Turkish word among the alternatives requires taking the semantics of the text into account. This study investigates how examining the text at the sentence level versus the whole-text level affects the determination of the right word, and compares the performance of machine learning methods and statistical methods on this task. The approach is tested on randomly selected news texts. The results show that examining the text as a whole provides more information than sentence-based methods, and that machine learning methods give better results than statistical methods.
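To make the task concrete, the following is a minimal Python sketch of candidate generation and context-based selection. It is not the method evaluated in the study: the names `candidates` and `restore`, the toy lexicon, and the co-occurrence counts are all hypothetical and chosen only for illustration. It assumes lowercase input and scores each candidate by how often it co-occurs with the words in a caller-supplied context, which may be a single sentence or the whole text.

```python
from itertools import product

# ASCII letters that may stand for a Turkish letter (lowercase only; assumption).
AMBIGUOUS = {"c": "cç", "g": "gğ", "i": "iı", "o": "oö", "s": "sş", "u": "uü"}


def candidates(word):
    """Return every spelling obtained by optionally restoring Turkish letters."""
    options = [AMBIGUOUS.get(ch, ch) for ch in word]
    return ["".join(combo) for combo in product(*options)]


def restore(tokens, lexicon, cooccur, context=None):
    """Pick, for each token, the lexicon candidate best supported by the context.

    context: ASCII tokens used as evidence; defaults to `tokens` themselves
    (sentence-level). Pass all tokens of a document for whole-text evidence.
    cooccur: hypothetical counts keyed by (candidate, ASCII context word).
    """
    context = set(tokens if context is None else context)
    result = []
    for tok in tokens:
        # Keep only candidates that are known words; fall back to the input.
        valid = [c for c in candidates(tok) if c in lexicon] or [tok]
        best = max(valid, key=lambda c: sum(cooccur.get((c, w), 0) for w in context))
        result.append(best)
    return result


# Toy data (made up): "aci" may be "acı" (pain, bitter) or "açı" (angle).
LEXICON = {"acı", "açı", "üçgen", "biber"}
COOCCUR = {("açı", "ucgen"): 5, ("acı", "biber"): 4}

print(restore(["ucgen", "aci"], LEXICON, COOCCUR))  # ['üçgen', 'açı']
print(restore(["biber", "aci"], LEXICON, COOCCUR))  # ['biber', 'acı']
```

When the disambiguating word lies outside the sentence, passing the whole document as `context` lets the same scoring use it, for example `restore(["aci"], LEXICON, COOCCUR, context=["aci", "ucgen"])`.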