
Hybrid Approach for Handling OOV Words


Affiliations
1 Shroff S.R Rotary Institute of Chemical Technology, Ankleshwar, Gujarat, India
     



Language transliteration is an important area of natural language processing. Accurate transliteration of named entities plays an important role in the performance of machine translation (MT), cross-language information retrieval (CLIR), question answering (QA), and bilingual lexicon construction. Handling out-of-vocabulary (OOV) words is crucial in CLIR and MT, especially when the languages involved do not use the same script. This paper addresses transliteration from Roman script to Gurmukhi script. A statistical approach guided by rules is used for transliteration from English to Punjabi using MOSES, a statistical machine translation tool. The overall TAR after applying the observed rules is 74.18%.

Keywords

Machine Transliteration, Statistical Approach, MOSES and N-Gram.

Authors

Jasleen Kaur
Shroff S.R Rotary Institute of Chemical Technology, Ankleshwar, Gujarat, India
