
Extraction of Replicated Punjabi Multiword Expressions


Affiliations
1 Department of Computer Science, Punjabi University, Patiala, India
 

Multiword Expressions (MWEs) play a vital role in Natural Language Processing. A multiword expression is a combination of two or more words that is treated as a single unit. Punjabi has many varieties of MWEs, and many of them are of types not found in English. In this paper, we discuss the different types of MWEs encountered in Punjabi. For example, replicated words and word combinations involving antonyms, synonyms, hyponyms, gender, number, and the ‘waala’ morpheme have not been identified as MWEs in English. Rule-based approaches, statistical methods, and linguists’ approaches have been used for MWE identification and extraction. We present a methodology for the identification and extraction of Punjabi MWEs using statistical methods, rule-based methods, and linguists’ approaches.
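As a rough illustration of what a rule for replicated words might look like, the sketch below flags adjacent identical tokens as candidate replicated MWEs. It is a minimal sketch under stated assumptions, not the methodology of the paper: the function name `find_replicated_pairs`, the whitespace tokenization, and the sample sentence are all hypothetical choices made here for illustration.

```python
from collections import Counter


def find_replicated_pairs(text):
    """Count adjacent identical tokens (candidate replicated MWEs).

    Assumption: tokens are separated by whitespace and punctuation
    has already been stripped. This is an illustrative rule only.
    """
    tokens = text.split()
    pairs = Counter()
    # Compare each token with its immediate successor.
    for first, second in zip(tokens, tokens[1:]):
        if first == second:
            pairs[(first, second)] += 1
    return pairs


# Illustrative sentence chosen here, not taken from the paper's corpus.
sample = "ਉਹ ਹੌਲੀ ਹੌਲੀ ਤੁਰਦਾ ਹੈ"
print(find_replicated_pairs(sample))
```

A rule this simple would also flag accidental repetitions, so in practice its output would be a candidate list for statistical filtering or review by a linguist. Real text may additionally need Unicode normalization before tokens are compared.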



Authors

Kapil Dev Goyal
Department of Computer Science, Punjabi University, Patiala, India
Vishal Goyal
Department of Computer Science, Punjabi University, Patiala, India

