
Medical Health Posts Summarization using Lesk Algorithm


Affiliations
1 IT Dept., DYPCOE Akurdi, Pune, Maharashtra, India
2 IT Dept., DYCOE, Akurdi, Pune, Maharashtra, India
3 Comp. Engg. Dept., MITAOE, Alandi, Pune, Maharashtra, India
     



Today’s world runs on information, most of it online. With the growth of the internet, communication channels such as email, forums, and social networking sites have quickly become important information sources. Handling such large amounts of data is costly in both time and storage. Text summarization is a technique for extracting the important portions of a text. Traditionally, these portions are selected based on features such as keyword frequency, sentence position, the style in which words are written, and the presence of title keywords. An extractive summary is produced by concatenating several sentences taken exactly as they appear in the text, with the sentences selected by a scoring technique. Our approach uses a modified version of the simplified Lesk algorithm and is intended for medical health posts. Sentences carrying important information from a medical perspective are arranged in decreasing order of weight, and the number of sentences returned as the summary is determined by an input percentage. We compared the results with summaries written by a human expert, and the proposed approach gives promising results.
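The abstract does not spell out the modification made to the simplified Lesk algorithm, so the following is only a minimal sketch of one plausible reading of the pipeline: each sentence is weighted by the overlap between the glosses of the medical terms it contains and the rest of the sentence, sentences are sorted by decreasing weight, and the top share given by an input percentage is returned. The gloss dictionary, stopword list, sample post, and all function names are hypothetical stand-ins (the paper's keywords point to UMLS as the actual knowledge source), and rounding the sentence count up is an assumption.

```python
import math
import re

# Hypothetical mini gloss dictionary standing in for a medical resource such as UMLS.
# The entries are invented for illustration only.
MEDICAL_GLOSSES = {
    "insomnia": "difficulty falling asleep or staying asleep at night",
    "ambien": "sedative drug prescribed to treat insomnia and sleep problems",
    "headache": "pain in the head sometimes caused by drug side effects",
}

# Small illustrative stopword list (assumption, not from the paper).
STOPWORDS = {
    "a", "an", "and", "at", "by", "every", "for", "get", "i", "in",
    "my", "of", "or", "since", "the", "then", "to", "was", "with",
}

SAMPLE_POST = (
    "I started taking Ambien last month for insomnia and sleep problems. "
    "My dog likes the park. "
    "Since then I get a headache with pain in my head every morning. "
    "The weather was nice yesterday."
)


def tokenize(text):
    """Return the set of lowercase alphabetic words in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_weight(sentence, glosses):
    """Lesk-style weight: for every term in the sentence that has a gloss,
    count the words shared by that gloss and the rest of the sentence."""
    words = tokenize(sentence)
    weight = 0
    for term in words:
        if term in glosses:
            context = words - {term}
            weight += len(tokenize(glosses[term]) & context)
    return weight


def summarize(text, percentage, glosses=MEDICAL_GLOSSES):
    """Return the top `percentage` percent of sentences, highest weight first."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    # Assumption: round the sentence count up and always keep at least one.
    count = max(1, math.ceil(len(sentences) * percentage / 100))
    # sorted() is stable, so equally weighted sentences keep their original order.
    ranked = sorted(sentences, key=lambda s: sentence_weight(s, glosses), reverse=True)
    return ranked[:count]


for line in summarize(SAMPLE_POST, 50):
    print(line)
```

With a real knowledge source, the gloss lookup would be replaced by concept definitions retrieved for each recognized medical term; the ranking and percentage cut-off would stay the same.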

Keywords

Health Posts, Lesk Algorithm, Summarization, UMLS.




Authors

Vinod L. Mane
IT Dept., DYPCOE Akurdi, Pune, Maharashtra, India
Ashwini Abhale
IT Dept., DYCOE, Akurdi, Pune, Maharashtra, India
Sumit Khandelwal
Comp. Engg. Dept., MITAOE, Alandi, Pune, Maharashtra, India

