
Simulated and Self-Sustained Classification of Twitter Data based on its Sentiment


Affiliations
1 Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore - 641112, Tamil Nadu, India
 

We present a method for automatically classifying the sentiment of Twitter messages. Microblogs are a challenging new source of data for data mining techniques. The aim of this paper is to determine the precise sentiment of data from the microblogging site Twitter. Tweets often contain URLs to other websites. They also contain a certain number of OOV (Out-Of-Vocabulary) words, such as hashtags, a tagging system for topics that allows Tweets in a similar vein of conversation to be found. Other OOV words include mentions, a mechanism for directing a Tweet to one or more users. The KH Coder tool gives a standard accuracy result; the text is POS-tagged, and MySQL is used to store the details in a database. The R tool is used to view the statistical analysis of the data. In addition, machine learning algorithms have been applied. A preprocessing and feature selection method in combination with Maximum Entropy, Naive Bayes, and Decision Tree classifiers is presented, and reasonable results were obtained. The accuracy of the machine learning methods for sentiment classification is compared, and a statistical representation of the classes is depicted using KH Coder.
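
The abstract does not give implementation details, so the following is only a minimal sketch of the kind of pipeline it describes: URLs, mentions, and hashtags are normalized before a Naive Bayes classifier is trained. The placeholder tokens (URL, USER, HASHTAG_...), the function names, the add-one smoothing, and the four toy tweets are all assumptions made for illustration; none of them come from the paper.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical patterns for the OOV items named in the abstract.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")


def normalize_tweet(text):
    """Lowercase a tweet, map URLs/mentions/hashtags to placeholder tokens, and tokenize."""
    text = text.lower()
    text = URL_RE.sub(" URL ", text)              # URLs first, since they may contain # or @
    text = MENTION_RE.sub(" USER ", text)
    text = HASHTAG_RE.sub(r" HASHTAG_\1 ", text)  # keep the tag word as a feature
    return re.findall(r"[A-Za-z_]+", text)


def train_naive_bayes(labeled_tweets):
    """Collect per-class document and token counts from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_tweets:
        tokens = normalize_tweet(text)
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in normalize_tweet(text):
            if token in vocab:  # tokens never seen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data invented for this sketch; not taken from the paper.
TOY_TWEETS = [
    ("I love this phone #happy", "positive"),
    ("What a great day @alice", "positive"),
    ("I hate waiting http://example.com", "negative"),
    ("This is a terrible, awful service", "negative"),
]
model = train_naive_bayes(TOY_TWEETS)
```

Calling `predict(model, "love this great day")` on this toy model classifies the text by comparing smoothed token likelihoods per class. A Maximum Entropy or Decision Tree classifier could be trained on the same token features for the accuracy comparison the abstract mentions.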

Keywords

Data Mining, Decision Tree, Maximum Entropy, Microblogging, Naive Bayes, OOV, Sentiment Classification.

Authors

S. N. Vinithra
Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore - 641112, Tamil Nadu, India
S. J. Arun Selvan
Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore - 641112, Tamil Nadu, India
M. Anand Kumar
Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore - 641112, Tamil Nadu, India
K. P. Soman
Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore - 641112, Tamil Nadu, India




DOI: https://doi.org/10.17485/ijst%2F2015%2Fv8i24%2F141606