
Author profile prediction using pivoted unique term normalization


Affiliations
1 Department of IT, Vardhaman College of Engineering, Shamshabad, Hyderabad - 500018, Telangana, India
2 Department of CSE, JNTUH College of Engineering, Karimnagar - 505501, Telangana, India
3 Department of CSE, Matrusri Engineering College, Hyderabad - 500059, Telangana, India
 

Author profiling is a text classification technique that predicts the demographic characteristics of authors by analyzing their written texts. It has become popular in several information technology enabled applications, such as marketing, forensic analysis, psychology and entertainment. In the reviews domain, most authors write reviews of products without disclosing any details about themselves. In this context, author profiling helps to infer characteristics such as gender, age, native language, educational background, location and personality traits from the text alone. Most existing approaches use lexical, content-based, structural, syntactic and semantic features to differentiate the writing styles of authors. These approaches suffer from high-dimensional feature spaces and fail to capture the relationships between features. This paper proposes a new approach that addresses the high-dimensionality problem by aggregating term weights to compute the weight of a document against each author profile. The approach was evaluated on the reviews domain for predicting the gender and age group of authors, with accuracy as the measure.
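
The idea of aggregating term weights into a document weight per profile can be illustrated with a minimal sketch. It assumes the standard information-retrieval form of pivoted unique normalization, in which a term weight (1 + log tf) / (1 + log avg_tf) is divided by (1 - slope) * pivot + slope * (number of unique terms in the document). The abstract does not give the paper's exact weighting or aggregation, so the slope value of 0.2, the profile term sets and all function names below are hypothetical and only for illustration.

```python
import math
from collections import Counter

SLOPE = 0.2  # assumed value; not taken from the paper


def pivoted_unique_weights(tokens, pivot, slope=SLOPE):
    """Weight each term of one document with pivoted unique normalization.

    pivot is assumed to be the average number of unique terms per
    document in the collection.
    """
    counts = Counter(tokens)
    if not counts:
        return {}
    avg_tf = sum(counts.values()) / len(counts)
    # Normalization factor grows with the number of unique terms.
    norm = (1.0 - slope) * pivot + slope * len(counts)
    return {
        term: ((1.0 + math.log(tf)) / (1.0 + math.log(avg_tf))) / norm
        for term, tf in counts.items()
    }


def document_weight(tokens, profile_terms, pivot, slope=SLOPE):
    """Aggregate (here: sum) the weights of terms that belong to a profile."""
    weights = pivoted_unique_weights(tokens, pivot, slope)
    return sum(w for term, w in weights.items() if term in profile_terms)


def predict_profile(tokens, profiles, pivot):
    """Return the profile name with the largest document weight."""
    return max(
        profiles,
        key=lambda name: document_weight(tokens, profiles[name], pivot),
    )


# Hypothetical usage: two toy profiles described by small term sets.
review = ["great", "phone", "great", "battery"]
toy_profiles = {"profile_a": {"great"}, "profile_b": {"phone"}}
predicted = predict_profile(review, toy_profiles, pivot=3.0)
```

Each document is reduced to one number per profile, so the classifier works with as many features as there are profiles rather than one feature per vocabulary term, which is the dimensionality reduction the abstract refers to.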

Keywords

Accuracy, Age Prediction, Author Profiling, Document Weight, Gender Prediction, Pivoted Unique Term Normalization.

Authors

T. Raghunadha Reddy
Department of IT, Vardhaman College of Engineering, Shamshabad, Hyderabad - 500018, Telangana, India
B. Vishnu Vardhan
Department of CSE, JNTUH College of Engineering, Karimnagar - 505501, Telangana, India
P. Vijayapal Reddy
Department of CSE, Matrusri Engineering College, Hyderabad - 500059, Telangana, India



DOI: https://doi.org/10.17485/ijst%2F2016%2Fv9i46%2F129373