
Comparison of Different Attributes of Authorship Data using Data Mining Approach


Affiliations
1 Amity School of Engineering and Technology, Amity University, Amity Campus Sector –125, Noida –201303, Uttar Pradesh, India
2 School of Computer Science and Information Technology, University of Hyderabad, Central University P.O., Prof. C.R.Rao Road, Gachibowli, Hyderabad–500046, India
 

In recent years, the rapid increase in Internet usage has generated data that is huge and unstructured. These data can be interpreted with various data mining techniques, and many useful patterns can be extracted from them. Classifying these data to support meaningful analysis is the key concept behind this study. In this paper, authorship data for books was used. A data set was created in which various attributes of users were stored along with the books that they like to read. Naïve Bayes was applied to the data set to determine which factor most strongly affects the ratings of the books. The attributes were compared using a data mining tool, and the rating of a book was found to depend strongly on the location of the user. This finding was also verified using the measures of precision and recall. Higher precision corresponds to greater accuracy of the system.
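The attribute comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation or their data mining tool: the records, the attribute names (`location`, `age_group`), the rating labels, and all function names below are hypothetical, chosen only to show how a categorical Naive Bayes classifier might be trained on one attribute at a time and scored with precision and recall. For brevity the sketch evaluates on the training records, whereas a real study would hold out test data.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy records (location, age_group, rating); not data from the paper.
RECORDS = [
    ("north", "young", "high"),
    ("north", "young", "high"),
    ("north", "young", "high"),
    ("north", "adult", "high"),
    ("south", "young", "low"),
    ("south", "adult", "low"),
    ("south", "adult", "low"),
    ("south", "adult", "low"),
]


def train_naive_bayes(rows, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model with Laplace smoothing (alpha)."""
    n = len(labels)
    class_counts = Counter(labels)
    n_features = len(rows[0])
    # counts[j][label][value] = how often feature j took `value` within `label`
    counts = [defaultdict(Counter) for _ in range(n_features)]
    vocab = [set() for _ in range(n_features)]
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            counts[j][label][value] += 1
            vocab[j].add(value)

    def predict(row):
        best_label, best_score = None, -math.inf
        for label in sorted(class_counts):
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(class_counts[label] / n)
            for j, value in enumerate(row):
                num = counts[j][label][value] + alpha
                den = class_counts[label] + alpha * len(vocab[j])
                score += math.log(num / den)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    return predict


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def compare_attributes(records, attribute_names, positive="high"):
    """Train one model per attribute and score each on the same records."""
    labels = [record[-1] for record in records]
    results = {}
    for i, name in enumerate(attribute_names):
        rows = [(record[i],) for record in records]
        predict = train_naive_bayes(rows, labels)
        predictions = [predict(row) for row in rows]
        results[name] = precision_recall(labels, predictions, positive)
    return results


if __name__ == "__main__":
    for name, (p, r) in compare_attributes(RECORDS, ["location", "age_group"]).items():
        print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this toy data the rating is fully determined by location, so the location-only model scores higher on both measures than the age-group model. That mirrors the kind of comparison the abstract reports, but it says nothing about the paper's actual results.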

Keywords

Authorship Data, Information Retrieval, Naïve Bayes, Precision, Recall.

Authors

Parul Kalra
Amity School of Engineering and Technology, Amity University, Amity Campus Sector –125, Noida –201303, Uttar Pradesh, India
Navjot Kaur Walia
Amity School of Engineering and Technology, Amity University, Amity Campus Sector –125, Noida –201303, Uttar Pradesh, India
Deepti Mehrotra
Amity School of Engineering and Technology, Amity University, Amity Campus Sector –125, Noida –201303, Uttar Pradesh, India
Abdul Wahid
School of Computer Science and Information Technology, University of Hyderabad, Central University P.O., Prof. C.R.Rao Road, Gachibowli, Hyderabad–500046, India


DOI: https://doi.org/10.17485/ijst%2F2016%2Fv9i45%2F128480