Mining the Amino Acid Dominance in Gene Sequences

V. Balamurugan; T. Marimuthu

doi:10.17485/ijst/2015/v8i14/75241

Affiliations
1 Department of Information Technology, AMET University, Chennai - 603112, Tamil Nadu, India
2 Manonmaniam Sundaranar University, Tirunelveli - 627012, Tamil Nadu, India

Abstract

Classification techniques have recently been widely applied in bioinformatics. The proposed Amino Acid Component based Classification algorithm adopts the Iterative Dichotomiser 3 (ID3) classifier. The algorithm consists of two phases: attribute selection and component-based classification. The attribute selection phase identifies the dominating amino acids and the amino acid deficiencies that cause the diseases. The second phase finds the amino acid components that spread the diseases in the specified sequence. Experiments were carried out on dengue virus gene sequences available in the NCBI online biological database, and the accuracy of the proposed algorithm was calculated as 90.744%. The proposed classification algorithm is compared with traditional benchmark algorithms: Naive Bayes, ID3, Random Forest, Multilayer Perceptron and J48. The results of this work can be used by drug designers to predict new viral diseases.
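
The abstract does not give the details of the attribute selection phase, so the following is only a minimal sketch of how an ID3-style selection over amino acids might look, not the authors' implementation. It assumes that each protein sequence is summarized by the fraction of each of the 20 standard amino acids, that each fraction is discretized into "high" or "low" with a hypothetical `threshold`, and that amino acids are then ranked by ID3 information gain against the class labels. The function names (`composition`, `information_gain`, `rank_amino_acids`) and the threshold value are illustrative assumptions.

```python
from collections import Counter
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard one-letter codes


def composition(protein):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(ch for ch in protein.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """ID3 information gain of one categorical attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy that remains after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_amino_acids(proteins, labels, threshold=0.10):
    """Rank amino acids by information gain, highest first.

    `threshold` is an assumed cut-off, not taken from the paper: a fraction
    at or above it is treated as "high", anything below it as "low".
    """
    comps = [composition(p) for p in proteins]
    gains = {}
    for aa in AMINO_ACIDS:
        values = ["high" if c[aa] >= threshold else "low" for c in comps]
        gains[aa] = information_gain(values, labels)
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_amino_acids(proteins, labels)` with a list of protein strings and a parallel list of class labels returns `(amino_acid, gain)` pairs. Under the stated assumptions, the top-ranked entries would be the candidates for "dominating" amino acids, and a full ID3 tree would repeat the same selection recursively on each split.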