
Biocomputing: Analysis of Protein Sequence with Application of Data-mining Tasks


 

In biocomputing, data mining (DM) techniques are widely used to predict protein structure. Biological data are voluminous and complex to interpret, which makes data mining concepts essential. Molecular data such as DNA and protein sequences, gene expression levels, biochemical pathways, biomarkers and protein structures constitute a major part of biological data. This paper discusses how standard data mining techniques, such as extraction of protein data, segregation by clustering, association and visualization, can be applied to a protein sequence dataset.

Keywords

Biocomputing, Molecular data, Protein sequences, Data mining, Clustering
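
As an illustration of the "segregation by clustering" task named in the abstract, the following is a minimal sketch and not the authors' method. It assumes each protein sequence is summarized by its amino-acid composition and that two sequences belong together when their composition vectors are close in Euclidean distance. The function names, the threshold value and the toy sequences are all hypothetical and chosen only for the example.

```python
from collections import Counter
from itertools import combinations
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the 20-dimensional amino-acid frequency vector of a sequence.

    Characters outside the 20 standard residues are ignored.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def cluster(sequences, threshold=0.3):
    """Single-linkage clustering on composition vectors.

    Two sequences are linked when the Euclidean distance between their
    composition vectors is at most `threshold` (an assumed value).
    """
    names = list(sequences)
    vectors = {n: composition(sequences[n]) for n in names}
    parent = {n: n for n in names}  # union-find forest

    def find(n):
        # Walk up to the root, halving the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if dist(vectors[a], vectors[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy sequences, invented for illustration only.
toy = {
    "seq1": "AAAAGGGGAAAG",
    "seq2": "AAAGGGGAAAGG",
    "seq3": "KKKKEEEEKKKE",
    "seq4": "KKKEEEEKKKEE",
}
clusters = cluster(toy)
print(clusters)
```

On this toy input the alanine/glycine-rich sequences (seq1, seq2) fall into one group and the lysine/glutamate-rich sequences (seq3, seq4) into another. Composition vectors ignore residue order, so a real analysis would more likely use alignment-based or k-mer-based distances.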



Authors

R. Deepalakshmi
Department of Computer Science, Presidency College, Chennai-600005, Tamil Nadu, India
C. Jothi Venkateswaran
Reader & Head, Dept. of Computer Science, Presidency College, Tamil Nadu, India

