

A protein's family must be identified so that the protein can be modified and controlled for use in applications such as identifying drug-target interactions and structure prediction. Protein families are identified from the similarity between protein sequences. Alignment-free approaches use machine learning (ML) techniques for protein family prediction. In this study, two novel ML-based models for protein family prediction have been developed to improve the identification of protein families: a stacked framework of random forest, and a stacked framework of random forest, decision tree and naive Bayes. Both models outperform state-of-the-art methods, with accuracies of 98.21% and 98.49%, respectively. The proposed models also give better results on twilight-zone protein datasets.

Keywords

Alignment-Free Method, Machine Learning, Protein Family Prediction, Stacked Framework, Twilight-Zone Proteins.
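
As a rough illustration of what "alignment-free" means in practice, the sketch below turns a protein sequence into a fixed-length numeric vector without aligning it to any other sequence. The abstract does not say which features the proposed models use, so amino acid composition is an assumption chosen here for simplicity, and the names `AMINO_ACIDS` and `composition_vector` are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Assumed, illustrative feature set: the source abstract does not
    specify the features fed to its stacked models.
    """
    seq = sequence.upper()
    # Count only standard residues; ambiguous codes such as X or B are skipped.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # Empty or fully non-standard input: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Example: a short, made-up peptide mapped to a 20-dimensional vector.
features = composition_vector("MKVLAAGIVK")
```

Vectors of this kind could then be passed to any classifier, including stacked ensembles such as those named in the abstract; the feature choice above is only one of many possible alignment-free representations.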