An Approach to overcome Imbalance Datasets of Eukaryotic Genomes during the Analysis by Machine Learning Technique (SVM)
In biology, Support Vector Machines (SVMs) are among the most frequently used tools for the analysis of gene expression, microarray experiments, and other biological applications. In a human genome dataset, only a small proportion of the DNA sequences represent genes; the rest do not. In this work, we highlight the reasons why SVMs in particular fail on such imbalanced datasets and what can be done to overcome this.
Keywords
Imbalanced Dataset, BioSVM