Spoken English Digit Classification Using Supervised Learning

Maddimsetti Srinivas; Kasiprasad Mannepalli; G. L. P. Ashok

Affiliations
Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur (Dt), Andhra Pradesh, India

Abstract

Multiclass classification is a fundamental problem for many speech recognition systems. Spoken digit recognition is a multiclass problem with 10 classes. This paper applies Support Vector Machine (SVM), K-Nearest-Neighbour (KNN), and an ensemble method, Random Forest (RF), to English digit classification. The Caffe speech dataset of 2400 input instances (15 speakers × 16 repetitions × 10 digits) is used for the experiments. Mel Frequency Cepstral Coefficient (MFCC) features are extracted for all input instances. The dataset is divided into a training set and a testing set, with 10%, 30%, and 50% of the dataset held out as the testing set. Confusion matrices are generated for every test split and every classification method. The ensemble method outperforms SVM and KNN at different numbers of frames. The highest accuracy, 97.50%, is achieved by the RF method with 10% testing data.
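
The evaluation protocol described in the abstract (2400 instances, three test-set fractions, one confusion matrix per split) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the feature vectors are random synthetic stand-ins for MFCC features, the 13-dimensional feature size is an assumption, only a plain KNN classifier is shown (SVM and RF are omitted), and all function names are hypothetical.

```python
import heapq
import math
import random
from collections import Counter

NUM_SPEAKERS, NUM_REPETITIONS, NUM_DIGITS = 15, 16, 10  # from the abstract
FEATURE_DIM = 13  # assumption: a common MFCC vector size, not stated in the abstract


def make_synthetic_dataset(seed=0):
    """Return (features, labels); features are random stand-ins for MFCC vectors."""
    rng = random.Random(seed)
    # One hypothetical cluster centre per digit class.
    centres = [[rng.uniform(-5, 5) for _ in range(FEATURE_DIM)]
               for _ in range(NUM_DIGITS)]
    features, labels = [], []
    for _speaker in range(NUM_SPEAKERS):
        for _repetition in range(NUM_REPETITIONS):
            for digit in range(NUM_DIGITS):
                features.append(tuple(c + rng.gauss(0, 1) for c in centres[digit]))
                labels.append(digit)
    return features, labels


def split(features, labels, test_fraction, seed=0):
    """Shuffle the instances and hold out test_fraction of them for testing."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)

    def pick(ids):
        return [features[i] for i in ids], [labels[i] for i in ids]

    return pick(indices[n_test:]), pick(indices[:n_test])


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training vectors nearest to x (Euclidean distance)."""
    nearest = heapq.nsmallest(
        k, ((math.dist(row, x), y) for row, y in zip(train_x, train_y)))
    return Counter(y for _, y in nearest).most_common(1)[0][0]


def confusion_matrix(true_y, pred_y):
    """Rows index the true digit, columns index the predicted digit."""
    matrix = [[0] * NUM_DIGITS for _ in range(NUM_DIGITS)]
    for t, p in zip(true_y, pred_y):
        matrix[t][p] += 1
    return matrix


features, labels = make_synthetic_dataset()
results = {}
for test_fraction in (0.10, 0.30, 0.50):
    (train_x, train_y), (test_x, test_y) = split(features, labels, test_fraction)
    predictions = [knn_predict(train_x, train_y, x) for x in test_x]
    cm = confusion_matrix(test_y, predictions)
    accuracy = sum(cm[d][d] for d in range(NUM_DIGITS)) / len(test_y)
    results[test_fraction] = (cm, accuracy)
    print(f"test={test_fraction:.0%}  n_test={len(test_y)}  accuracy={accuracy:.2%}")
```

The diagonal of each confusion matrix counts correctly classified digits, so accuracy is the diagonal sum divided by the test-set size. Accuracies printed here reflect only the synthetic data and say nothing about the figures reported in the paper.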

Keywords

Caffe, Ensemble Methods, KNN, MFCC, Random Forest (RF), Spoken English Digit, SVM.

International Journal of Research in Signal Processing, Computing & Communication System Design