
Real Time Implementation of Speaker Recognition System with MFCC and Neural Networks on FPGA


Affiliations
1 School of Electronics Engineering, VIT University, Chennai-600127, Tamil Nadu, India
 

Background: Speaker recognition systems play a pivotal role in forensics, security, and biometric authentication, where they verify or identify a speaker from a group of speakers.

Methods: This paper briefly introduces the development of a hardware-based speaker recognition system. It uses Mel Frequency Cepstral Coefficients (MFCC), extracted from the input speech signal on the mel scale (roughly linear at low frequencies and logarithmic at higher frequencies), together with a perceptron neural network whose layer weights are used to verify the speaker's identity against a database of stored speaker identities.

Findings: The input speech is blocked into frames and windowed to reduce noise, and the audio samples are stored in RAM. The sampled data is converted to the frequency domain using an FFT to obtain the cepstral coefficients, which are normalised and fed to the Neural Network Toolbox in MATLAB to obtain layer weights for the given data set; the output is compared with the saved speaker identities to find a match. The decision-making logic runs on the NIOS II processor of the FPGA, where the input features are compared with the existing database of speaker identities using the perceptron layer weights, which give the nearest match among the group of speakers. The system was tested with two reference speakers, whose spoken vowels were compared with the speaker database already stored in the FPGA.

Conclusion/Improvements: The probability of detecting the speaker is 80%, and speaker verification is reported to be more accurate in the hardware-based system than in software-based systems, whose performance is lower. Performance can be improved by retraining the neural network, which can raise the detection rate to nearly 90%.
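The feature pipeline described in the abstract (blocking, windowing, FFT, mel-scale cepstral coefficients, then a perceptron decision) can be sketched roughly as follows. This is an illustrative Python sketch only, not the authors' MATLAB or NIOS II code. The sample rate (8 kHz), frame size (256), hop (128), filter count (20), coefficient count (13), the Hamming window, and all function names are assumptions chosen for the example; the abstract does not state them. The weights passed to `identify` are hypothetical and would come from offline training.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (window type is an assumption)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frames(signal, size, hop):
    """Blocking: split the signal into overlapping frames of `size` samples."""
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, hop)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(rate / 2)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / rate)) for f in edges]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling slope
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def dct(x, n_out):
    """Unnormalised DCT-II, keeping the first n_out coefficients."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi * k * (j + 0.5) / n) for j in range(n))
            for k in range(n_out)]


def mfcc(signal, rate=8000, size=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of cepstral coefficients per frame."""
    window = hamming(size)
    bank = mel_filterbank(n_filters, size, rate)
    feats = []
    for frame in frames(signal, size, hop):
        spectrum = fft([s * w for s, w in zip(frame, window)])
        power = [abs(c) ** 2 / size for c in spectrum[:size // 2 + 1]]
        # Small constant avoids log(0) for empty or silent bands.
        energies = [math.log(sum(p * h for p, h in zip(power, row)) + 1e-10)
                    for row in bank]
        feats.append(dct(energies, n_coeffs))
    return feats


def identify(features, weights, biases):
    """Single-layer perceptron decision: index of the highest-scoring speaker."""
    scores = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return scores.index(max(scores))
```

In a setup like the one the abstract describes, the per-frame coefficients would be normalised and the trained layer weights stored on the device, so that only a function like `identify` needs to run at recognition time.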

Keywords

Artificial Neural Networks, FPGA, MFCC, NIOS II, Speaker Recognition

Authors

Bhanuprathap Kari
School of Electronics Engineering, VIT University, Chennai-600127, Tamil Nadu, India
S. Muthulakshmi
School of Electronics Engineering, VIT University, Chennai-600127, Tamil Nadu, India


DOI: https://doi.org/10.17485/ijst%2F2015%2Fv8i19%2F138176