Preprocessing-Techniques-for-Speaker-Recognition

The goal of this study is to construct a speaker verification model that recognises a speaker using multiple preprocessing steps, with an emphasis on a deep-learning-based model and MFCC features extracted from the speech waveforms. The pipeline applies voice activity detection, extracts Mel Frequency Cepstral Coefficients (MFCCs) as features, and saves the preprocessed data to disk so that the preprocessing does not have to be repeated on later runs. The preprocessed data is fed to a 3-layer CNN that also uses batch normalization and max-pooling; max-pooling downsamples the output of each convolutional layer by a factor of two. The activation function is the rectified linear unit (ReLU), and L2 regularization is used to counter overfitting. Tuning hyperparameters such as the learning rate and the number of epochs was challenging, as small changes caused large swings in performance.

The overall workflow first loads the dataset and splits it into train, validation, and test sets, then builds the CNN model, trains it, evaluates it on the test split, and finally saves the model for later use.
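The repository's own VAD code is not reproduced here; as an illustration of the voice activity detection step, the sketch below uses a simple frame-energy threshold in NumPy (the function name, frame sizes, and the -35 dB relative threshold are all illustrative assumptions, not the paper's exact method):

```python
import numpy as np

def energy_vad(signal, sr, frame_ms=25, hop_ms=10, threshold_db=-35.0):
    """Keep samples whose frame log-energy is within threshold_db of the
    loudest frame -- a minimal energy-based voice activity detector."""
    frame_len = int(sr * frame_ms / 1000)
    hop_len = int(sr * hop_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop_len)
    energies = np.array([
        np.sum(signal[i * hop_len:i * hop_len + frame_len] ** 2)
        for i in range(n_frames)
    ])
    log_e = 10 * np.log10(energies + 1e-12)          # avoid log(0)
    mask = log_e > (log_e.max() + threshold_db)      # relative threshold
    keep = np.zeros(len(signal), dtype=bool)
    for i in np.flatnonzero(mask):                   # voiced frames only
        keep[i * hop_len:i * hop_len + frame_len] = True
    return signal[keep], mask

# Example: 1 s of silence followed by 1 s of a 440 Hz tone at 16 kHz
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
audio = np.concatenate([np.zeros(sr), 0.5 * np.sin(2 * np.pi * 440 * t)])
voiced, mask = energy_vad(audio, sr)
```

On this toy input the silent first half is dropped and roughly the tone half is kept.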
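For the MFCC feature extraction and caching steps, a minimal sketch using librosa is shown below (assuming librosa is installed; `n_mfcc=13` and the cache filename are illustrative choices, not taken from the repository):

```python
import numpy as np
import librosa

def extract_mfcc(y, sr, n_mfcc=13):
    """Extract MFCCs from a waveform; returns shape (n_mfcc, n_frames)."""
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

# Synthetic 1 s waveform standing in for a real utterance
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
y = (0.5 * np.sin(2 * np.pi * 440 * t)).astype(np.float32)

mfcc = extract_mfcc(y, sr)
np.save("mfcc_sample.npy", mfcc)  # cache so preprocessing runs only once
```

Saving the features with `np.save` and reloading them with `np.load` on later runs is what lets the pipeline skip the preprocessing step after the first pass.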
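The 3-layer CNN described above (batch normalization, ReLU, 2x2 max-pooling, L2 regularization) could be sketched in Keras as follows; the filter counts, input shape, number of speakers, and learning rate are illustrative assumptions, not the repository's exact values:

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_cnn(input_shape, n_speakers, l2=1e-3):
    """3 conv blocks of Conv2D + BatchNorm + ReLU + 2x2 max-pooling,
    with L2 weight regularization, followed by a softmax classifier."""
    model = tf.keras.Sequential([tf.keras.Input(shape=input_shape)])
    for filters in (32, 64, 128):                     # 3 conv layers
        model.add(layers.Conv2D(filters, 3, padding="same",
                                kernel_regularizer=regularizers.l2(l2)))
        model.add(layers.BatchNormalization())
        model.add(layers.ReLU())
        model.add(layers.MaxPooling2D(pool_size=2))   # downsample by 2
    model.add(layers.Flatten())
    model.add(layers.Dense(n_speakers, activation="softmax",
                           kernel_regularizer=regularizers.l2(l2)))
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# e.g. 13 MFCCs x 128 frames x 1 channel, 10 enrolled speakers
model = build_cnn(input_shape=(13, 128, 1), n_speakers=10)
```

Training with `model.fit(...)` on the train/validation splits, evaluating with `model.evaluate(...)` on the test split, and calling `model.save(...)` would complete the workflow described above.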
