
A Robust Speaker Identification System Using the Responses from a Model of the Auditory Periphery

Speaker identification under noisy conditions is one of the challenging topics in the field of speech processing applications. Motivated by the fact that the neural responses are robust against noise, this paper proposes a new speaker identification system using 2-D neurograms constructed from the r...


Bibliographic Details
Main authors: Islam, Md. Atiqul; Jassim, Wissam A.; Cheok, Ng Siew; Zilany, Muhammad Shamsul Arefeen
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4938550/
https://www.ncbi.nlm.nih.gov/pubmed/27392046
http://dx.doi.org/10.1371/journal.pone.0158520
_version_ 1782441877412249600
author Islam, Md. Atiqul
Jassim, Wissam A.
Cheok, Ng Siew
Zilany, Muhammad Shamsul Arefeen
collection PubMed
description Speaker identification under noisy conditions is a challenging problem in speech processing. Motivated by the robustness of neural responses to noise, this paper proposes a new speaker identification system using 2-D neurograms constructed from the responses of a physiologically based computational model of the auditory periphery. The responses of auditory-nerve fibers to speech signals were simulated over a wide range of characteristic frequencies to construct the neurograms. The neurogram coefficients were used to train a Gaussian mixture model-universal background model (GMM-UBM) classifier, which generates an identity model for each speaker. Three text-independent speaker databases and one text-dependent speaker database were used to test the identification performance of the proposed method. The robustness of the method was also investigated using speech signals distorted by three types of noise (white Gaussian, pink, and street noise) at different signal-to-noise ratios. The identification results of the proposed neural-response-based method were compared with those of traditional speaker identification methods using features such as Mel-frequency cepstral coefficients, Gammatone frequency cepstral coefficients, and frequency domain linear prediction. Although the classification accuracy of the proposed method was comparable to that of the traditional techniques in quiet, the new feature yielded lower classification error rates in noisy environments.
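The description mentions speech distorted by noise at different signal-to-noise ratios. As an illustration only, the sketch below shows one common way to mix a noise recording into a clean signal at a target SNR in dB. It is not taken from the article: the function names `mean_power`, `add_noise_at_snr`, and `measured_snr_db` are hypothetical, the sine wave is a stand-in for speech, and the 5 dB value is an arbitrary example.

```python
import math
import random


def mean_power(x):
    """Average power (mean of squared samples) of a sequence."""
    return sum(v * v for v in x) / len(x)


def add_noise_at_snr(signal, noise, snr_db):
    """Scale `noise` so the signal-to-noise power ratio equals `snr_db`, then add it."""
    if len(noise) < len(signal):
        raise ValueError("noise must be at least as long as the signal")
    noise = noise[: len(signal)]
    # SNR_dB = 10 * log10(P_signal / (scale^2 * P_noise)), solved for scale.
    scale = math.sqrt(mean_power(signal) / (mean_power(noise) * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """Recover the SNR in dB from a clean signal and its noisy version."""
    residual = [y - x for x, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


# Assumed example data: a 220 Hz tone sampled at 16 kHz stands in for speech.
random.seed(0)
sample_rate = 16000
clean = [math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(sample_rate)]
white = [random.gauss(0.0, 1.0) for _ in range(len(clean))]

noisy = add_noise_at_snr(clean, white, snr_db=5.0)
```

The same scaling applies to any noise recording (for example pink or street noise), since only the ratio of average powers is used.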
format Online
Article
Text
id pubmed-4938550
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4938550 2016-07-22 PLoS One Research Article
Public Library of Science 2016-07-08 /pmc/articles/PMC4938550/ /pubmed/27392046 http://dx.doi.org/10.1371/journal.pone.0158520 Text en © 2016 Islam et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A Robust Speaker Identification System Using the Responses from a Model of the Auditory Periphery
topic Research Article