Biomimetic multi-resolution analysis for robust speaker recognition

Bibliographic Details
Main authors: Nemala, Sridhar Krishna; Zotkin, Dmitry N; Duraiswami, Ramani; Elhilali, Mounya
Format: Online Article Text
Language: English
Published: 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6289187/
https://www.ncbi.nlm.nih.gov/pubmed/30546387
http://dx.doi.org/10.1186/1687-4722-2012-22
author Nemala, Sridhar Krishna
Zotkin, Dmitry N
Duraiswami, Ramani
Elhilali, Mounya
collection PubMed
description Humans exhibit a remarkable ability to reliably classify sound sources in the environment even in the presence of high levels of noise. In contrast, most engineering systems suffer a drastic drop in performance when speech signals are corrupted with channel or background distortions. Our brains are equipped with elaborate machinery for speech analysis and feature extraction, which holds great lessons for improving the performance of automatic speech processing systems under adverse conditions. The work presented here explores a biologically motivated multi-resolution speaker information representation obtained by performing an intricate yet computationally efficient analysis of the information-rich spectro-temporal attributes of the speech signal. We evaluate the proposed features in a speaker verification task performed on NIST SRE 2010 data. The biomimetic approach yields significant robustness in the presence of non-stationary noise and reverberation, offering a new framework for deriving reliable features for speaker recognition and speech processing.
format Online
Article
Text
id pubmed-6289187
institution National Center for Biotechnology Information
language English
publishDate 2012
record_format MEDLINE/PubMed
spelling pubmed-6289187
2018-12-11
Biomimetic multi-resolution analysis for robust speaker recognition
Nemala, Sridhar Krishna
Zotkin, Dmitry N
Duraiswami, Ramani
Elhilali, Mounya
EURASIP J Audio Speech Music Process
Article
2012-09-07
2012
/pmc/articles/PMC6289187/
/pubmed/30546387
http://dx.doi.org/10.1186/1687-4722-2012-22
Text
en
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Biomimetic multi-resolution analysis for robust speaker recognition
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6289187/
https://www.ncbi.nlm.nih.gov/pubmed/30546387
http://dx.doi.org/10.1186/1687-4722-2012-22