Biomimetic multi-resolution analysis for robust speaker recognition
Humans exhibit a remarkable ability to reliably classify sound sources in the environment even in presence of high levels of noise. In contrast, most engineering systems suffer a drastic drop in performance when speech signals are corrupted with channel or background distortions. Our brains are equi...
Main authors: | Nemala, Sridhar Krishna; Zotkin, Dmitry N; Duraiswami, Ramani; Elhilali, Mounya |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6289187/ https://www.ncbi.nlm.nih.gov/pubmed/30546387 http://dx.doi.org/10.1186/1687-4722-2012-22 |
_version_ | 1783379937904820224 |
---|---|
author | Nemala, Sridhar Krishna; Zotkin, Dmitry N; Duraiswami, Ramani; Elhilali, Mounya |
author_facet | Nemala, Sridhar Krishna; Zotkin, Dmitry N; Duraiswami, Ramani; Elhilali, Mounya |
author_sort | Nemala, Sridhar Krishna |
collection | PubMed |
description | Humans exhibit a remarkable ability to reliably classify sound sources in the environment even in presence of high levels of noise. In contrast, most engineering systems suffer a drastic drop in performance when speech signals are corrupted with channel or background distortions. Our brains are equipped with elaborate machinery for speech analysis and feature extraction, which hold great lessons for improving the performance of automatic speech processing systems under adverse conditions. The work presented here explores a biologically-motivated multi-resolution speaker information representation obtained by performing an intricate yet computationally-efficient analysis of the information-rich spectro-temporal attributes of the speech signal. We evaluate the proposed features in a speaker verification task performed on NIST SRE 2010 data. The biomimetic approach yields significant robustness in presence of non-stationary noise and reverberation, offering a new framework for deriving reliable features for speaker recognition and speech processing. |
format | Online Article Text |
id | pubmed-6289187 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
record_format | MEDLINE/PubMed |
spelling | pubmed-62891872018-12-11 Biomimetic multi-resolution analysis for robust speaker recognition Nemala, Sridhar Krishna Zotkin, Dmitry N Duraiswami, Ramani Elhilali, Mounya EURASIP J Audio Speech Music Process Article Humans exhibit a remarkable ability to reliably classify sound sources in the environment even in presence of high levels of noise. In contrast, most engineering systems suffer a drastic drop in performance when speech signals are corrupted with channel or background distortions. Our brains are equipped with elaborate machinery for speech analysis and feature extraction, which hold great lessons for improving the performance of automatic speech processing systems under adverse conditions. The work presented here explores a biologically-motivated multi-resolution speaker information representation obtained by performing an intricate yet computationally-efficient analysis of the information-rich spectro-temporal attributes of the speech signal. We evaluate the proposed features in a speaker verification task performed on NIST SRE 2010 data. The biomimetic approach yields significant robustness in presence of non-stationary noise and reverberation, offering a new framework for deriving reliable features for speaker recognition and speech processing. 2012-09-07 2012 /pmc/articles/PMC6289187/ /pubmed/30546387 http://dx.doi.org/10.1186/1687-4722-2012-22 Text en This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Article Nemala, Sridhar Krishna Zotkin, Dmitry N Duraiswami, Ramani Elhilali, Mounya Biomimetic multi-resolution analysis for robust speaker recognition |
title | Biomimetic multi-resolution analysis for robust speaker recognition |
title_full | Biomimetic multi-resolution analysis for robust speaker recognition |
title_fullStr | Biomimetic multi-resolution analysis for robust speaker recognition |
title_full_unstemmed | Biomimetic multi-resolution analysis for robust speaker recognition |
title_short | Biomimetic multi-resolution analysis for robust speaker recognition |
title_sort | biomimetic multi-resolution analysis for robust speaker recognition |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6289187/ https://www.ncbi.nlm.nih.gov/pubmed/30546387 http://dx.doi.org/10.1186/1687-4722-2012-22 |
work_keys_str_mv | AT nemalasridharkrishna biomimeticmultiresolutionanalysisforrobustspeakerrecognition AT zotkindmitryn biomimeticmultiresolutionanalysisforrobustspeakerrecognition AT duraiswamiramani biomimeticmultiresolutionanalysisforrobustspeakerrecognition AT elhilalimounya biomimeticmultiresolutionanalysisforrobustspeakerrecognition |