An analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition

Language recognition systems based on bottleneck features have recently become the state of the art in this research field, as shown by their success in the last Language Recognition Evaluation (LRE 2015) organized by NIST (U.S. National Institute of Standards and Technology). This type of system is based...

Bibliographic Details
Main authors:	Lozano-Diez, Alicia; Zazo, Ruben; Toledano, Doroteo T.; Gonzalez-Rodriguez, Joaquin
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2017
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5552160/ https://www.ncbi.nlm.nih.gov/pubmed/28796806 http://dx.doi.org/10.1371/journal.pone.0182580

author	Lozano-Diez, Alicia; Zazo, Ruben; Toledano, Doroteo T.; Gonzalez-Rodriguez, Joaquin
author_sort	Lozano-Diez, Alicia
collection	PubMed
description	Language recognition systems based on bottleneck features have recently become the state of the art in this research field, as shown by their success in the last Language Recognition Evaluation (LRE 2015) organized by NIST (U.S. National Institute of Standards and Technology). This type of system is based on a deep neural network (DNN) trained to discriminate between phonetic units, i.e. trained for the task of automatic speech recognition (ASR). This DNN aims to compress information in one of its layers, known as the bottleneck (BN) layer, which is used to obtain a new frame-level representation of the audio signal. This representation has proven useful for the task of language identification (LID). Thus, bottleneck features are used as input to the language recognition system instead of a classical parameterization of the signal based on cepstral feature vectors such as MFCCs (Mel Frequency Cepstral Coefficients). Despite the success of this approach in language recognition, there is a lack of studies that systematically analyze how the topology of the DNN influences the performance of bottleneck feature-based language recognition systems. In this work, we try to fill this gap by analyzing language recognition results with different topologies for the DNN used to extract the bottleneck features, comparing them with each other and against a reference system based on a more classical cepstral representation of the input signal with a total variability model. In this way, we obtain useful knowledge about how the DNN configuration influences the performance of bottleneck feature-based language recognition systems.
format	Online Article Text
id	pubmed-5552160
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-5552160 2017-08-25 PLoS One Research Article Public Library of Science 2017-08-10 /pmc/articles/PMC5552160/ /pubmed/28796806 http://dx.doi.org/10.1371/journal.pone.0182580 Text en © 2017 Lozano-Diez et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	An analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition
topic	Research Article
