
Brain-inspired speech segmentation for automatic speech recognition using the speech envelope as a temporal reference

Speech segmentation is a crucial step in automatic speech recognition because additional speech analyses are performed for each framed speech segment. Conventional segmentation techniques primarily segment speech using a fixed frame size for computational simplicity. However, this approach is insuff...


Bibliographic Details
Main authors:	Lee, Byeongwook; Cho, Kwang-Hyun
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group 2016
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5120313/ https://www.ncbi.nlm.nih.gov/pubmed/27876875 http://dx.doi.org/10.1038/srep37647

author	Lee, Byeongwook Cho, Kwang-Hyun
collection	PubMed
description	Speech segmentation is a crucial step in automatic speech recognition because additional speech analyses are performed for each framed speech segment. Conventional segmentation techniques primarily segment speech using a fixed frame size for computational simplicity. However, this approach is insufficient for capturing the quasi-regular structure of speech, which causes substantial recognition failure in noisy environments. How does the brain handle quasi-regular structured speech and maintain high recognition performance under any circumstance? Recent neurophysiological studies have suggested that the phase of neuronal oscillations in the auditory cortex contributes to accurate speech recognition by guiding speech segmentation into smaller units at different timescales. A phase-locked relationship between neuronal oscillation and the speech envelope has recently been obtained, which suggests that the speech envelope provides a foundation for multi-timescale speech segmental information. In this study, we quantitatively investigated the role of the speech envelope as a potential temporal reference to segment speech using its instantaneous phase information. We evaluated the proposed approach by the achieved information gain and recognition performance in various noisy environments. The results indicate that the proposed segmentation scheme not only extracts more information from speech but also provides greater robustness in a recognition test.
format	Online Article Text
id	pubmed-5120313
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	Nature Publishing Group
record_format	MEDLINE/PubMed
spelling	pubmed-5120313 2016-11-28 Sci Rep Article Nature Publishing Group 2016-11-23 /pmc/articles/PMC5120313/ /pubmed/27876875 http://dx.doi.org/10.1038/srep37647 Text en Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
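
The description in this record states that speech is segmented using the instantaneous phase of the speech envelope rather than a fixed frame size. The following Python sketch illustrates that general idea only; it is not the authors' implementation. The function name `envelope_phase_segments`, the Hilbert-transform envelope, the 4 to 8 Hz band, and the choice to cut at phase wraps are all assumptions made for illustration, since the record does not specify them.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt


def envelope_phase_segments(signal, fs, band=(4.0, 8.0)):
    """Split `signal` at phase wraps of its band-limited amplitude envelope.

    Hypothetical helper: returns a list of (start, end) sample indices,
    with `end` exclusive. The default band is an assumed value.
    """
    # Amplitude envelope: magnitude of the analytic signal.
    envelope = np.abs(hilbert(signal))

    # Keep only slow envelope fluctuations (assumed band, in Hz).
    sos = butter(2, band, btype="bandpass", fs=fs, output="sos")
    slow = sosfiltfilt(sos, envelope)

    # Instantaneous phase of the slow envelope, in (-pi, pi].
    phase = np.angle(hilbert(slow))

    # A jump of about -2*pi marks the start of a new envelope cycle.
    wraps = np.flatnonzero(np.diff(phase) < -np.pi) + 1
    edges = np.concatenate(([0], wraps, [len(signal)]))
    return [(int(a), int(b)) for a, b in zip(edges[:-1], edges[1:]) if b > a]


if __name__ == "__main__":
    # Synthetic stand-in for speech: a 1 kHz tone modulated at 5 Hz.
    fs = 8000
    t = np.arange(4 * fs) / fs
    demo = (1.0 + 0.8 * np.cos(2 * np.pi * 5 * t)) * np.sin(2 * np.pi * 1000 * t)
    segments = envelope_phase_segments(demo, fs)
    lengths = [end - start for start, end in segments]
    print(len(segments), "segments; median length", int(np.median(lengths)), "samples")
```

Unlike fixed-size framing, the segment lengths here follow the rhythm of the envelope, so a slower modulation yields longer segments. A multi-timescale variant, as the description suggests, would presumably repeat the same steps over several envelope bands.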
