
Localizing category-related information in speech with multi-scale analyses

Measurements of the physical outputs of speech—vocal tract geometry and acoustic energy—are high-dimensional, but linguistic theories posit a low-dimensional set of categories such as phonemes and phrase types. How can it be determined when and where in high-dimensional articulatory and acoustic sig...


Bibliographic Details
Main Authors:	Tilsen, Sam; Kim, Seung-Eun; Wang, Claire
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2021
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8486085/ https://www.ncbi.nlm.nih.gov/pubmed/34597350 http://dx.doi.org/10.1371/journal.pone.0258178

_version_	1784577668006543360
author	Tilsen, Sam; Kim, Seung-Eun; Wang, Claire
author_facet	Tilsen, Sam; Kim, Seung-Eun; Wang, Claire
author_sort	Tilsen, Sam
collection	PubMed
description	Measurements of the physical outputs of speech—vocal tract geometry and acoustic energy—are high-dimensional, but linguistic theories posit a low-dimensional set of categories such as phonemes and phrase types. How can it be determined when and where in high-dimensional articulatory and acoustic signals there is information related to theoretical categories? For a variety of reasons, it is problematic to directly quantify mutual information between hypothesized categories and signals. To address this issue, a multi-scale analysis method is proposed for localizing category-related information in an ensemble of speech signals using machine learning algorithms. By analyzing how classification accuracy on unseen data varies as the temporal extent of training input is systematically restricted, inferences can be drawn regarding the temporal distribution of category-related information. The method can also be used to investigate redundancy between subsets of signal dimensions. Two types of theoretical categories are examined in this paper: phonemic/gestural categories and syntactic relative clause categories. Moreover, two different machine learning algorithms were examined: linear discriminant analysis and neural networks with long short-term memory units. Both algorithms detected category-related information earlier and later in signals than would be expected given standard theoretical assumptions about when linguistic categories should influence speech. The neural network algorithm was able to identify category-related information to a greater extent than the discriminant analyses.
format	Online Article Text
id	pubmed-8486085
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-8486085 2021-10-02 Localizing category-related information in speech with multi-scale analyses Tilsen, Sam; Kim, Seung-Eun; Wang, Claire PLoS One Research Article
Public Library of Science 2021-10-01 /pmc/articles/PMC8486085/ /pubmed/34597350 http://dx.doi.org/10.1371/journal.pone.0258178 Text en © 2021 Tilsen et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Localizing category-related information in speech with multi-scale analyses
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8486085/ https://www.ncbi.nlm.nih.gov/pubmed/34597350 http://dx.doi.org/10.1371/journal.pone.0258178
work_keys_str_mv	AT tilsensam localizingcategoryrelatedinformationinspeechwithmultiscaleanalyses AT kimseungeun localizingcategoryrelatedinformationinspeechwithmultiscaleanalyses AT wangclaire localizingcategoryrelatedinformationinspeechwithmultiscaleanalyses
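
The description in this record outlines a procedure: restrict the temporal extent of the training input, then observe how classification accuracy on unseen data changes. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes synthetic one-dimensional signals and swaps in a simple nearest-centroid classifier where the article uses linear discriminant analysis and LSTM networks. All names (`make_signals`, `window_accuracy`, `multiscale_scan`) and all parameter values are hypothetical.

```python
import random

def make_signals(n_per_class, n_steps, informative, rng):
    """Synthetic ensemble (an assumption): Gaussian noise at every time step,
    plus a class-dependent offset only at the `informative` time steps."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            sig = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
            for t in informative:
                sig[t] += 2.0 if label == 1 else -2.0
            data.append((sig, label))
    rng.shuffle(data)
    return data

def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def window_accuracy(train, test, start, stop):
    """Fit a nearest-centroid classifier using only time steps [start, stop),
    then return its accuracy on the held-out `test` signals."""
    cents = {}
    for label in (0, 1):
        rows = [sig[start:stop] for sig, lab in train if lab == label]
        cents[label] = centroid(rows)
    correct = 0
    for sig, lab in test:
        x = sig[start:stop]
        dists = {k: sum((a - b) ** 2 for a, b in zip(x, c))
                 for k, c in cents.items()}
        pred = min(dists, key=dists.get)
        correct += pred == lab
    return correct / len(test)

def multiscale_scan(train, test, n_steps, widths):
    """Tile the signal with non-overlapping windows at each width (scale)
    and record held-out accuracy per window."""
    results = {}
    for w in widths:
        for start in range(0, n_steps - w + 1, w):
            results[(start, start + w)] = window_accuracy(
                train, test, start, start + w)
    return results

rng = random.Random(0)
N_STEPS = 40
informative = range(10, 20)  # category information lives only here
train = make_signals(100, N_STEPS, informative, rng)
test = make_signals(100, N_STEPS, informative, rng)
results = multiscale_scan(train, test, N_STEPS, widths=(5, 10, 20))
for (a, b), acc in sorted(results.items()):
    print(f"steps [{a:2d}, {b:2d}): accuracy = {acc:.2f}")
```

In this toy setting, windows that overlap steps 10 to 19 should classify well above chance, while windows containing only noise should stay near 50%. Comparing accuracy across window positions and widths is what allows the information to be localized in time.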
