
Deriving Vocal Fold Oscillation Information from Recorded Voice Signals Using Models of Phonation


Bibliographic Details
Main Authors: Zhao, Wayne; Singh, Rita
Format: Online Article Text
Language: English
Published: MDPI, 2023
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10378572/
https://www.ncbi.nlm.nih.gov/pubmed/37509986
http://dx.doi.org/10.3390/e25071039
Description: During phonation, the vocal folds exhibit a self-sustained oscillatory motion, which is influenced by the physical properties of the speaker’s vocal folds and driven by the balance of biomechanical and aerodynamic forces across the glottis. Subtle changes in the speaker’s physical state can affect voice production and alter these oscillatory patterns. Measuring these can be valuable in developing computational tools that analyze voice to infer the speaker’s state. Traditionally, vocal fold oscillations (VFOs) are measured directly using physical devices in clinical settings. In this paper, we propose a novel analysis-by-synthesis approach that allows us to infer the VFOs directly from recorded speech signals on an individualized, speaker-by-speaker basis. The approach, called the ADLES-VFT algorithm, is proposed in the context of a joint model that combines a phonation model (with a glottal flow waveform as the output) and a vocal tract acoustic wave propagation model, such that the output of the joint model is an estimated waveform. The ADLES-VFT algorithm is a forward-backward algorithm that minimizes the error between the recorded waveform and the output of this joint model to estimate its parameters. Once estimated, these parameter values are used in conjunction with a phonation model to obtain its solutions. Since the parameters correlate with the physical properties of the speaker’s vocal folds, model solutions obtained using them represent the individualized VFOs for each speaker. The approach is flexible and can be applied to various phonation models. In addition to presenting the methodology, we show how the VFOs can be quantified from a dynamical systems perspective for classification purposes. Mathematical derivations are provided in an appendix for better readability.
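The abstract describes an analysis-by-synthesis loop: simulate a phonation model, compare its output against the recorded waveform, and adjust the model's parameters to minimize the error. A minimal illustrative sketch of that loop follows. It is not the paper's ADLES-VFT algorithm: the van der Pol-type oscillator, the `stiffness` parameter, and the grid search are all simplified stand-ins chosen for demonstration.

```python
import math

def simulate_vfo(stiffness, damping=0.5, n=2000, dt=1e-3, x0=0.1):
    """Euler integration of a van der Pol-type oscillator, used here as a
    hypothetical stand-in for a vocal fold phonation model.
    Returns the displacement trajectory x(t)."""
    x, v = x0, 0.0
    out = []
    for _ in range(n):
        a = damping * (1.0 - x * x) * v - stiffness * x  # nonlinear damping + restoring force
        x += dt * v
        v += dt * a
        out.append(x)
    return out

def fit_stiffness(recorded, candidates, dt=1e-3):
    """Analysis-by-synthesis in miniature: among candidate parameter values,
    pick the one whose simulated waveform minimizes the squared error
    against the recorded waveform."""
    def sq_err(k):
        sim = simulate_vfo(k, n=len(recorded), dt=dt)
        return sum((s - r) ** 2 for s, r in zip(sim, recorded))
    return min(candidates, key=sq_err)

# Synthetic "recording" generated with a known stiffness, then recovered.
true_k = 2.0
recorded = simulate_vfo(true_k)
best_k = fit_stiffness(recorded, [1.0, 1.5, 2.0, 2.5, 3.0])
print(best_k)  # → 2.0
```

The paper's actual method replaces this crude grid search with a forward-backward (adjoint-based) optimization and couples the oscillator to a vocal tract wave propagation model, but the estimate-simulate-compare structure is the same.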
Journal: Entropy (Basel)
Published by MDPI, 10 July 2023. © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).