
Speech extraction from vibration signals based on deep learning

Extracting speech information from vibration response signals is a typical system identification problem, and the traditional method is too sensitive to deviations such as model parameters, noise, boundary conditions, and position. A method was proposed to obtain speech signals by collecting vibrati...

Bibliographic Details
Main Authors:	Wang, Li; Zheng, Weiguang; Li, Shande; Huang, Qibai
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2023
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10599503/ https://www.ncbi.nlm.nih.gov/pubmed/37878667 http://dx.doi.org/10.1371/journal.pone.0288847

author	Wang, Li; Zheng, Weiguang; Li, Shande; Huang, Qibai
collection	PubMed
description	Extracting speech information from vibration response signals is a typical system identification problem, and traditional methods are overly sensitive to deviations in model parameters, noise, boundary conditions, and position. This work proposes a method that obtains speech signals by collecting the vibration signals of vibroacoustic systems and using them to train deep learning models. A vibroacoustic coupling finite element model was first established with the voice signal as the excitation source. The vibration acceleration signals at the vibration response point were used as the training set, and their spectral characteristics were extracted. Two types of networks were trained: fully connected and convolutional. The fully connected network prediction model converged faster and produced higher-quality extracted speech. During testing, the amplitude spectra of the output speech signals (the network output) were combined with the phase of the vibration signals to convert the extracted speech back to the time domain. The simulation results showed that the position of the vibration response point had little effect on the quality of speech recognition, and that good speech extraction quality could be obtained. Noise in the speech signals had a greater influence on speech extraction quality than noise in the vibration signals, and extracted speech quality was poor when both were very noisy. The method was robust to deviations in the position of the vibration response between training and testing. The smaller the structural flexibility, the better the speech extraction quality. In a trained system, speech extraction quality decreased as the node mass increased in the test set, but the differences were negligible. Changes in boundary conditions did not significantly affect extracted speech quality. The proposed speech extraction model is robust to position deviations, mass deviations, and changes in boundary conditions.
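The description states that the network's output amplitude spectra are paired with the phase of the vibration signal to return to the time domain. The following is a minimal sketch of that reconstruction step only, assuming a short-time Fourier transform with a Hann window. The record does not specify the transform or its parameters, so `stft`, `istft`, `reconstruct_speech`, `frame_len`, and `hop` are hypothetical names and values chosen for illustration, not the authors' implementation.

```python
import numpy as np


def stft(x, frame_len=256, hop=128):
    """Short-time Fourier transform using a periodic Hann window (assumed)."""
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, frame_len=256, hop=128):
    """Inverse STFT by weighted overlap-add with the same window."""
    window = np.hanning(frame_len + 1)[:-1]
    frames = np.fft.irfft(spec, n=frame_len, axis=1)
    n = hop * (len(frames) - 1) + frame_len
    out = np.zeros(n)
    norm = np.zeros(n)
    for i, frame in enumerate(frames):
        start = i * hop
        out[start:start + frame_len] += frame * window
        norm[start:start + frame_len] += window ** 2
    # Guard against division by zero where the window weight vanishes.
    return out / np.maximum(norm, 1e-12)


def reconstruct_speech(pred_magnitude, vibration, frame_len=256, hop=128):
    """Combine predicted amplitude spectra with the vibration signal's phase.

    pred_magnitude: array of shape (n_frames, frame_len // 2 + 1), standing in
    for the network output; vibration: 1-D vibration acceleration signal.
    """
    phase = np.angle(stft(vibration, frame_len, hop))
    return istft(pred_magnitude * np.exp(1j * phase), frame_len, hop)
```

As a sanity check, passing the vibration signal's own amplitude spectra as `pred_magnitude` should return the vibration signal itself (apart from the first sample, where the window weight is zero). In actual use, `pred_magnitude` would be replaced by the amplitude spectra predicted by the trained network.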
format	Online Article Text
id	pubmed-10599503
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-10599503 2023-10-26 Speech extraction from vibration signals based on deep learning. Wang, Li; Zheng, Weiguang; Li, Shande; Huang, Qibai. PLoS One. Research Article. Public Library of Science 2023-10-25 /pmc/articles/PMC10599503/ /pubmed/37878667 http://dx.doi.org/10.1371/journal.pone.0288847 Text en © 2023 Wang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Speech extraction from vibration signals based on deep learning
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10599503/ https://www.ncbi.nlm.nih.gov/pubmed/37878667 http://dx.doi.org/10.1371/journal.pone.0288847
