Nonlinear waveform distortion: Assessment and detection of clipping on speech data and systems(✩)

Speech, speaker, and language systems have traditionally relied on carefully collected speech material for training acoustic models. There is an enormous amount of freely accessible audio content. A major challenge, however, is that such data is not professionally recorded, and therefore may contain...

Bibliographic Details
Main authors:	Hansen, John H.L.; Stauffer, Allen; Xia, Wei
Format:	Online Article Text
Language:	English
Published:	2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9246086/ https://www.ncbi.nlm.nih.gov/pubmed/35784517 http://dx.doi.org/10.1016/j.specom.2021.07.007

author	Hansen, John H.L.; Stauffer, Allen; Xia, Wei
author_sort	Hansen, John H.L.
collection	PubMed
description	Speech, speaker, and language systems have traditionally relied on carefully collected speech material for training acoustic models. There is an enormous amount of freely accessible audio content. A major challenge, however, is that such data is not professionally recorded, and therefore may contain a wide diversity of background noise, nonlinear distortions, or other unknown environmental or technology-based contamination or mismatch. There is a crucial need for automatic analysis to screen such unknown datasets before acoustic model development training, or to perform input audio purity screening prior to classification. In this study, we propose a waveform-based clipping detection algorithm for naturalistic audio streams and examine the impact of clipping at different severities on speech quality measurements and automatic speaker recognition systems. We use the TIMIT and NIST SRE08 corpora as case studies. The results show, as expected, that clipping introduces a nonlinear distortion into clean speech data, which reduces speech quality and performance for speaker recognition. We also investigate what degree of clipping can be present to sustain effective speech system performance. The proposed detection system, which will be released, could contribute to massive new audio collections for speech and language technology development (e.g., Google Audioset (Gemmeke et al., 2017), CRSS-UTDallas Apollo Fearless-Steps (Yu et al., 2014) (19,000 h naturalistic audio from NASA Apollo missions)).
format	Online Article Text
id	pubmed-9246086
institution	National Center for Biotechnology Information
language	English
publishDate	2021
record_format	MEDLINE/PubMed
spelling	pubmed-9246086 2022-06-30 Nonlinear waveform distortion: Assessment and detection of clipping on speech data and systems(✩) Hansen, John H.L. Stauffer, Allen Xia, Wei Speech Commun Article 2021-11 2021-08-12 /pmc/articles/PMC9246086/ /pubmed/35784517 http://dx.doi.org/10.1016/j.specom.2021.07.007 Text en This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle	Article Hansen, John H.L. Stauffer, Allen Xia, Wei Nonlinear waveform distortion: Assessment and detection of clipping on speech data and systems(✩)
title	Nonlinear waveform distortion: Assessment and detection of clipping on speech data and systems(✩)
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9246086/ https://www.ncbi.nlm.nih.gov/pubmed/35784517 http://dx.doi.org/10.1016/j.specom.2021.07.007

Similar Items