
The effects of face mask on speech production and its implication for forensic speaker identification-A cross-linguistic study

This study aims to understand the effects of face masks on speech production in Mandarin Chinese and English, and on the automatic classification of mask/no-mask speech and of individual speakers. A cross-linguistic study of mask speech in Mandarin Chinese and English was conducted. Conti...


Bibliographic Details
Main authors:	Geng, Puyang, Lu, Qimeng, Guo, Hong, Zeng, Jinhua
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2023
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10062611/ https://www.ncbi.nlm.nih.gov/pubmed/36996037 http://dx.doi.org/10.1371/journal.pone.0283724

author	Geng, Puyang; Lu, Qimeng; Guo, Hong; Zeng, Jinhua
collection	PubMed
description	This study aims to understand the effects of face masks on speech production in Mandarin Chinese and English, and on the automatic classification of mask/no-mask speech and of individual speakers. A cross-linguistic study of mask speech in Mandarin Chinese and English was conducted. Continuous speech of phonetically balanced texts in both Chinese and English versions was recorded from thirty native speakers of Mandarin Chinese (15 males and 15 females) with and without a surgical mask. The acoustic analyses showed that, for Mandarin Chinese, mask speech exhibited higher F0, intensity, and HNR, and lower jitter and shimmer, than no-mask speech, whereas higher HNR and lower jitter and shimmer were observed for English mask speech. The classification analyses, based on four supervised learning algorithms (Linear Discriminant Analysis, Naïve Bayes Classifier, Random Forest, and Support Vector Machine), showed undesirable performance (i.e., lower than 50%) in classifying speech with and without a face mask, and highly variable accuracies (ranging from 40% to 89.2%) in identifying individual speakers. These findings imply that speakers tend to make acoustic adjustments to improve their speech intelligibility when wearing a surgical mask. However, a cross-linguistic difference in the strategies used to compensate for intelligibility was observed: Mandarin speech was produced with higher F0, intensity, and HNR, while English speech was produced with higher HNR. In addition, the highly variable accuracies of speaker identification might suggest that a surgical mask would affect the overall accuracy of automatic speaker recognition. In general, therefore, wearing a surgical mask appears to affect both acoustic-phonetic and automatic speaker recognition approaches to some extent, which suggests particular caution in real-case forensic speaker identification.
format	Online Article Text
id	pubmed-10062611
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-10062611 2023-03-31 PLoS One Research Article Public Library of Science 2023-03-30 /pmc/articles/PMC10062611/ /pubmed/36996037 http://dx.doi.org/10.1371/journal.pone.0283724 Text en © 2023 Geng et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	The effects of face mask on speech production and its implication for forensic speaker identification-A cross-linguistic study
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10062611/ https://www.ncbi.nlm.nih.gov/pubmed/36996037 http://dx.doi.org/10.1371/journal.pone.0283724

