The effects of face mask on speech production and its implication for forensic speaker identification-A cross-linguistic study

This study aims to understand the effects of face mask on speech production between Mandarin Chinese and English, and on the automatic classification of mask/no mask speech and individual speakers. A cross-linguistic study on mask speech between Mandarin Chinese and English was then conducted. Conti...

Bibliographic Details
Main authors: Geng, Puyang; Lu, Qimeng; Guo, Hong; Zeng, Jinhua
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10062611/
https://www.ncbi.nlm.nih.gov/pubmed/36996037
http://dx.doi.org/10.1371/journal.pone.0283724
_version_ 1785017532512468992
author Geng, Puyang
Lu, Qimeng
Guo, Hong
Zeng, Jinhua
author_facet Geng, Puyang
Lu, Qimeng
Guo, Hong
Zeng, Jinhua
author_sort Geng, Puyang
collection PubMed
description This study aims to understand the effects of a face mask on speech production in Mandarin Chinese and English, and on the automatic classification of mask/no-mask speech and of individual speakers. A cross-linguistic study of mask speech in Mandarin Chinese and English was conducted. Continuous speech from phonetically balanced texts in Chinese and English versions was recorded from thirty native speakers of Mandarin Chinese (15 males and 15 females) with and without a surgical mask. Acoustic analyses showed that, for Mandarin Chinese, mask speech exhibited higher F0, intensity, and HNR, and lower jitter and shimmer, than no-mask speech, whereas higher HNR and lower jitter and shimmer were observed for English mask speech. Classification analyses based on four supervised learning algorithms (Linear Discriminant Analysis, Naïve Bayes Classifier, Random Forest, and Support Vector Machine) yielded poor performance (accuracy below 50%) in classifying speech with and without a face mask, and highly variable accuracies (ranging from 40% to 89.2%) in identifying individual speakers. These findings imply that speakers tend to make acoustic adjustments to improve their speech intelligibility when wearing a surgical mask. However, the compensation strategies differed across languages: Mandarin mask speech was produced with higher F0, intensity, and HNR, while English mask speech was produced with higher HNR. In addition, the highly variable speaker-identification accuracies suggest that a surgical mask may affect the overall accuracy of automatic speaker recognition.
In general, therefore, wearing a surgical mask appears to affect both acoustic-phonetic and automatic speaker recognition approaches to some extent, which calls for particular caution in real-case forensic speaker identification.
format Online
Article
Text
id pubmed-10062611
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10062611 2023-03-31 The effects of face mask on speech production and its implication for forensic speaker identification-A cross-linguistic study Geng, Puyang Lu, Qimeng Guo, Hong Zeng, Jinhua PLoS One Research Article Public Library of Science 2023-03-30 /pmc/articles/PMC10062611/ /pubmed/36996037 http://dx.doi.org/10.1371/journal.pone.0283724 Text en © 2023 Geng et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Geng, Puyang
Lu, Qimeng
Guo, Hong
Zeng, Jinhua
The effects of face mask on speech production and its implication for forensic speaker identification-A cross-linguistic study
title The effects of face mask on speech production and its implication for forensic speaker identification-A cross-linguistic study
title_sort effects of face mask on speech production and its implication for forensic speaker identification-a cross-linguistic study
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10062611/
https://www.ncbi.nlm.nih.gov/pubmed/36996037
http://dx.doi.org/10.1371/journal.pone.0283724