Face mask recognition from audio: The MASC database and an overview on the mask challenge
The sudden outbreak of COVID-19 has resulted in tough challenges for the field of biometrics due to its spread via physical contact, and the regulations of wearing face masks. Given these constraints, voice biometrics can offer a suitable contact-less biometric solution; they can benefit from models...
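Main authors: | Mohamed, Mostafa M.; Nessiem, Mina A.; Batliner, Anton; Bergler, Christian; Hantke, Simone; Schmitt, Maximilian; Baird, Alice; Mallol-Ragolta, Adria; Karas, Vincent; Amiriparian, Shahin; Schuller, Björn W. |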
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Ltd. 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8489285/ https://www.ncbi.nlm.nih.gov/pubmed/34629550 http://dx.doi.org/10.1016/j.patcog.2021.108361 |
_version_ | 1784578321385783296 |
---|---|
author | Mohamed, Mostafa M. Nessiem, Mina A. Batliner, Anton Bergler, Christian Hantke, Simone Schmitt, Maximilian Baird, Alice Mallol-Ragolta, Adria Karas, Vincent Amiriparian, Shahin Schuller, Björn W. |
author_facet | Mohamed, Mostafa M. Nessiem, Mina A. Batliner, Anton Bergler, Christian Hantke, Simone Schmitt, Maximilian Baird, Alice Mallol-Ragolta, Adria Karas, Vincent Amiriparian, Shahin Schuller, Björn W. |
author_sort | Mohamed, Mostafa M. |
collection | PubMed |
description | The sudden outbreak of COVID-19 has resulted in tough challenges for the field of biometrics due to its spread via physical contact, and the regulations of wearing face masks. Given these constraints, voice biometrics can offer a suitable contact-less biometric solution; they can benefit from models that classify whether a speaker is wearing a mask or not. This article reviews the Mask Sub-Challenge (MSC) of the INTERSPEECH 2020 COMputational PARalinguistics challengE (ComParE), which focused on the following classification task: Given an audio chunk of a speaker, classify whether the speaker is wearing a mask or not. First, we report the collection of the Mask Augsburg Speech Corpus (MASC) and the baseline approaches used to solve the problem, achieving a performance of [Formula: see text] Unweighted Average Recall (UAR). We then summarise the methodologies explored in the submitted and accepted papers that mainly used two common patterns: (i) phonetic-based audio features, or (ii) spectrogram representations of audio combined with Convolutional Neural Networks (CNNs) typically used in image processing. Most approaches enhance their models by adapting ensembles of different models and attempting to increase the size of the training data using various techniques. We review and discuss the results of the participants of this sub-challenge, where the winner scored a UAR of [Formula: see text]. Moreover, we present the results of fusing the approaches, leading to a UAR of [Formula: see text]. Finally, we present a smartphone app that can be used as a proof of concept demonstration to detect in real-time whether users are wearing a face mask; we also benchmark the run-time of the best models. |
format | Online Article Text |
id | pubmed-8489285 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier Ltd. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8489285 2021-10-04 Face mask recognition from audio: The MASC database and an overview on the mask challenge Mohamed, Mostafa M. Nessiem, Mina A. Batliner, Anton Bergler, Christian Hantke, Simone Schmitt, Maximilian Baird, Alice Mallol-Ragolta, Adria Karas, Vincent Amiriparian, Shahin Schuller, Björn W. Pattern Recognit Article The sudden outbreak of COVID-19 has resulted in tough challenges for the field of biometrics due to its spread via physical contact, and the regulations of wearing face masks. Given these constraints, voice biometrics can offer a suitable contact-less biometric solution; they can benefit from models that classify whether a speaker is wearing a mask or not. This article reviews the Mask Sub-Challenge (MSC) of the INTERSPEECH 2020 COMputational PARalinguistics challengE (ComParE), which focused on the following classification task: Given an audio chunk of a speaker, classify whether the speaker is wearing a mask or not. First, we report the collection of the Mask Augsburg Speech Corpus (MASC) and the baseline approaches used to solve the problem, achieving a performance of [Formula: see text] Unweighted Average Recall (UAR). We then summarise the methodologies explored in the submitted and accepted papers that mainly used two common patterns: (i) phonetic-based audio features, or (ii) spectrogram representations of audio combined with Convolutional Neural Networks (CNNs) typically used in image processing. Most approaches enhance their models by adapting ensembles of different models and attempting to increase the size of the training data using various techniques. We review and discuss the results of the participants of this sub-challenge, where the winner scored a UAR of [Formula: see text]. Moreover, we present the results of fusing the approaches, leading to a UAR of [Formula: see text]. Finally, we present a smartphone app that can be used as a proof of concept demonstration to detect in real-time whether users are wearing a face mask; we also benchmark the run-time of the best models. Elsevier Ltd. 2022-02 2021-10-04 /pmc/articles/PMC8489285/ /pubmed/34629550 http://dx.doi.org/10.1016/j.patcog.2021.108361 Text en © 2021 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Mohamed, Mostafa M. Nessiem, Mina A. Batliner, Anton Bergler, Christian Hantke, Simone Schmitt, Maximilian Baird, Alice Mallol-Ragolta, Adria Karas, Vincent Amiriparian, Shahin Schuller, Björn W. Face mask recognition from audio: The MASC database and an overview on the mask challenge |
title | Face mask recognition from audio: The MASC database and an overview on the mask challenge |
title_full | Face mask recognition from audio: The MASC database and an overview on the mask challenge |
title_fullStr | Face mask recognition from audio: The MASC database and an overview on the mask challenge |
title_full_unstemmed | Face mask recognition from audio: The MASC database and an overview on the mask challenge |
title_short | Face mask recognition from audio: The MASC database and an overview on the mask challenge |
title_sort | face mask recognition from audio: the masc database and an overview on the mask challenge |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8489285/ https://www.ncbi.nlm.nih.gov/pubmed/34629550 http://dx.doi.org/10.1016/j.patcog.2021.108361 |
work_keys_str_mv | AT mohamedmostafam facemaskrecognitionfromaudiothemascdatabaseandanoverviewonthemaskchallenge AT nessiemminaa facemaskrecognitionfromaudiothemascdatabaseandanoverviewonthemaskchallenge AT batlineranton facemaskrecognitionfromaudiothemascdatabaseandanoverviewonthemaskchallenge AT berglerchristian facemaskrecognitionfromaudiothemascdatabaseandanoverviewonthemaskchallenge AT hantkesimone facemaskrecognitionfromaudiothemascdatabaseandanoverviewonthemaskchallenge AT schmittmaximilian facemaskrecognitionfromaudiothemascdatabaseandanoverviewonthemaskchallenge AT bairdalice facemaskrecognitionfromaudiothemascdatabaseandanoverviewonthemaskchallenge AT mallolragoltaadria facemaskrecognitionfromaudiothemascdatabaseandanoverviewonthemaskchallenge AT karasvincent facemaskrecognitionfromaudiothemascdatabaseandanoverviewonthemaskchallenge AT amiriparianshahin facemaskrecognitionfromaudiothemascdatabaseandanoverviewonthemaskchallenge AT schullerbjornw facemaskrecognitionfromaudiothemascdatabaseandanoverviewonthemaskchallenge |
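
Note on the metric named in the description: Unweighted Average Recall (UAR) is the mean of the per-class recalls, so each class counts equally however many samples it has. The sketch below is only an illustration of that definition for the two-class mask task; the function name, the label strings "mask" and "clear", and the toy predictions are assumptions made for this example and are not taken from the article or its data.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels: four "mask" chunks and two "clear" chunks.
y_true = ["mask", "mask", "mask", "mask", "clear", "clear"]
y_pred = ["mask", "mask", "mask", "clear", "clear", "mask"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the recall is 3/4 for "mask" and 1/2 for "clear", so the UAR is 0.625 while plain accuracy is 4/6; the gap shows why a class-balanced metric is informative when the classes differ in size.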