
Face mask recognition from audio: The MASC database and an overview on the mask challenge

The sudden outbreak of COVID-19 has resulted in tough challenges for the field of biometrics due to its spread via physical contact, and the regulations of wearing face masks. Given these constraints, voice biometrics can offer a suitable contact-less biometric solution; they can benefit from models...

Bibliographic Details
Main authors: Mohamed, Mostafa M., Nessiem, Mina A., Batliner, Anton, Bergler, Christian, Hantke, Simone, Schmitt, Maximilian, Baird, Alice, Mallol-Ragolta, Adria, Karas, Vincent, Amiriparian, Shahin, Schuller, Björn W.
Format: Online Article Text
Language: English
Published: Elsevier Ltd. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8489285/
https://www.ncbi.nlm.nih.gov/pubmed/34629550
http://dx.doi.org/10.1016/j.patcog.2021.108361
author Mohamed, Mostafa M.
Nessiem, Mina A.
Batliner, Anton
Bergler, Christian
Hantke, Simone
Schmitt, Maximilian
Baird, Alice
Mallol-Ragolta, Adria
Karas, Vincent
Amiriparian, Shahin
Schuller, Björn W.
author_facet Mohamed, Mostafa M.
Nessiem, Mina A.
Batliner, Anton
Bergler, Christian
Hantke, Simone
Schmitt, Maximilian
Baird, Alice
Mallol-Ragolta, Adria
Karas, Vincent
Amiriparian, Shahin
Schuller, Björn W.
author_sort Mohamed, Mostafa M.
collection PubMed
description The sudden outbreak of COVID-19 has resulted in tough challenges for the field of biometrics due to its spread via physical contact, and the regulations of wearing face masks. Given these constraints, voice biometrics can offer a suitable contact-less biometric solution; they can benefit from models that classify whether a speaker is wearing a mask or not. This article reviews the Mask Sub-Challenge (MSC) of the INTERSPEECH 2020 COMputational PARalinguistics challengE (ComParE), which focused on the following classification task: Given an audio chunk of a speaker, classify whether the speaker is wearing a mask or not. First, we report the collection of the Mask Augsburg Speech Corpus (MASC) and the baseline approaches used to solve the problem, achieving a performance of [Formula: see text] Unweighted Average Recall (UAR). We then summarise the methodologies explored in the submitted and accepted papers that mainly used two common patterns: (i) phonetic-based audio features, or (ii) spectrogram representations of audio combined with Convolutional Neural Networks (CNNs) typically used in image processing. Most approaches enhance their models by adapting ensembles of different models and attempting to increase the size of the training data using various techniques. We review and discuss the results of the participants of this sub-challenge, where the winner scored a UAR of [Formula: see text]. Moreover, we present the results of fusing the approaches, leading to a UAR of [Formula: see text]. Finally, we present a smartphone app that can be used as a proof of concept demonstration to detect in real-time whether users are wearing a face mask; we also benchmark the run-time of the best models.
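The abstract reports every result as Unweighted Average Recall (UAR), which is the mean of the per-class recalls, so the "mask" and "no mask" classes count equally even if one has more audio chunks. The following is a minimal sketch of that metric only; the function name and the example labels are hypothetical and are not taken from MASC or from the challenge baselines.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    total = defaultdict(int)    # number of reference samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical labels for four audio chunks (illustration only).
y_true = ["mask", "mask", "mask", "clear"]
y_pred = ["mask", "mask", "clear", "clear"]

# Recall is 2/3 for "mask" and 1/1 for "clear", so UAR is their mean.
uar = unweighted_average_recall(y_true, y_pred)
```

In this toy case plain accuracy is 3/4 while UAR is 5/6, which shows how the two measures diverge when the classes are imbalanced.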
format Online
Article
Text
id pubmed-8489285
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-84892852021-10-04 Face mask recognition from audio: The MASC database and an overview on the mask challenge Mohamed, Mostafa M. Nessiem, Mina A. Batliner, Anton Bergler, Christian Hantke, Simone Schmitt, Maximilian Baird, Alice Mallol-Ragolta, Adria Karas, Vincent Amiriparian, Shahin Schuller, Björn W. Pattern Recognit Article The sudden outbreak of COVID-19 has resulted in tough challenges for the field of biometrics due to its spread via physical contact, and the regulations of wearing face masks. Given these constraints, voice biometrics can offer a suitable contact-less biometric solution; they can benefit from models that classify whether a speaker is wearing a mask or not. This article reviews the Mask Sub-Challenge (MSC) of the INTERSPEECH 2020 COMputational PARalinguistics challengE (ComParE), which focused on the following classification task: Given an audio chunk of a speaker, classify whether the speaker is wearing a mask or not. First, we report the collection of the Mask Augsburg Speech Corpus (MASC) and the baseline approaches used to solve the problem, achieving a performance of [Formula: see text] Unweighted Average Recall (UAR). We then summarise the methodologies explored in the submitted and accepted papers that mainly used two common patterns: (i) phonetic-based audio features, or (ii) spectrogram representations of audio combined with Convolutional Neural Networks (CNNs) typically used in image processing. Most approaches enhance their models by adapting ensembles of different models and attempting to increase the size of the training data using various techniques. We review and discuss the results of the participants of this sub-challenge, where the winner scored a UAR of [Formula: see text]. Moreover, we present the results of fusing the approaches, leading to a UAR of [Formula: see text]. 
Finally, we present a smartphone app that can be used as a proof of concept demonstration to detect in real-time whether users are wearing a face mask; we also benchmark the run-time of the best models. Elsevier Ltd. 2022-02 2021-10-04 /pmc/articles/PMC8489285/ /pubmed/34629550 http://dx.doi.org/10.1016/j.patcog.2021.108361 Text en © 2021 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
spellingShingle Article
Mohamed, Mostafa M.
Nessiem, Mina A.
Batliner, Anton
Bergler, Christian
Hantke, Simone
Schmitt, Maximilian
Baird, Alice
Mallol-Ragolta, Adria
Karas, Vincent
Amiriparian, Shahin
Schuller, Björn W.
Face mask recognition from audio: The MASC database and an overview on the mask challenge
title Face mask recognition from audio: The MASC database and an overview on the mask challenge
title_full Face mask recognition from audio: The MASC database and an overview on the mask challenge
title_fullStr Face mask recognition from audio: The MASC database and an overview on the mask challenge
title_full_unstemmed Face mask recognition from audio: The MASC database and an overview on the mask challenge
title_short Face mask recognition from audio: The MASC database and an overview on the mask challenge
title_sort face mask recognition from audio: the masc database and an overview on the mask challenge
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8489285/
https://www.ncbi.nlm.nih.gov/pubmed/34629550
http://dx.doi.org/10.1016/j.patcog.2021.108361