
Sounds of COVID-19: exploring realistic performance of audio-based digital testing

To identify Coronavirus disease (COVID-19) cases efficiently, affordably, and at scale, recent work has shown how audio (including cough, breathing and voice) based approaches can be used for testing. However, there is a lack of exploration of how biases and methodological decisions impact these too...


Bibliographic Details
Main authors: Han, Jing, Xia, Tong, Spathis, Dimitris, Bondareva, Erika, Brown, Chloë, Chauhan, Jagmohan, Dang, Ting, Grammenos, Andreas, Hasthanasombat, Apinan, Floto, Andres, Cicuta, Pietro, Mascolo, Cecilia
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8799654/
https://www.ncbi.nlm.nih.gov/pubmed/35091662
http://dx.doi.org/10.1038/s41746-021-00553-x
_version_ 1784642109851041792
author Han, Jing
Xia, Tong
Spathis, Dimitris
Bondareva, Erika
Brown, Chloë
Chauhan, Jagmohan
Dang, Ting
Grammenos, Andreas
Hasthanasombat, Apinan
Floto, Andres
Cicuta, Pietro
Mascolo, Cecilia
author_sort Han, Jing
collection PubMed
description To identify Coronavirus disease (COVID-19) cases efficiently, affordably, and at scale, recent work has shown how audio (including cough, breathing and voice) based approaches can be used for testing. However, there is a lack of exploration of how biases and methodological decisions impact these tools’ performance in practice. In this paper, we explore the realistic performance of audio-based digital testing of COVID-19. To investigate this, we collected a large crowdsourced respiratory audio dataset through a mobile app, alongside symptoms and COVID-19 test results. Within the collected dataset, we selected 5240 samples from 2478 English-speaking participants and split them into participant-independent sets for model development and validation. In addition to controlling the language, we also balanced demographics for model training to avoid potential acoustic bias. We used these audio samples to construct an audio-based COVID-19 prediction model. The unbiased model took features extracted from breathing, coughs and voice signals as predictors and yielded an AUC-ROC of 0.71 (95% CI: 0.65–0.77). We further explored several scenarios with different types of unbalanced data distributions to demonstrate how biases and participant splits affect the performance. With these different, but less appropriate, evaluation strategies, the performance could be overestimated, reaching an AUC up to 0.90 (95% CI: 0.85–0.95) in some circumstances. We found that an unrealistic experimental setting can result in misleading, sometimes over-optimistic, performance. Instead, we reported complete and reliable results on crowd-sourced data, which would allow medical professionals and policy makers to accurately assess the value of this technology and facilitate its deployment.
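The abstract's central point, that samples from one participant must not appear in both the development and validation sets, and that AUC-ROC should be reported with a confidence interval, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`participant_split`, `auc_roc`, `bootstrap_ci`), the 20% hold-out fraction, and the percentile bootstrap used for the interval are all assumptions made for illustration, and the record does not state how the paper computed its 95% CI.

```python
import random
from collections import defaultdict


def participant_split(participant_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no participant lands in both sets.

    participant_ids[i] is the (hypothetical) participant label of sample i.
    """
    by_participant = defaultdict(list)
    for idx, pid in enumerate(participant_ids):
        by_participant[pid].append(idx)
    pids = sorted(by_participant)
    random.Random(seed).shuffle(pids)
    n_test = max(1, round(len(pids) * test_fraction))
    test_idx = [i for p in pids[:n_test] for i in by_participant[p]]
    train_idx = [i for p in pids[n_test:] for i in by_participant[p]]
    return sorted(train_idx), sorted(test_idx)


def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed CI method)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        aucs.append(auc_roc(ys, [scores[i] for i in sample]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Splitting on sample indices instead of participants would let a model recognise a returning speaker's voice in the validation set, which is one of the ways the abstract says performance can be overestimated.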
format Online
Article
Text
id pubmed-8799654
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-87996542022-02-07 Sounds of COVID-19: exploring realistic performance of audio-based digital testing Han, Jing Xia, Tong Spathis, Dimitris Bondareva, Erika Brown, Chloë Chauhan, Jagmohan Dang, Ting Grammenos, Andreas Hasthanasombat, Apinan Floto, Andres Cicuta, Pietro Mascolo, Cecilia NPJ Digit Med Article To identify Coronavirus disease (COVID-19) cases efficiently, affordably, and at scale, recent work has shown how audio (including cough, breathing and voice) based approaches can be used for testing. However, there is a lack of exploration of how biases and methodological decisions impact these tools’ performance in practice. In this paper, we explore the realistic performance of audio-based digital testing of COVID-19. To investigate this, we collected a large crowdsourced respiratory audio dataset through a mobile app, alongside symptoms and COVID-19 test results. Within the collected dataset, we selected 5240 samples from 2478 English-speaking participants and split them into participant-independent sets for model development and validation. In addition to controlling the language, we also balanced demographics for model training to avoid potential acoustic bias. We used these audio samples to construct an audio-based COVID-19 prediction model. The unbiased model took features extracted from breathing, coughs and voice signals as predictors and yielded an AUC-ROC of 0.71 (95% CI: 0.65–0.77). We further explored several scenarios with different types of unbalanced data distributions to demonstrate how biases and participant splits affect the performance. With these different, but less appropriate, evaluation strategies, the performance could be overestimated, reaching an AUC up to 0.90 (95% CI: 0.85–0.95) in some circumstances. We found that an unrealistic experimental setting can result in misleading, sometimes over-optimistic, performance. 
Instead, we reported complete and reliable results on crowd-sourced data, which would allow medical professionals and policy makers to accurately assess the value of this technology and facilitate its deployment. Nature Publishing Group UK 2022-01-28 /pmc/articles/PMC8799654/ /pubmed/35091662 http://dx.doi.org/10.1038/s41746-021-00553-x Text en © Crown 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Han, Jing
Xia, Tong
Spathis, Dimitris
Bondareva, Erika
Brown, Chloë
Chauhan, Jagmohan
Dang, Ting
Grammenos, Andreas
Hasthanasombat, Apinan
Floto, Andres
Cicuta, Pietro
Mascolo, Cecilia
Sounds of COVID-19: exploring realistic performance of audio-based digital testing
title Sounds of COVID-19: exploring realistic performance of audio-based digital testing
title_sort sounds of covid-19: exploring realistic performance of audio-based digital testing
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8799654/
https://www.ncbi.nlm.nih.gov/pubmed/35091662
http://dx.doi.org/10.1038/s41746-021-00553-x