
Challenges of deep learning methods for COVID-19 detection using public datasets

Since the start of the COVID-19 pandemic, several research studies have proposed Deep Learning (DL)-based automated COVID-19 detection, reporting high cross-validation accuracy when distinguishing COVID-19 patients from normal cases or other common pneumonia. Although the reported outcomes are very high in most cases, th...

Bibliographic Details
Main authors:	Hasan, Md. Kamrul; Alam, Md. Ashraful; Dahal, Lavsen; Roy, Shidhartho; Wahid, Sifat Redwan; Elahi, Md. Toufick E.; Martí, Robert; Khanal, Bishesh
Format:	Online Article Text
Language:	English
Published:	The Authors. Published by Elsevier Ltd. 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9005223/ https://www.ncbi.nlm.nih.gov/pubmed/35434261 http://dx.doi.org/10.1016/j.imu.2022.100945

author	Hasan, Md. Kamrul; Alam, Md. Ashraful; Dahal, Lavsen; Roy, Shidhartho; Wahid, Sifat Redwan; Elahi, Md. Toufick E.; Martí, Robert; Khanal, Bishesh
collection	PubMed
description	Since the start of the COVID-19 pandemic, several research studies have proposed Deep Learning (DL)-based automated COVID-19 detection, reporting high cross-validation accuracy when distinguishing COVID-19 patients from normal cases or other common pneumonia. Although the reported accuracies are very high in most cases, these results were obtained without an independent test set from a separate data source. When independent test sets are not used, DL models are likely to overfit the training data distribution and are prone to learning dataset-specific artifacts rather than the actual disease characteristics and underlying pathology. This study assesses the promise of such DL methods and datasets by examining the composition of the available public image datasets and designing different experimental setups to investigate the key challenges. A convolutional neural network, called CVR-Net (COVID-19 Recognition Network), is proposed for conducting comprehensive experiments to validate our hypothesis. The presented end-to-end CVR-Net is a multi-scale, multi-encoder ensemble model that aggregates the outputs from two different encoders and their different scales to produce the final prediction probability. Three classification tasks (2-, 3-, and 4-class) are designed in which the train–test datasets come from single, multiple, and independent sources. The binary classification accuracy is 99.8% for a single train–test data source, falling to 98.4% and 88.7% when multiple and independent train–test data sources are used, respectively. Similar outcomes are observed in the multi-class tasks for single, multiple, and independent data sources, highlighting the challenges of developing DL models with the existing public datasets without an independent test set from a separate dataset. These results point to the need for better-designed datasets for developing DL tools applicable in actual clinical settings. Such a dataset should have an independent test set; for a single machine or hospital source, have a more balanced set of images for all the prediction classes; and have balanced data from several hospitals and demographics. Our source code and model are publicly available to the research community for further improvements.
format	Online Article Text
id	pubmed-9005223
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	The Authors. Published by Elsevier Ltd.
record_format	MEDLINE/PubMed
spelling	pubmed-9005223 2022-04-13 Inform Med Unlocked Article The Authors. Published by Elsevier Ltd. 2022 2022-04-12 /pmc/articles/PMC9005223/ /pubmed/35434261 http://dx.doi.org/10.1016/j.imu.2022.100945 Text en © 2022 The Authors Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title	Challenges of deep learning methods for COVID-19 detection using public datasets
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9005223/ https://www.ncbi.nlm.nih.gov/pubmed/35434261 http://dx.doi.org/10.1016/j.imu.2022.100945
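
The abstract's central point, that a model should be tested on images from a data source it never saw in training, can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the source names (`hospital_A`, etc.), and the helper functions `split_by_source` and `class_balance` are all hypothetical, chosen only to show a source-held-out split and a per-class balance check.

```python
from collections import Counter


def split_by_source(records, test_source):
    """Hold out every record from `test_source` as the independent test set."""
    train = [r for r in records if r["source"] != test_source]
    test = [r for r in records if r["source"] == test_source]
    return train, test


def class_balance(records):
    """Return the fraction of records carrying each label."""
    counts = Counter(r["label"] for r in records)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical toy records; ids, labels, and sources are made up for illustration.
records = [
    {"id": "a1", "label": "covid", "source": "hospital_A"},
    {"id": "a2", "label": "normal", "source": "hospital_A"},
    {"id": "b1", "label": "covid", "source": "hospital_B"},
    {"id": "b2", "label": "covid", "source": "hospital_B"},
    {"id": "c1", "label": "normal", "source": "hospital_C"},
    {"id": "c2", "label": "normal", "source": "hospital_C"},
]

train, test = split_by_source(records, test_source="hospital_C")

# No source may appear on both sides of the split.
train_sources = {r["source"] for r in train}
test_sources = {r["source"] for r in test}
sources_disjoint = train_sources.isdisjoint(test_sources)

# Inspect class balance so that a skewed split is visible before training.
train_balance = class_balance(train)
test_balance = class_balance(test)
print(sources_disjoint, train_balance, test_balance)
```

In this toy data the held-out source contains only one class, which is the kind of imbalance the abstract warns about; checking the balance per source before training makes such problems visible early.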
