
Bias Analysis on Public X-Ray Image Datasets of Pneumonia and COVID-19 Patients

Chest X-ray images are useful for early COVID-19 diagnosis with the advantage that X-ray devices are already available in health centers and images are obtained immediately. Some datasets containing X-ray images with cases (pneumonia or COVID-19) and controls have been made available to develop mach...


Bibliographic Details
Format: Online Article Text
Language: English
Published: IEEE 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8545228/
https://www.ncbi.nlm.nih.gov/pubmed/34812384
http://dx.doi.org/10.1109/ACCESS.2021.3065456
collection PubMed
description Chest X-ray images are useful for early COVID-19 diagnosis with the advantage that X-ray devices are already available in health centers and images are obtained immediately. Some datasets containing X-ray images with cases (pneumonia or COVID-19) and controls have been made available to develop machine-learning-based methods to aid in diagnosing the disease. However, these datasets are mainly composed of different sources coming from pre-COVID-19 datasets and COVID-19 datasets. Particularly, we have detected a significant bias in some of the released datasets used to train and test diagnostic systems, which might imply that the results published are optimistic and may overestimate the actual predictive capacity of the techniques proposed. In this article, we analyze the existing bias in some commonly used datasets and propose a series of preliminary steps to carry out before the classic machine learning pipeline in order to detect possible biases, to avoid them if possible and to report results that are more representative of the actual predictive power of the methods under analysis.
format Online
Article
Text
id pubmed-8545228
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher IEEE
record_format MEDLINE/PubMed
spelling pubmed-8545228 2021-11-18 Bias Analysis on Public X-Ray Image Datasets of Pneumonia and COVID-19 Patients IEEE Access Computational and Artificial Intelligence Chest X-ray images are useful for early COVID-19 diagnosis with the advantage that X-ray devices are already available in health centers and images are obtained immediately. Some datasets containing X-ray images with cases (pneumonia or COVID-19) and controls have been made available to develop machine-learning-based methods to aid in diagnosing the disease. However, these datasets are mainly composed of different sources coming from pre-COVID-19 datasets and COVID-19 datasets. Particularly, we have detected a significant bias in some of the released datasets used to train and test diagnostic systems, which might imply that the results published are optimistic and may overestimate the actual predictive capacity of the techniques proposed. In this article, we analyze the existing bias in some commonly used datasets and propose a series of preliminary steps to carry out before the classic machine learning pipeline in order to detect possible biases, to avoid them if possible and to report results that are more representative of the actual predictive power of the methods under analysis. IEEE 2021-03-10 /pmc/articles/PMC8545228/ /pubmed/34812384 http://dx.doi.org/10.1109/ACCESS.2021.3065456 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
spellingShingle Computational and Artificial Intelligence
Bias Analysis on Public X-Ray Image Datasets of Pneumonia and COVID-19 Patients
title Bias Analysis on Public X-Ray Image Datasets of Pneumonia and COVID-19 Patients
topic Computational and Artificial Intelligence
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8545228/
https://www.ncbi.nlm.nih.gov/pubmed/34812384
http://dx.doi.org/10.1109/ACCESS.2021.3065456
work_keys_str_mv AT biasanalysisonpublicxrayimagedatasetsofpneumoniaandcovid19patients