
Deep Learning–Based COVID-19 Pneumonia Classification Using Chest CT Images: Model Generalizability

Since the outbreak of the COVID-19 pandemic, worldwide research efforts have focused on using artificial intelligence (AI) technologies on various medical data of COVID-19–positive patients in order to identify or classify various aspects of the disease, with promising reported results. However, con...


Bibliographic Details
Main Authors:	Nguyen, Dan; Kay, Fernando; Tan, Jun; Yan, Yulong; Ng, Yee Seng; Iyengar, Puneeth; Peshock, Ron; Jiang, Steve
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2021
Subjects:	Artificial Intelligence
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275994/ https://www.ncbi.nlm.nih.gov/pubmed/34268489 http://dx.doi.org/10.3389/frai.2021.694875

_version_	1783721824862863360
author	Nguyen, Dan; Kay, Fernando; Tan, Jun; Yan, Yulong; Ng, Yee Seng; Iyengar, Puneeth; Peshock, Ron; Jiang, Steve
author_sort	Nguyen, Dan
collection	PubMed
description	Since the outbreak of the COVID-19 pandemic, worldwide research efforts have focused on applying artificial intelligence (AI) technologies to various medical data from COVID-19–positive patients in order to identify or classify various aspects of the disease, with promising reported results. However, concerns have been raised over the generalizability of these models, given the heterogeneous factors in training datasets. This study aims to examine the severity of this problem by evaluating deep learning (DL) classification models trained to identify COVID-19–positive patients on 3D computed tomography (CT) datasets from different countries. We collected one dataset at UT Southwestern (UTSW) and three external datasets from different countries: CC-CCII Dataset (China), COVID-CTset (Iran), and MosMedData (Russia). We divided the data into two classes: COVID-19–positive and COVID-19–negative patients. We trained nine identical DL-based classification models by using combinations of datasets with a 72% train, 8% validation, and 20% test data split. The models trained on a single dataset achieved accuracy/area under the receiver operating characteristic curve (AUC) values of 0.87/0.826 (UTSW), 0.97/0.988 (CC-CCII), and 0.86/0.873 (COVID-CTset) when evaluated on their own dataset. The models trained on multiple datasets and evaluated on a test set from one of the datasets used for training performed better. However, performance dropped to an AUC close to 0.5 (random guess) for all models when they were evaluated on a dataset outside their training datasets. Including MosMedData, which contained only positive labels, in the training datasets did not necessarily improve performance on the other datasets. Multiple factors likely contributed to these results, such as patient demographics and differences in image acquisition or reconstruction, causing a data shift among different study cohorts.
format	Online Article Text
id	pubmed-8275994
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-8275994 2021-07-14 Front Artif Intell Artificial Intelligence Frontiers Media S.A. 2021-06-29 /pmc/articles/PMC8275994/ /pubmed/34268489 http://dx.doi.org/10.3389/frai.2021.694875 Text en Copyright © 2021 Nguyen, Kay, Tan, Yan, Ng, Iyengar, Peshock and Jiang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Deep Learning–Based COVID-19 Pneumonia Classification Using Chest CT Images: Model Generalizability
topic	Artificial Intelligence
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275994/ https://www.ncbi.nlm.nih.gov/pubmed/34268489 http://dx.doi.org/10.3389/frai.2021.694875
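The description above names a 72%/8%/20% train/validation/test split and reports AUC values, and notes that MosMedData contains only positive labels. The following is a minimal sketch of those two ideas, not the authors' implementation: the record does not say how the split was performed (for example per patient or stratified by class), so a plain seeded shuffle is assumed here, and the function names `split_72_8_20` and `roc_auc` are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def split_72_8_20(items: Sequence, seed: int = 0) -> Dict[str, List]:
    """Shuffle items and split them 72% / 8% / 20% (train / validation / test).

    Assumption: a simple random shuffle; the source does not describe the
    actual splitting procedure.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_test = round(0.20 * n)
    n_val = round(0.08 * n)
    n_train = n - n_val - n_test  # remainder goes to training
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. A value near 0.5 corresponds to
    random guessing.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        # A test set with a single class (e.g. only positive labels)
        # has no positive/negative pairs, so AUC cannot be computed.
        raise ValueError("AUC is undefined when only one class is present")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    parts = split_72_8_20(range(100))
    print({name: len(part) for name, part in parts.items()})
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise AUC is quadratic in the number of samples, which is fine for illustration; a library routine would normally be used on real data. The `ValueError` branch shows why a positive-only dataset cannot serve as a standalone AUC test set.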
