Performance of a deep-learning algorithm for referable thoracic abnormalities on chest radiographs: A multicenter study of a health screening cohort

PURPOSE: This study evaluated the performance of a commercially available deep-learning algorithm (DLA) (Insight CXR, Lunit, Seoul, South Korea) for referable thoracic abnormalities on chest X-ray (CXR) using a consecutively collected multicenter health screening cohort. METHODS AND MATERIALS: A con...

Bibliographic Details
Main Authors: Kim, Eun Young, Kim, Young Jae, Choi, Won-Jun, Lee, Gi Pyo, Choi, Ye Ra, Jin, Kwang Nam, Cho, Young Jun
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7894861/
https://www.ncbi.nlm.nih.gov/pubmed/33606779
http://dx.doi.org/10.1371/journal.pone.0246472
_version_ 1783653313893367808
author Kim, Eun Young
Kim, Young Jae
Choi, Won-Jun
Lee, Gi Pyo
Choi, Ye Ra
Jin, Kwang Nam
Cho, Young Jun
author_facet Kim, Eun Young
Kim, Young Jae
Choi, Won-Jun
Lee, Gi Pyo
Choi, Ye Ra
Jin, Kwang Nam
Cho, Young Jun
author_sort Kim, Eun Young
collection PubMed
description PURPOSE: This study evaluated the performance of a commercially available deep-learning algorithm (DLA) (Insight CXR, Lunit, Seoul, South Korea) for referable thoracic abnormalities on chest X-ray (CXR) using a consecutively collected multicenter health screening cohort. METHODS AND MATERIALS: A consecutive health screening cohort of participants who underwent both CXR and chest computed tomography (CT) within 1 month was retrospectively collected from three institutions’ health care clinics (n = 5,887). Referable thoracic abnormalities were defined as any radiologic findings requiring further diagnostic evaluation or management, including DLA-target lesions of nodule/mass, consolidation, or pneumothorax. We evaluated the diagnostic performance of the DLA for referable thoracic abnormalities using the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, and specificity using ground truth based on chest CT (CT-GT). In addition, for CT-GT-positive cases, three independent radiologist readings were performed on CXR and clear visible (when more than two radiologists called) and visible (at least one radiologist called) abnormalities were defined as CXR-GTs (clear visible CXR-GT and visible CXR-GT, respectively) to evaluate the performance of the DLA. RESULTS: Among 5,887 subjects (4,329 males; mean age 54±11 years), referable thoracic abnormalities were found in 618 (10.5%) based on CT-GT. DLA-target lesions were observed in 223 (4.0%), nodule/mass in 202 (3.4%), consolidation in 31 (0.5%), pneumothorax in 1 (<0.1%), and DLA-non-target lesions in 409 (6.9%). For referable thoracic abnormalities based on CT-GT, the DLA showed an AUC of 0.771 (95% confidence interval [CI], 0.751–0.791), a sensitivity of 69.6%, and a specificity of 74.0%. Based on CXR-GT, the prevalence of referable thoracic abnormalities decreased, with visible and clear visible abnormalities found in 405 (6.9%) and 227 (3.9%) cases, respectively. The performance of the DLA increased significantly when using CXR-GTs, with an AUC of 0.839 (95% CI, 0.829–0.848), a sensitivity of 82.7%, and a specificity of 73.2% based on visible CXR-GT and an AUC of 0.872 (95% CI, 0.863–0.880, P < 0.001 for the AUC comparison of CT-GT vs. clear visible CXR-GT), a sensitivity of 83.3%, and a specificity of 78.8% based on clear visible CXR-GT. CONCLUSION: The DLA provided fair-to-good stand-alone performance for the detection of referable thoracic abnormalities in a multicenter consecutive health screening cohort. The DLA showed varied performance according to the different methods of ground truth.
format Online
Article
Text
id pubmed-7894861
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7894861 2021-03-01 Performance of a deep-learning algorithm for referable thoracic abnormalities on chest radiographs: A multicenter study of a health screening cohort Kim, Eun Young Kim, Young Jae Choi, Won-Jun Lee, Gi Pyo Choi, Ye Ra Jin, Kwang Nam Cho, Young Jun PLoS One Research Article PURPOSE: This study evaluated the performance of a commercially available deep-learning algorithm (DLA) (Insight CXR, Lunit, Seoul, South Korea) for referable thoracic abnormalities on chest X-ray (CXR) using a consecutively collected multicenter health screening cohort. METHODS AND MATERIALS: A consecutive health screening cohort of participants who underwent both CXR and chest computed tomography (CT) within 1 month was retrospectively collected from three institutions’ health care clinics (n = 5,887). Referable thoracic abnormalities were defined as any radiologic findings requiring further diagnostic evaluation or management, including DLA-target lesions of nodule/mass, consolidation, or pneumothorax. We evaluated the diagnostic performance of the DLA for referable thoracic abnormalities using the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, and specificity using ground truth based on chest CT (CT-GT). In addition, for CT-GT-positive cases, three independent radiologist readings were performed on CXR and clear visible (when more than two radiologists called) and visible (at least one radiologist called) abnormalities were defined as CXR-GTs (clear visible CXR-GT and visible CXR-GT, respectively) to evaluate the performance of the DLA. RESULTS: Among 5,887 subjects (4,329 males; mean age 54±11 years), referable thoracic abnormalities were found in 618 (10.5%) based on CT-GT. DLA-target lesions were observed in 223 (4.0%), nodule/mass in 202 (3.4%), consolidation in 31 (0.5%), pneumothorax in 1 (<0.1%), and DLA-non-target lesions in 409 (6.9%). For referable thoracic abnormalities based on CT-GT, the DLA showed an AUC of 0.771 (95% confidence interval [CI], 0.751–0.791), a sensitivity of 69.6%, and a specificity of 74.0%. Based on CXR-GT, the prevalence of referable thoracic abnormalities decreased, with visible and clear visible abnormalities found in 405 (6.9%) and 227 (3.9%) cases, respectively. The performance of the DLA increased significantly when using CXR-GTs, with an AUC of 0.839 (95% CI, 0.829–0.848), a sensitivity of 82.7%, and a specificity of 73.2% based on visible CXR-GT and an AUC of 0.872 (95% CI, 0.863–0.880, P < 0.001 for the AUC comparison of CT-GT vs. clear visible CXR-GT), a sensitivity of 83.3%, and a specificity of 78.8% based on clear visible CXR-GT. CONCLUSION: The DLA provided fair-to-good stand-alone performance for the detection of referable thoracic abnormalities in a multicenter consecutive health screening cohort. The DLA showed varied performance according to the different methods of ground truth. Public Library of Science 2021-02-19 /pmc/articles/PMC7894861/ /pubmed/33606779 http://dx.doi.org/10.1371/journal.pone.0246472 Text en © 2021 Kim et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Kim, Eun Young
Kim, Young Jae
Choi, Won-Jun
Lee, Gi Pyo
Choi, Ye Ra
Jin, Kwang Nam
Cho, Young Jun
Performance of a deep-learning algorithm for referable thoracic abnormalities on chest radiographs: A multicenter study of a health screening cohort
title Performance of a deep-learning algorithm for referable thoracic abnormalities on chest radiographs: A multicenter study of a health screening cohort
title_full Performance of a deep-learning algorithm for referable thoracic abnormalities on chest radiographs: A multicenter study of a health screening cohort
title_fullStr Performance of a deep-learning algorithm for referable thoracic abnormalities on chest radiographs: A multicenter study of a health screening cohort
title_full_unstemmed Performance of a deep-learning algorithm for referable thoracic abnormalities on chest radiographs: A multicenter study of a health screening cohort
title_short Performance of a deep-learning algorithm for referable thoracic abnormalities on chest radiographs: A multicenter study of a health screening cohort
title_sort performance of a deep-learning algorithm for referable thoracic abnormalities on chest radiographs: a multicenter study of a health screening cohort
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7894861/
https://www.ncbi.nlm.nih.gov/pubmed/33606779
http://dx.doi.org/10.1371/journal.pone.0246472
work_keys_str_mv AT kimeunyoung performanceofadeeplearningalgorithmforreferablethoracicabnormalitiesonchestradiographsamulticenterstudyofahealthscreeningcohort
AT kimyoungjae performanceofadeeplearningalgorithmforreferablethoracicabnormalitiesonchestradiographsamulticenterstudyofahealthscreeningcohort
AT choiwonjun performanceofadeeplearningalgorithmforreferablethoracicabnormalitiesonchestradiographsamulticenterstudyofahealthscreeningcohort
AT leegipyo performanceofadeeplearningalgorithmforreferablethoracicabnormalitiesonchestradiographsamulticenterstudyofahealthscreeningcohort
AT choiyera performanceofadeeplearningalgorithmforreferablethoracicabnormalitiesonchestradiographsamulticenterstudyofahealthscreeningcohort
AT jinkwangnam performanceofadeeplearningalgorithmforreferablethoracicabnormalitiesonchestradiographsamulticenterstudyofahealthscreeningcohort
AT choyoungjun performanceofadeeplearningalgorithmforreferablethoracicabnormalitiesonchestradiographsamulticenterstudyofahealthscreeningcohort
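
Note on the metrics in the description field: the abstract reports prevalence, sensitivity, and specificity under three ground truths (CT-GT, visible CXR-GT, clear visible CXR-GT). The sketch below is not the authors' analysis code; it only recomputes the three reported prevalence figures from the counts given in the abstract and shows the standard definitions of sensitivity and specificity. The names `prevalence_percent` and `ConfusionCounts` are hypothetical, and the confusion counts in the last example are made-up toy numbers, since the abstract does not report per-cell counts.

```python
from dataclasses import dataclass

COHORT_SIZE = 5887  # number of subjects reported in the abstract

# Positive cases reported in the abstract under each ground truth.
POSITIVES = {
    "CT-GT": 618,
    "visible CXR-GT": 405,
    "clear visible CXR-GT": 227,
}


def prevalence_percent(positives: int, total: int = COHORT_SIZE) -> float:
    """Prevalence as a percentage, rounded to one decimal place."""
    return round(100 * positives / total, 1)


@dataclass
class ConfusionCounts:
    """Cells of a 2x2 table comparing algorithm output with a ground truth."""
    tp: int  # abnormal by ground truth, flagged by the algorithm
    fn: int  # abnormal by ground truth, missed by the algorithm
    tn: int  # normal by ground truth, not flagged
    fp: int  # normal by ground truth, flagged

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


if __name__ == "__main__":
    for name, count in POSITIVES.items():
        print(f"{name}: {prevalence_percent(count)}% of {COHORT_SIZE}")

    # Toy counts for illustration only; these are NOT study data.
    toy = ConfusionCounts(tp=8, fn=2, tn=15, fp=5)
    print(f"toy sensitivity={toy.sensitivity():.2f}, "
          f"toy specificity={toy.specificity():.2f}")
```

Because the positive set shrinks from 618 to 405 to 227 as the ground truth moves from CT to radiograph-visible findings, the same algorithm output is scored against progressively easier targets, which is consistent with the abstract's remark that performance varied with the choice of ground truth.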