
Performance of a Chest Radiograph AI Diagnostic Tool for COVID-19: A Prospective Observational Study


Bibliographic Details
Main authors: Sun, Ju; Peng, Le; Li, Taihui; Adila, Dyah; Zaiman, Zach; Melton-Meaux, Genevieve B.; Ingraham, Nicholas E.; Murray, Eric; Boley, Daniel; Switzer, Sean; Burns, John L.; Huang, Kun; Allen, Tadashi; Steenburg, Scott D.; Gichoya, Judy Wawira; Kummerfeld, Erich; Tignanelli, Christopher J.
Format: Online Article Text
Language: English
Published: Radiological Society of North America, 2022
Online access:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344211/
https://www.ncbi.nlm.nih.gov/pubmed/35923381
http://dx.doi.org/10.1148/ryai.210217
Collection: PubMed
Description:
PURPOSE: To conduct a prospective observational study across 12 U.S. hospitals to evaluate real-time performance of an interpretable artificial intelligence (AI) model to detect COVID-19 on chest radiographs.
MATERIALS AND METHODS: A total of 95 363 chest radiographs were included in model training, external validation, and real-time validation. The model was deployed as a clinical decision support system, and performance was prospectively evaluated. There were 5335 total real-time predictions and a COVID-19 prevalence of 4.8% (258 of 5335). Model performance was assessed with use of receiver operating characteristic analysis, precision-recall curves, and F1 score. Logistic regression was used to evaluate the association of race and sex with AI model diagnostic accuracy. To compare model accuracy with the performance of board-certified radiologists, a third dataset of 1638 images was read independently by two radiologists.
RESULTS: Participants positive for COVID-19 had higher COVID-19 diagnostic scores than participants negative for COVID-19 (median, 0.1 [IQR, 0.0–0.8] vs 0.0 [IQR, 0.0–0.1], respectively; P < .001). Real-time model performance was unchanged over 19 weeks of implementation (area under the receiver operating characteristic curve, 0.70; 95% CI: 0.66, 0.73). Model sensitivity was higher in men than in women (P = .01), whereas model specificity was higher in women (P = .001). Sensitivity was higher for Asian (P = .002) and Black (P = .046) participants compared with White participants. The COVID-19 AI diagnostic system had worse accuracy (63.5% correct) compared with radiologist predictions (radiologist 1: 67.8% correct; radiologist 2: 68.6% correct; McNemar P < .001 for both).
CONCLUSION: AI-based tools have not yet reached full diagnostic potential for COVID-19 and underperform compared with radiologist prediction.
Keywords: Diagnosis, Classification, Application Domain, Infection, Lung
Supplemental material is available for this article.
© RSNA, 2022
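The abstract reports model performance via receiver operating characteristic analysis (AUC) and F1 score. The sketch below is not the study's code, merely an illustrative pure-Python implementation of those two metrics for a binary classifier; all function and variable names are invented for this example.

```python
def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) formulation, with tie handling.

    labels: 0/1 ground truth; scores: continuous model outputs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count the fraction of positive/negative pairs ranked correctly;
    # ties contribute half a win.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(labels, preds):
    """F1 = harmonic mean of precision and recall, from 0/1 predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In practice such metrics would be computed with an established library rather than by hand; the quadratic pair loop above is fine for illustration but a rank-based formulation scales better to the thousands of predictions described in the study.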
Record ID: pubmed-9344211 (MEDLINE/PubMed record; National Center for Biotechnology Information)
Journal: Radiol Artif Intell (Original Research); published online 2022-06-01.
License: © 2022 by the Radiological Society of North America, Inc. This article is made available via the PMC Open Access Subset for unrestricted re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the COVID-19 pandemic or until permissions are revoked in writing. Upon expiration of these permissions, PMC is granted a perpetual license to make this article available via PMC and Europe PMC, consistent with existing copyright protections.
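The study compares model and radiologist accuracy with McNemar tests on paired predictions. As a hedged illustration only (not the paper's analysis code, and the exact test variant the authors used is not stated here), an exact two-sided McNemar test can be computed from the discordant-pair counts alone:

```python
from math import comb


def mcnemar_exact(b, c):
    """Exact two-sided McNemar test for paired binary classifiers.

    b: cases rater A got right and rater B got wrong;
    c: cases rater B got right and rater A got wrong.
    Under H0 the discordant pairs split 50/50, so the test is a
    two-sided binomial test on min(b, c) out of b + c trials.
    """
    n = b + c
    k = min(b, c)
    # Double the lower binomial tail, capping at 1 (b == c double-counts
    # the midpoint).
    p = 2.0 * sum(comb(n, i) for i in range(k + 1)) / 2.0**n
    return min(p, 1.0)
```

With many discordant pairs the exact tail sum grows expensive and a chi-square approximation is conventional, but the exact form keeps the example dependency-free.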