AI-Based CXR First Reading: Current Limitations to Ensure Practical Value

We performed a multicenter external evaluation of the practical and clinical efficacy of a commercial AI algorithm for chest X-ray (CXR) analysis (Lunit INSIGHT CXR). A retrospective evaluation was performed with a multi-reader study. For a prospective evaluation, the AI model was run on CXR studies...

Bibliographic Details
Main authors: Vasilev, Yuriy, Vladzymyrskyy, Anton, Omelyanskaya, Olga, Blokhin, Ivan, Kirpichev, Yury, Arzamasov, Kirill
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10138086/
https://www.ncbi.nlm.nih.gov/pubmed/37189531
http://dx.doi.org/10.3390/diagnostics13081430
_version_ 1785032623275376640
author Vasilev, Yuriy
Vladzymyrskyy, Anton
Omelyanskaya, Olga
Blokhin, Ivan
Kirpichev, Yury
Arzamasov, Kirill
author_facet Vasilev, Yuriy
Vladzymyrskyy, Anton
Omelyanskaya, Olga
Blokhin, Ivan
Kirpichev, Yury
Arzamasov, Kirill
author_sort Vasilev, Yuriy
collection PubMed
description We performed a multicenter external evaluation of the practical and clinical efficacy of a commercial AI algorithm for chest X-ray (CXR) analysis (Lunit INSIGHT CXR). A retrospective evaluation was performed with a multi-reader study. For a prospective evaluation, the AI model was run on CXR studies; the results were compared to the reports of 226 radiologists. In the multi-reader study, the area under the curve (AUC), sensitivity, and specificity of the AI were 0.94 (CI95%: 0.87–1.0), 0.9 (CI95%: 0.79–1.0), and 0.89 (CI95%: 0.79–0.98); the AUC, sensitivity, and specificity of the radiologists were 0.97 (CI95%: 0.94–1.0), 0.9 (CI95%: 0.79–1.0), and 0.95 (CI95%: 0.89–1.0). In most regions of the ROC curve, the AI performed a little worse or at the same level as an average human reader. The McNemar test showed no statistically significant differences between AI and radiologists. In the prospective study with 4752 cases, the AUC, sensitivity, and specificity of the AI were 0.84 (CI95%: 0.82–0.86), 0.77 (CI95%: 0.73–0.80), and 0.81 (CI95%: 0.80–0.82). Lower accuracy values obtained during the prospective validation were mainly associated with false-positive findings considered by experts to be clinically insignificant and the false-negative omission of human-reported “opacity”, “nodule”, and calcification. In a large-scale prospective validation of the commercial AI algorithm in clinical practice, lower sensitivity and specificity values were obtained compared to the prior retrospective evaluation of the data of the same population.
format Online
Article
Text
id pubmed-10138086
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10138086 2023-04-28 AI-Based CXR First Reading: Current Limitations to Ensure Practical Value Vasilev, Yuriy Vladzymyrskyy, Anton Omelyanskaya, Olga Blokhin, Ivan Kirpichev, Yury Arzamasov, Kirill Diagnostics (Basel) Article MDPI 2023-04-16 /pmc/articles/PMC10138086/ /pubmed/37189531 http://dx.doi.org/10.3390/diagnostics13081430 Text en © 2023 by the authors.
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Vasilev, Yuriy
Vladzymyrskyy, Anton
Omelyanskaya, Olga
Blokhin, Ivan
Kirpichev, Yury
Arzamasov, Kirill
AI-Based CXR First Reading: Current Limitations to Ensure Practical Value
title AI-Based CXR First Reading: Current Limitations to Ensure Practical Value
title_full AI-Based CXR First Reading: Current Limitations to Ensure Practical Value
title_fullStr AI-Based CXR First Reading: Current Limitations to Ensure Practical Value
title_full_unstemmed AI-Based CXR First Reading: Current Limitations to Ensure Practical Value
title_short AI-Based CXR First Reading: Current Limitations to Ensure Practical Value
title_sort ai-based cxr first reading: current limitations to ensure practical value
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10138086/
https://www.ncbi.nlm.nih.gov/pubmed/37189531
http://dx.doi.org/10.3390/diagnostics13081430
work_keys_str_mv AT vasilevyuriy aibasedcxrfirstreadingcurrentlimitationstoensurepracticalvalue
AT vladzymyrskyyanton aibasedcxrfirstreadingcurrentlimitationstoensurepracticalvalue
AT omelyanskayaolga aibasedcxrfirstreadingcurrentlimitationstoensurepracticalvalue
AT blokhinivan aibasedcxrfirstreadingcurrentlimitationstoensurepracticalvalue
AT kirpichevyury aibasedcxrfirstreadingcurrentlimitationstoensurepracticalvalue
AT arzamasovkirill aibasedcxrfirstreadingcurrentlimitationstoensurepracticalvalue