Evaluating the performance of artificial intelligence software for lung nodule detection on chest radiographs in a retrospective real-world UK population

OBJECTIVES: Early identification of lung cancer on chest radiographs improves patient outcomes. Artificial intelligence (AI) tools may increase diagnostic accuracy and streamline this pathway. This study evaluated the performance of commercially available AI-based software trained to identify cancer...

Bibliographic Details
Main authors: Maiter, Ahmed, Hocking, Katherine, Matthews, Suzanne, Taylor, Jonathan, Sharkey, Michael, Metherall, Peter, Alabed, Samer, Dwivedi, Krit, Shahin, Yousef, Anderson, Elizabeth, Holt, Sarah, Rowbotham, Charlotte, Kamil, Mohamed A, Hoggard, Nigel, Balasubramanian, Saba P, Swift, Andrew, Johns, Christopher S
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10632826/
https://www.ncbi.nlm.nih.gov/pubmed/37940155
http://dx.doi.org/10.1136/bmjopen-2023-077348
_version_ 1785132658834014208
author Maiter, Ahmed
Hocking, Katherine
Matthews, Suzanne
Taylor, Jonathan
Sharkey, Michael
Metherall, Peter
Alabed, Samer
Dwivedi, Krit
Shahin, Yousef
Anderson, Elizabeth
Holt, Sarah
Rowbotham, Charlotte
Kamil, Mohamed A
Hoggard, Nigel
Balasubramanian, Saba P
Swift, Andrew
Johns, Christopher S
author_facet Maiter, Ahmed
Hocking, Katherine
Matthews, Suzanne
Taylor, Jonathan
Sharkey, Michael
Metherall, Peter
Alabed, Samer
Dwivedi, Krit
Shahin, Yousef
Anderson, Elizabeth
Holt, Sarah
Rowbotham, Charlotte
Kamil, Mohamed A
Hoggard, Nigel
Balasubramanian, Saba P
Swift, Andrew
Johns, Christopher S
author_sort Maiter, Ahmed
collection PubMed
description OBJECTIVES: Early identification of lung cancer on chest radiographs improves patient outcomes. Artificial intelligence (AI) tools may increase diagnostic accuracy and streamline this pathway. This study evaluated the performance of commercially available AI-based software trained to identify cancerous lung nodules on chest radiographs. DESIGN: This retrospective study included primary care chest radiographs acquired in a UK centre. The software evaluated each radiograph independently and outputs were compared with two reference standards: (1) the radiologist report and (2) the diagnosis of cancer by multidisciplinary team decision. Failure analysis was performed by interrogating the software marker locations on radiographs. PARTICIPANTS: 5722 consecutive chest radiographs were included from 5592 patients (median age 59 years, 53.8% women, 1.6% prevalence of cancer). RESULTS: Compared with radiologist reports for nodule detection, the software demonstrated sensitivity 54.5% (95% CI 44.2% to 64.4%), specificity 83.2% (82.2% to 84.1%), positive predictive value (PPV) 5.5% (4.6% to 6.6%) and negative predictive value (NPV) 99.0% (98.8% to 99.2%). Compared with cancer diagnosis, the software demonstrated sensitivity 60.9% (50.1% to 70.9%), specificity 83.3% (82.3% to 84.2%), PPV 5.6% (4.8% to 6.6%) and NPV 99.2% (99.0% to 99.4%). Normal or variant anatomy was misidentified as an abnormality in 69.9% of the 943 false positive cases. CONCLUSIONS: The software demonstrated considerable underperformance in this real-world patient cohort. Failure analysis suggested a lack of generalisability in the training and testing datasets as a potential factor. The low PPV carries the risk of over-investigation and limits the translation of the software to clinical practice. Our findings highlight the importance of training and testing software in representative datasets, with broader implications for the implementation of AI tools in imaging.
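The low PPV reported above follows largely from the low prevalence of cancer in the cohort. As a rough illustration (not the authors' method, who presumably worked from the raw case counts), the sketch below applies Bayes' theorem to the rounded sensitivity, specificity and prevalence quoted in the abstract for the cancer-diagnosis reference standard. The function name `predictive_values` is an assumption made for this example.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence.

    All arguments are proportions between 0 and 1.
    """
    # Expected proportion of the population in each confusion-matrix cell.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded figures from the abstract (cancer-diagnosis reference standard).
ppv, npv = predictive_values(sensitivity=0.609, specificity=0.833, prevalence=0.016)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}")  # PPV 5.6%, NPV 99.2%
```

With a prevalence of 1.6%, false positives from the 98.4% of radiographs without cancer far outnumber true positives, which is why a specificity of about 83% still yields a PPV near 5.6%.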
format Online
Article
Text
id pubmed-10632826
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-106328262023-11-10 Evaluating the performance of artificial intelligence software for lung nodule detection on chest radiographs in a retrospective real-world UK population Maiter, Ahmed Hocking, Katherine Matthews, Suzanne Taylor, Jonathan Sharkey, Michael Metherall, Peter Alabed, Samer Dwivedi, Krit Shahin, Yousef Anderson, Elizabeth Holt, Sarah Rowbotham, Charlotte Kamil, Mohamed A Hoggard, Nigel Balasubramanian, Saba P Swift, Andrew Johns, Christopher S BMJ Open Radiology and Imaging OBJECTIVES: Early identification of lung cancer on chest radiographs improves patient outcomes. Artificial intelligence (AI) tools may increase diagnostic accuracy and streamline this pathway. This study evaluated the performance of commercially available AI-based software trained to identify cancerous lung nodules on chest radiographs. DESIGN: This retrospective study included primary care chest radiographs acquired in a UK centre. The software evaluated each radiograph independently and outputs were compared with two reference standards: (1) the radiologist report and (2) the diagnosis of cancer by multidisciplinary team decision. Failure analysis was performed by interrogating the software marker locations on radiographs. PARTICIPANTS: 5722 consecutive chest radiographs were included from 5592 patients (median age 59 years, 53.8% women, 1.6% prevalence of cancer). RESULTS: Compared with radiologist reports for nodule detection, the software demonstrated sensitivity 54.5% (95% CI 44.2% to 64.4%), specificity 83.2% (82.2% to 84.1%), positive predictive value (PPV) 5.5% (4.6% to 6.6%) and negative predictive value (NPV) 99.0% (98.8% to 99.2%). Compared with cancer diagnosis, the software demonstrated sensitivity 60.9% (50.1% to 70.9%), specificity 83.3% (82.3% to 84.2%), PPV 5.6% (4.8% to 6.6%) and NPV 99.2% (99.0% to 99.4%). Normal or variant anatomy was misidentified as an abnormality in 69.9% of the 943 false positive cases. 
CONCLUSIONS: The software demonstrated considerable underperformance in this real-world patient cohort. Failure analysis suggested a lack of generalisability in the training and testing datasets as a potential factor. The low PPV carries the risk of over-investigation and limits the translation of the software to clinical practice. Our findings highlight the importance of training and testing software in representative datasets, with broader implications for the implementation of AI tools in imaging. BMJ Publishing Group 2023-11-08 /pmc/articles/PMC10632826/ /pubmed/37940155 http://dx.doi.org/10.1136/bmjopen-2023-077348 Text en © Author(s) (or their employer(s)) 2023. Re-use permitted under CC BY. Published by BMJ. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.
spellingShingle Radiology and Imaging
Maiter, Ahmed
Hocking, Katherine
Matthews, Suzanne
Taylor, Jonathan
Sharkey, Michael
Metherall, Peter
Alabed, Samer
Dwivedi, Krit
Shahin, Yousef
Anderson, Elizabeth
Holt, Sarah
Rowbotham, Charlotte
Kamil, Mohamed A
Hoggard, Nigel
Balasubramanian, Saba P
Swift, Andrew
Johns, Christopher S
Evaluating the performance of artificial intelligence software for lung nodule detection on chest radiographs in a retrospective real-world UK population
title Evaluating the performance of artificial intelligence software for lung nodule detection on chest radiographs in a retrospective real-world UK population
topic Radiology and Imaging
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10632826/
https://www.ncbi.nlm.nih.gov/pubmed/37940155
http://dx.doi.org/10.1136/bmjopen-2023-077348