
Use of artificial intelligence for image analysis in breast cancer screening programmes: systematic review of test accuracy

OBJECTIVE: To examine the accuracy of artificial intelligence (AI) for the detection of breast cancer in mammography screening practice. DESIGN: Systematic review of test accuracy studies. DATA SOURCES: Medline, Embase, Web of Science, and Cochrane Database of Systematic Reviews from 1 January 2010...


Bibliographic Details
Main authors: Freeman, Karoline, Geppert, Julia, Stinton, Chris, Todkill, Daniel, Johnson, Samantha, Clarke, Aileen, Taylor-Phillips, Sian
Format: Online Article Text
Language: English
Published: BMJ Publishing Group Ltd. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8409323/
https://www.ncbi.nlm.nih.gov/pubmed/34470740
http://dx.doi.org/10.1136/bmj.n1872
author Freeman, Karoline
Geppert, Julia
Stinton, Chris
Todkill, Daniel
Johnson, Samantha
Clarke, Aileen
Taylor-Phillips, Sian
collection PubMed
description OBJECTIVE: To examine the accuracy of artificial intelligence (AI) for the detection of breast cancer in mammography screening practice.
DESIGN: Systematic review of test accuracy studies.
DATA SOURCES: Medline, Embase, Web of Science, and Cochrane Database of Systematic Reviews from 1 January 2010 to 17 May 2021.
ELIGIBILITY CRITERIA: Studies reporting test accuracy of AI algorithms, alone or in combination with radiologists, to detect cancer in women’s digital mammograms in screening practice, or in test sets. Reference standard was biopsy with histology or follow-up (for screen negative women). Outcomes included test accuracy and cancer type detected.
STUDY SELECTION AND SYNTHESIS: Two reviewers independently assessed articles for inclusion and assessed the methodological quality of included studies using the QUality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. A single reviewer extracted data, which were checked by a second reviewer. Narrative data synthesis was performed.
RESULTS: Twelve studies totalling 131 822 screened women were included. No prospective studies measuring test accuracy of AI in screening practice were found. Studies were of poor methodological quality. Three retrospective studies compared AI systems with the clinical decisions of the original radiologist, including 79 910 women, of whom 1878 had screen detected cancer or interval cancer within 12 months of screening. Thirty four (94%) of 36 AI systems evaluated in these studies were less accurate than a single radiologist, and all were less accurate than consensus of two or more radiologists. Five smaller studies (1086 women, 520 cancers) at high risk of bias and low generalisability to the clinical context reported that all five evaluated AI systems (as standalone to replace radiologist or as a reader aid) were more accurate than a single radiologist reading a test set in the laboratory. In three studies, AI used for triage screened out 53%, 45%, and 50% of women at low risk but also 10%, 4%, and 0% of cancers detected by radiologists.
CONCLUSIONS: Current evidence for AI does not yet allow judgement of its accuracy in breast cancer screening programmes, and it is unclear where on the clinical pathway AI might be of most benefit. AI systems are not sufficiently specific to replace radiologist double reading in screening programmes. Promising results in smaller studies are not replicated in larger studies. Prospective studies are required to measure the effect of AI in clinical practice. Such studies will require clear stopping rules to ensure that AI does not reduce programme specificity.
STUDY REGISTRATION: Protocol registered as PROSPERO CRD42020213590.
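The proportions quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the catalogue record or the review: it uses just the counts stated in the abstract, and the variable names and the `percent` helper are assumptions introduced here. It also contrasts the share of women with cancer in the retrospective cohorts with the share in the smaller test sets, which may help explain why the abstract describes the latter as having low generalisability.

```python
# Counts taken from the abstract; all names below are illustrative assumptions.
ai_systems_total = 36
ai_systems_less_accurate = 34  # less accurate than a single radiologist

retro_women, retro_cancers = 79_910, 1_878    # three retrospective studies
testset_women, testset_cancers = 1_086, 520   # five smaller test-set studies


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


share_less_accurate = percent(ai_systems_less_accurate, ai_systems_total)
retro_cancer_share = percent(retro_cancers, retro_women)
testset_cancer_share = percent(testset_cancers, testset_women)

print(f"AI systems less accurate than one radiologist: {share_less_accurate:.0f}%")
print(f"Women with cancer, retrospective cohorts: {retro_cancer_share:.1f}%")
print(f"Women with cancer, test sets: {testset_cancer_share:.1f}%")
```

The first figure reproduces the 94% stated in the abstract. The other two (roughly 2.4% versus 48%) are derived here rather than reported by the authors, so they should be read as a rough consistency check, not as results of the review.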
format Online
Article
Text
id pubmed-8409323
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BMJ Publishing Group Ltd.
record_format MEDLINE/PubMed
spelling pubmed-8409323 2021-09-16 Use of artificial intelligence for image analysis in breast cancer screening programmes: systematic review of test accuracy Freeman, Karoline Geppert, Julia Stinton, Chris Todkill, Daniel Johnson, Samantha Clarke, Aileen Taylor-Phillips, Sian BMJ Research BMJ Publishing Group Ltd. 2021-09-02 /pmc/articles/PMC8409323/ /pubmed/34470740 http://dx.doi.org/10.1136/bmj.n1872 Text en © Author(s) (or their employer(s)) 2019. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/licenses/by-nc/4.0/
title Use of artificial intelligence for image analysis in breast cancer screening programmes: systematic review of test accuracy
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8409323/
https://www.ncbi.nlm.nih.gov/pubmed/34470740
http://dx.doi.org/10.1136/bmj.n1872