A Data Set and Deep Learning Algorithm for the Detection of Masses and Architectural Distortions in Digital Breast Tomosynthesis Images

IMPORTANCE: Breast cancer screening is among the most common radiological tasks, with more than 39 million examinations performed each year. While it has been among the most studied medical imaging applications of artificial intelligence, the development and evaluation of algorithms are hindered by...

Bibliographic Details
Main Authors: Buda, Mateusz, Saha, Ashirbani, Walsh, Ruth, Ghate, Sujata, Li, Nianyi, Święcicki, Albert, Lo, Joseph Y., Mazurowski, Maciej A.
Format: Online Article Text
Language: English
Published: American Medical Association 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8369362/
https://www.ncbi.nlm.nih.gov/pubmed/34398205
http://dx.doi.org/10.1001/jamanetworkopen.2021.19100
_version_ 1783739278560329728
author Buda, Mateusz
Saha, Ashirbani
Walsh, Ruth
Ghate, Sujata
Li, Nianyi
Święcicki, Albert
Lo, Joseph Y.
Mazurowski, Maciej A.
author_facet Buda, Mateusz
Saha, Ashirbani
Walsh, Ruth
Ghate, Sujata
Li, Nianyi
Święcicki, Albert
Lo, Joseph Y.
Mazurowski, Maciej A.
author_sort Buda, Mateusz
collection PubMed
description IMPORTANCE: Breast cancer screening is among the most common radiological tasks, with more than 39 million examinations performed each year. While it has been among the most studied medical imaging applications of artificial intelligence, the development and evaluation of algorithms are hindered by the lack of well-annotated, large-scale publicly available data sets. OBJECTIVES: To curate, annotate, and make publicly available a large-scale data set of digital breast tomosynthesis (DBT) images to facilitate the development and evaluation of artificial intelligence algorithms for breast cancer screening; to develop a baseline deep learning model for breast cancer detection; and to test this model using the data set to serve as a baseline for future research. DESIGN, SETTING, AND PARTICIPANTS: In this diagnostic study, 16 802 DBT examinations with at least 1 reconstruction view available, performed between August 26, 2014, and January 29, 2018, were obtained from Duke Health System and analyzed. From the initial cohort, examinations were divided into 4 groups and split into training and test sets for the development and evaluation of a deep learning model. Images with foreign objects or spot compression views were excluded. Data analysis was conducted from January 2018 to October 2020. EXPOSURES: Screening DBT. MAIN OUTCOMES AND MEASURES: The detection algorithm was evaluated with breast-based free-response receiver operating characteristic curve and sensitivity at 2 false positives per volume. RESULTS: The curated data set contained 22 032 reconstructed DBT volumes that belonged to 5610 studies from 5060 patients with a mean (SD) age of 55 (11) years and 5059 (100.0%) women. This included 4 groups of studies: (1) 5129 (91.4%) normal studies; (2) 280 (5.0%) actionable studies, for which additional imaging was needed but no biopsy was performed; (3) 112 (2.0%) benign biopsied studies; and (4) 89 studies (1.6%) with cancer. Our data set included masses and architectural distortions that were annotated by 2 experienced radiologists. Our deep learning model reached breast-based sensitivity of 65% (39 of 60; 95% CI, 56%-74%) at 2 false positives per DBT volume on a test set of 460 examinations from 418 patients. CONCLUSIONS AND RELEVANCE: The large, diverse, and curated data set presented in this study could facilitate the development and evaluation of artificial intelligence algorithms for breast cancer screening by providing data for training as well as a common set of cases for model validation. The performance of the model developed in this study showed that the task remains challenging; its performance could serve as a baseline for future model development.
format Online
Article
Text
id pubmed-8369362
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Medical Association
record_format MEDLINE/PubMed
spelling pubmed-8369362 2021-08-30 A Data Set and Deep Learning Algorithm for the Detection of Masses and Architectural Distortions in Digital Breast Tomosynthesis Images Buda, Mateusz Saha, Ashirbani Walsh, Ruth Ghate, Sujata Li, Nianyi Święcicki, Albert Lo, Joseph Y. Mazurowski, Maciej A. JAMA Netw Open Original Investigation IMPORTANCE: Breast cancer screening is among the most common radiological tasks, with more than 39 million examinations performed each year. While it has been among the most studied medical imaging applications of artificial intelligence, the development and evaluation of algorithms are hindered by the lack of well-annotated, large-scale publicly available data sets. OBJECTIVES: To curate, annotate, and make publicly available a large-scale data set of digital breast tomosynthesis (DBT) images to facilitate the development and evaluation of artificial intelligence algorithms for breast cancer screening; to develop a baseline deep learning model for breast cancer detection; and to test this model using the data set to serve as a baseline for future research. DESIGN, SETTING, AND PARTICIPANTS: In this diagnostic study, 16 802 DBT examinations with at least 1 reconstruction view available, performed between August 26, 2014, and January 29, 2018, were obtained from Duke Health System and analyzed. From the initial cohort, examinations were divided into 4 groups and split into training and test sets for the development and evaluation of a deep learning model. Images with foreign objects or spot compression views were excluded. Data analysis was conducted from January 2018 to October 2020. EXPOSURES: Screening DBT. MAIN OUTCOMES AND MEASURES: The detection algorithm was evaluated with breast-based free-response receiver operating characteristic curve and sensitivity at 2 false positives per volume. RESULTS: The curated data set contained 22 032 reconstructed DBT volumes that belonged to 5610 studies from 5060 patients with a mean (SD) age of 55 (11) years and 5059 (100.0%) women. This included 4 groups of studies: (1) 5129 (91.4%) normal studies; (2) 280 (5.0%) actionable studies, for which additional imaging was needed but no biopsy was performed; (3) 112 (2.0%) benign biopsied studies; and (4) 89 studies (1.6%) with cancer. Our data set included masses and architectural distortions that were annotated by 2 experienced radiologists. Our deep learning model reached breast-based sensitivity of 65% (39 of 60; 95% CI, 56%-74%) at 2 false positives per DBT volume on a test set of 460 examinations from 418 patients. CONCLUSIONS AND RELEVANCE: The large, diverse, and curated data set presented in this study could facilitate the development and evaluation of artificial intelligence algorithms for breast cancer screening by providing data for training as well as a common set of cases for model validation. The performance of the model developed in this study showed that the task remains challenging; its performance could serve as a baseline for future model development. American Medical Association 2021-08-16 /pmc/articles/PMC8369362/ /pubmed/34398205 http://dx.doi.org/10.1001/jamanetworkopen.2021.19100 Text en Copyright 2021 Buda M et al. JAMA Network Open. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the CC-BY License.
spellingShingle Original Investigation
Buda, Mateusz
Saha, Ashirbani
Walsh, Ruth
Ghate, Sujata
Li, Nianyi
Święcicki, Albert
Lo, Joseph Y.
Mazurowski, Maciej A.
A Data Set and Deep Learning Algorithm for the Detection of Masses and Architectural Distortions in Digital Breast Tomosynthesis Images
title A Data Set and Deep Learning Algorithm for the Detection of Masses and Architectural Distortions in Digital Breast Tomosynthesis Images
title_full A Data Set and Deep Learning Algorithm for the Detection of Masses and Architectural Distortions in Digital Breast Tomosynthesis Images
title_fullStr A Data Set and Deep Learning Algorithm for the Detection of Masses and Architectural Distortions in Digital Breast Tomosynthesis Images
title_full_unstemmed A Data Set and Deep Learning Algorithm for the Detection of Masses and Architectural Distortions in Digital Breast Tomosynthesis Images
title_short A Data Set and Deep Learning Algorithm for the Detection of Masses and Architectural Distortions in Digital Breast Tomosynthesis Images
title_sort data set and deep learning algorithm for the detection of masses and architectural distortions in digital breast tomosynthesis images
topic Original Investigation
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8369362/
https://www.ncbi.nlm.nih.gov/pubmed/34398205
http://dx.doi.org/10.1001/jamanetworkopen.2021.19100