A Multi-million Mammography Image Dataset and Population-Based Screening Cohort for the Training and Evaluation of Deep Neural Networks—the Cohort of Screen-Aged Women (CSAW)
For AI researchers, access to a large and well-curated dataset is crucial. Working in the field of breast radiology, our aim was to develop a high-quality platform that can be used for evaluation of networks aiming to predict breast cancer risk, estimate mammographic sensitivity, and detect tumors....
Main authors: | Dembrower, Karin; Lindholm, Peter; Strand, Fredrik |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7165146/ https://www.ncbi.nlm.nih.gov/pubmed/31520277 http://dx.doi.org/10.1007/s10278-019-00278-0 |
_version_ | 1783523418725941248 |
---|---|
author | Dembrower, Karin Lindholm, Peter Strand, Fredrik |
author_facet | Dembrower, Karin Lindholm, Peter Strand, Fredrik |
author_sort | Dembrower, Karin |
collection | PubMed |
description | For AI researchers, access to a large and well-curated dataset is crucial. Working in the field of breast radiology, our aim was to develop a high-quality platform that can be used for evaluation of networks aiming to predict breast cancer risk, estimate mammographic sensitivity, and detect tumors. Our dataset, Cohort of Screen-Aged Women (CSAW), is a population-based cohort of all women 40 to 74 years of age invited to screening in the Stockholm region, Sweden, between 2008 and 2015. All women were invited to mammography screening every 18 to 24 months free of charge. Images were collected from the PACS of the three breast centers that completely cover the region. DICOM metadata were collected together with the images. Screening decisions and clinical outcome data were collected by linkage to the regional cancer center registers. Incident cancer cases, from one center, were pixel-level annotated by a radiologist. A separate subset for efficient evaluation of external networks was defined for the uptake area of one center. The collection and use of the dataset for the purpose of AI research has been approved by the Ethical Review Board. CSAW included 499,807 women invited to screening between 2008 and 2015 with a total of 1,182,733 completed screening examinations. Around 2 million mammography images have currently been collected, including all images for women who developed breast cancer. There were 10,582 women diagnosed with breast cancer; for 8,463, it was their first breast cancer. Clinical data include biopsy-verified breast cancer diagnoses, histological origin, tumor size, lymph node status, Elston grade, and receptor status. One thousand eight hundred ninety-one images of 898 women had tumors pixel-level annotated, including any tumor signs in the prior negative screening mammogram. Our dataset has already been used for evaluation by several research groups. We have defined a high-volume platform for training and evaluation of deep neural networks in the domain of mammographic imaging. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s10278-019-00278-0) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-7165146 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-7165146 2020-04-24 A Multi-million Mammography Image Dataset and Population-Based Screening Cohort for the Training and Evaluation of Deep Neural Networks—the Cohort of Screen-Aged Women (CSAW) Dembrower, Karin Lindholm, Peter Strand, Fredrik J Digit Imaging Article For AI researchers, access to a large and well-curated dataset is crucial. Working in the field of breast radiology, our aim was to develop a high-quality platform that can be used for evaluation of networks aiming to predict breast cancer risk, estimate mammographic sensitivity, and detect tumors. Our dataset, Cohort of Screen-Aged Women (CSAW), is a population-based cohort of all women 40 to 74 years of age invited to screening in the Stockholm region, Sweden, between 2008 and 2015. All women were invited to mammography screening every 18 to 24 months free of charge. Images were collected from the PACS of the three breast centers that completely cover the region. DICOM metadata were collected together with the images. Screening decisions and clinical outcome data were collected by linkage to the regional cancer center registers. Incident cancer cases, from one center, were pixel-level annotated by a radiologist. A separate subset for efficient evaluation of external networks was defined for the uptake area of one center. The collection and use of the dataset for the purpose of AI research has been approved by the Ethical Review Board. CSAW included 499,807 women invited to screening between 2008 and 2015 with a total of 1,182,733 completed screening examinations. Around 2 million mammography images have currently been collected, including all images for women who developed breast cancer. There were 10,582 women diagnosed with breast cancer; for 8,463, it was their first breast cancer. Clinical data include biopsy-verified breast cancer diagnoses, histological origin, tumor size, lymph node status, Elston grade, and receptor status. One thousand eight hundred ninety-one images of 898 women had tumors pixel-level annotated, including any tumor signs in the prior negative screening mammogram. Our dataset has already been used for evaluation by several research groups. We have defined a high-volume platform for training and evaluation of deep neural networks in the domain of mammographic imaging. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s10278-019-00278-0) contains supplementary material, which is available to authorized users. Springer International Publishing 2019-09-13 2020-04 /pmc/articles/PMC7165146/ /pubmed/31520277 http://dx.doi.org/10.1007/s10278-019-00278-0 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. |
spellingShingle | Article Dembrower, Karin Lindholm, Peter Strand, Fredrik A Multi-million Mammography Image Dataset and Population-Based Screening Cohort for the Training and Evaluation of Deep Neural Networks—the Cohort of Screen-Aged Women (CSAW) |
title | A Multi-million Mammography Image Dataset and Population-Based Screening Cohort for the Training and Evaluation of Deep Neural Networks—the Cohort of Screen-Aged Women (CSAW) |
title_full | A Multi-million Mammography Image Dataset and Population-Based Screening Cohort for the Training and Evaluation of Deep Neural Networks—the Cohort of Screen-Aged Women (CSAW) |
title_fullStr | A Multi-million Mammography Image Dataset and Population-Based Screening Cohort for the Training and Evaluation of Deep Neural Networks—the Cohort of Screen-Aged Women (CSAW) |
title_full_unstemmed | A Multi-million Mammography Image Dataset and Population-Based Screening Cohort for the Training and Evaluation of Deep Neural Networks—the Cohort of Screen-Aged Women (CSAW) |
title_short | A Multi-million Mammography Image Dataset and Population-Based Screening Cohort for the Training and Evaluation of Deep Neural Networks—the Cohort of Screen-Aged Women (CSAW) |
title_sort | multi-million mammography image dataset and population-based screening cohort for the training and evaluation of deep neural networks—the cohort of screen-aged women (csaw) |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7165146/ https://www.ncbi.nlm.nih.gov/pubmed/31520277 http://dx.doi.org/10.1007/s10278-019-00278-0 |