A curated mammography data set for use in computer-aided detection and diagnosis research
Published research results are difficult to replicate due to the lack of a standard evaluation data set in the area of decision support systems in mammography; most computer-aided diagnosis (CADx) and detection (CADe) algorithms for breast cancer in mammography are evaluated on private data sets or...
Main authors: | Lee, Rebecca Sawyer; Gimenez, Francisco; Hoogi, Assaf; Miyake, Kanae Kawai; Gorovoy, Mia; Rubin, Daniel L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group, 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5735920/ https://www.ncbi.nlm.nih.gov/pubmed/29257132 http://dx.doi.org/10.1038/sdata.2017.177 |
_version_ | 1783287293886332928 |
---|---|
author | Lee, Rebecca Sawyer; Gimenez, Francisco; Hoogi, Assaf; Miyake, Kanae Kawai; Gorovoy, Mia; Rubin, Daniel L. |
author_facet | Lee, Rebecca Sawyer; Gimenez, Francisco; Hoogi, Assaf; Miyake, Kanae Kawai; Gorovoy, Mia; Rubin, Daniel L. |
author_sort | Lee, Rebecca Sawyer |
collection | PubMed |
description | Published research results are difficult to replicate due to the lack of a standard evaluation data set in the area of decision support systems in mammography; most computer-aided diagnosis (CADx) and detection (CADe) algorithms for breast cancer in mammography are evaluated on private data sets or on unspecified subsets of public databases. This causes an inability to directly compare the performance of methods or to replicate prior results. We seek to resolve this substantial challenge by releasing an updated and standardized version of the Digital Database for Screening Mammography (DDSM) for evaluation of future CADx and CADe systems (sometimes referred to generally as CAD) research in mammography. Our data set, the CBIS-DDSM (Curated Breast Imaging Subset of DDSM), includes decompressed images, data selection and curation by trained mammographers, updated mass segmentation and bounding boxes, and pathologic diagnosis for training data, formatted similarly to modern computer vision data sets. The data set contains 753 calcification cases and 891 mass cases, providing a data-set size capable of analyzing decision support systems in mammography. |
format | Online Article Text |
id | pubmed-5735920 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-5735920 2017-12-21 A curated mammography data set for use in computer-aided detection and diagnosis research Lee, Rebecca Sawyer Gimenez, Francisco Hoogi, Assaf Miyake, Kanae Kawai Gorovoy, Mia Rubin, Daniel L. Sci Data Data Descriptor Published research results are difficult to replicate due to the lack of a standard evaluation data set in the area of decision support systems in mammography; most computer-aided diagnosis (CADx) and detection (CADe) algorithms for breast cancer in mammography are evaluated on private data sets or on unspecified subsets of public databases. This causes an inability to directly compare the performance of methods or to replicate prior results. We seek to resolve this substantial challenge by releasing an updated and standardized version of the Digital Database for Screening Mammography (DDSM) for evaluation of future CADx and CADe systems (sometimes referred to generally as CAD) research in mammography. Our data set, the CBIS-DDSM (Curated Breast Imaging Subset of DDSM), includes decompressed images, data selection and curation by trained mammographers, updated mass segmentation and bounding boxes, and pathologic diagnosis for training data, formatted similarly to modern computer vision data sets. The data set contains 753 calcification cases and 891 mass cases, providing a data-set size capable of analyzing decision support systems in mammography. Nature Publishing Group 2017-12-19 /pmc/articles/PMC5735920/ /pubmed/29257132 http://dx.doi.org/10.1038/sdata.2017.177 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files made available in this article. |
spellingShingle | Data Descriptor Lee, Rebecca Sawyer Gimenez, Francisco Hoogi, Assaf Miyake, Kanae Kawai Gorovoy, Mia Rubin, Daniel L. A curated mammography data set for use in computer-aided detection and diagnosis research |
title | A curated mammography data set for use in computer-aided detection and diagnosis research |
title_sort | curated mammography data set for use in computer-aided detection and diagnosis research |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5735920/ https://www.ncbi.nlm.nih.gov/pubmed/29257132 http://dx.doi.org/10.1038/sdata.2017.177 |