Impact of multi-source data augmentation on performance of convolutional neural networks for abnormality classification in mammography

INTRODUCTION: To date, most mammography-related AI models have been trained using either film or digital mammogram datasets with little overlap. We investigated whether or not combining film and digital mammography during training will help or hinder modern models designed for use on digital mammogr...

Bibliographic Details
Main authors: Hwang, InChan; Trivedi, Hari; Brown-Mulry, Beatrice; Zhang, Linglin; Nalla, Vineela; Gastounioti, Aimilia; Gichoya, Judy; Seyyed-Kalantari, Laleh; Banerjee, Imon; Woo, MinJae
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Radiology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10426498/
https://www.ncbi.nlm.nih.gov/pubmed/37588666
http://dx.doi.org/10.3389/fradi.2023.1181190
_version_ 1785090065432576000
author Hwang, InChan
Trivedi, Hari
Brown-Mulry, Beatrice
Zhang, Linglin
Nalla, Vineela
Gastounioti, Aimilia
Gichoya, Judy
Seyyed-Kalantari, Laleh
Banerjee, Imon
Woo, MinJae
author_sort Hwang, InChan
collection PubMed
description INTRODUCTION: To date, most mammography-related AI models have been trained using either film or digital mammogram datasets, with little overlap. We investigated whether combining film and digital mammography during training helps or hinders modern models designed for use on digital mammograms. METHODS: Six binary classifiers were trained for comparison. The first three were trained using images only from the Emory Breast Imaging Dataset (EMBED) with ResNet50, ResNet101, and ResNet152 architectures. The next three were trained using images from the EMBED, Curated Breast Imaging Subset of Digital Database for Screening Mammography (CBIS-DDSM), and Digital Database for Screening Mammography (DDSM) datasets. All six models were tested only on digital mammograms from EMBED. RESULTS: The performance degradation of the customized ResNet models was statistically significant overall when the EMBED dataset was augmented with CBIS-DDSM/DDSM. Although degradation was observed in all racial subgroups, some subgroups experienced a more severe performance drop than others. DISCUSSION: The degradation may be due to (1) a mismatch in features between film-based and digital mammograms; (2) a mismatch in pathologic and radiological information. In conclusion, using both film and digital mammography during training may hinder modern models designed for breast cancer screening. Caution is required when combining film-based and digital mammograms or when utilizing pathologic and radiological information simultaneously.
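The six-classifier comparison described in the abstract can be laid out as a small experiment grid. The sketch below is only an illustration of that design, not the authors' code: the labels `digital_only` and `digital_plus_film`, the `Experiment` class, and the `build_experiments` helper are hypothetical names introduced here, and no training or evaluation is performed.

```python
from dataclasses import dataclass
from itertools import product

# Architectures and datasets named in the abstract.
ARCHITECTURES = ("ResNet50", "ResNet101", "ResNet152")

# Hypothetical labels for the two training configurations.
TRAINING_SETS = {
    "digital_only": ("EMBED",),
    "digital_plus_film": ("EMBED", "CBIS-DDSM", "DDSM"),
}

# Per the abstract, every model is tested on digital mammograms from EMBED only.
TEST_SET = "EMBED"


@dataclass(frozen=True)
class Experiment:
    """One cell of the comparison grid (illustrative structure only)."""
    architecture: str
    training_label: str
    training_sources: tuple
    test_source: str


def build_experiments():
    """Enumerate the six configurations: EMBED-only models first, then augmented ones."""
    return [
        Experiment(arch, label, sources, TEST_SET)
        for (label, sources), arch in product(TRAINING_SETS.items(), ARCHITECTURES)
    ]


experiments = build_experiments()
for exp in experiments:
    print(f"{exp.architecture}: train on {' + '.join(exp.training_sources)}, "
          f"test on {exp.test_source}")
```

Holding the test set fixed while varying only the training sources is what allows any performance difference within an architecture to be attributed to the added CBIS-DDSM/DDSM images.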
format Online
Article
Text
id pubmed-10426498
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10426498 2023-08-16 Front Radiol Radiology
Frontiers Media S.A. 2023-06-16 /pmc/articles/PMC10426498/ /pubmed/37588666 http://dx.doi.org/10.3389/fradi.2023.1181190 Text en © 2023 Hwang, Trivedi, Brown-Mulry, Zhang, Nalla, Gastounioti, Gichoya, Seyyed-Kalantari, Banerjee and Woo. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) (https://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Impact of multi-source data augmentation on performance of convolutional neural networks for abnormality classification in mammography
topic Radiology