
The effect of variable labels on deep learning models trained to predict breast density

Purpose. High breast density is associated with reduced efficacy of mammographic screening and increased risk of developing breast cancer. Accurate and reliable automated density estimates can be used for direct risk prediction and passing density related information to further predictive models. Ex...

Bibliographic Details
Main authors: Squires, Steven, Harkness, Elaine F, Evans, D Gareth, Astley, Susan M
Format: Online Article Text
Language: English
Published: IOP Publishing 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10114494/
https://www.ncbi.nlm.nih.gov/pubmed/37023727
http://dx.doi.org/10.1088/2057-1976/accaea
author Squires, Steven
Harkness, Elaine F
Evans, D Gareth
Astley, Susan M
collection PubMed
description Purpose. High breast density is associated with reduced efficacy of mammographic screening and increased risk of developing breast cancer. Accurate and reliable automated density estimates can be used for direct risk prediction and passing density related information to further predictive models. Expert reader assessments of density show a strong relationship to cancer risk but also inter-reader variation. The effect of label variability on model performance is important when considering how to utilise automated methods for both research and clinical purposes. Methods. We utilise subsets of images with density labels from the same 13 readers and 12 reader pairs, and train a deep transfer learning model which is used to assess how label variability affects the mapping from representation to prediction. We then create two end-to-end models: one that is trained on averaged labels across the reader pairs and the second that is trained using individual reader scores, with a novel alteration to the objective function. The combination of these two end-to-end models allows us to investigate the effect of label variability on the model representation formed. Results. We show that the trained mappings from representations to labels are altered considerably by the variability of reader scores. Training on labels with distribution variation removed causes the Spearman rank correlation coefficients to rise from 0.751 ± 0.002 to either 0.815 ± 0.026 when averaging across readers or 0.844 ± 0.002 when averaging across images. However, when we train different models to investigate the representation effect we see little difference, with Spearman rank correlation coefficients of 0.846 ± 0.006 and 0.850 ± 0.006 showing no statistically significant difference in the quality of the model representation with regard to density prediction. Conclusions. We show that the mapping between representation and mammographic density prediction is significantly affected by label variability. However, the effect of the label variability on the model representation is limited.
format Online
Article
Text
id pubmed-10114494
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher IOP Publishing
record_format MEDLINE/PubMed
spelling pubmed-10114494 2023-04-20 The effect of variable labels on deep learning models trained to predict breast density Squires, Steven Harkness, Elaine F Evans, D Gareth Astley, Susan M Biomed Phys Eng Express Paper IOP Publishing 2023-05-01 2023-04-19 /pmc/articles/PMC10114494/ /pubmed/37023727 http://dx.doi.org/10.1088/2057-1976/accaea Text en © 2023 The Author(s). Published by IOP Publishing Ltd https://creativecommons.org/licenses/by/4.0/ Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence (https://creativecommons.org/licenses/by/4.0/). Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
title The effect of variable labels on deep learning models trained to predict breast density
topic Paper