Multi-Stage Harmonization for Robust AI across Breast MR Databases

SIMPLE SUMMARY: Batch harmonization of radiomic features extracted from magnetic resonance images of breast lesions from two databases was applied to an artificial intelligence/machine learning classification workflow. Training and independent test sets from the two databases, as well as the combina...

Bibliographic Details
Main authors: Whitney, Heather M., Li, Hui, Ji, Yu, Liu, Peifang, Giger, Maryellen L.
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8508003/
https://www.ncbi.nlm.nih.gov/pubmed/34638294
http://dx.doi.org/10.3390/cancers13194809
_version_ 1784581994836918272
author Whitney, Heather M.
Li, Hui
Ji, Yu
Liu, Peifang
Giger, Maryellen L.
author_facet Whitney, Heather M.
Li, Hui
Ji, Yu
Liu, Peifang
Giger, Maryellen L.
author_sort Whitney, Heather M.
collection PubMed
description SIMPLE SUMMARY: Batch harmonization of radiomic features extracted from magnetic resonance images of breast lesions from two databases was applied to an artificial intelligence/machine learning classification workflow. Training and independent test sets from the two databases, as well as the combination of them, were used in pre-harmonization and post-harmonization forms to investigate the generalizability of performance in the task of distinguishing between malignant and benign lesions. Most training and independent test scenarios were statistically equivalent, demonstrating that batch harmonization with feature selection harmonization can potentially develop generalizable classification models. ABSTRACT: Radiomic features extracted from medical images may demonstrate a batch effect when cases come from different sources. We investigated classification performance using training and independent test sets drawn from two sources using both pre-harmonization and post-harmonization features. In this retrospective study, a database of thirty-two radiomic features, extracted from DCE-MR images of breast lesions after fuzzy c-means segmentation, was collected. There were 944 unique lesions in Database A (208 benign lesions, 736 cancers) and 1986 unique lesions in Database B (481 benign lesions, 1505 cancers). The lesions from each database were divided by year of image acquisition into training and independent test sets, separately by database and in combination. ComBat batch harmonization was conducted on the combined training set to minimize the batch effect on eligible features by database. The empirical Bayes estimates from the feature harmonization were applied to the eligible features of the combined independent test set. The training sets (A, B, and combined) were then used in training linear discriminant analysis classifiers after stepwise feature selection. The classifiers were then run on the A, B, and combined independent test sets. 
Classification performance was compared using pre-harmonization features to post-harmonization features, including their corresponding feature selection, evaluated using the area under the receiver operating characteristic curve (AUC) as the figure of merit. Four out of five training and independent test scenarios demonstrated statistically equivalent classification performance when compared pre- and post-harmonization. These results demonstrate that translation of machine learning techniques with batch data harmonization can potentially yield generalizable models that maintain classification performance.
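The abstract describes estimating harmonization parameters on the combined training set and then applying those stored estimates to the independent test set, with AUC as the figure of merit. The sketch below illustrates only that fit-on-training, apply-to-test pattern. It is not ComBat: it uses a plain per-batch location-scale alignment with no empirical Bayes shrinkage and no covariates, and it omits the stepwise feature selection and the linear discriminant analysis classifier. The function names (`fit_batch_params`, `apply_batch_params`, `auc`) and the toy numbers are hypothetical and are not taken from the article.

```python
from statistics import mean, pstdev


def fit_batch_params(rows, batches):
    """Estimate pooled and per-batch (mean, std) for each feature.

    Assumption: a simple location-scale model stands in for ComBat here.
    """
    n_features = len(rows[0])
    params = {"pooled": [], "batch": {}}
    for j in range(n_features):
        col = [r[j] for r in rows]
        # A zero standard deviation is replaced by 1.0 to avoid dividing by zero.
        params["pooled"].append((mean(col), pstdev(col) or 1.0))
    for b in set(batches):
        stats = []
        for j in range(n_features):
            col = [r[j] for r, rb in zip(rows, batches) if rb == b]
            stats.append((mean(col), pstdev(col) or 1.0))
        params["batch"][b] = stats
    return params


def apply_batch_params(rows, batches, params):
    """Map features onto the pooled training scale using stored estimates."""
    harmonized = []
    for row, b in zip(rows, batches):
        new_row = []
        for x, (mu_b, sd_b), (mu_p, sd_p) in zip(
            row, params["batch"][b], params["pooled"]
        ):
            new_row.append((x - mu_b) / sd_b * sd_p + mu_p)
        harmonized.append(new_row)
    return harmonized


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented): one feature, two databases with a constant offset.
train_rows = [[1.0], [2.0], [3.0], [11.0], [12.0], [13.0]]
train_batches = ["A", "A", "A", "B", "B", "B"]
test_rows = [[2.0], [12.0]]
test_batches = ["A", "B"]

# Parameters come from the training set only, then are reused on the test set.
params = fit_batch_params(train_rows, train_batches)
harmonized_test = apply_batch_params(test_rows, test_batches, params)
print(harmonized_test)
print(auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
```

Estimating the parameters from the training set alone keeps information from the independent test set out of the harmonization step, which is the property the abstract emphasizes.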
format Online
Article
Text
id pubmed-8508003
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8508003 2021-10-13 Multi-Stage Harmonization for Robust AI across Breast MR Databases Whitney, Heather M. Li, Hui Ji, Yu Liu, Peifang Giger, Maryellen L. Cancers (Basel) Article MDPI 2021-09-26 /pmc/articles/PMC8508003/ /pubmed/34638294 http://dx.doi.org/10.3390/cancers13194809 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Whitney, Heather M.
Li, Hui
Ji, Yu
Liu, Peifang
Giger, Maryellen L.
Multi-Stage Harmonization for Robust AI across Breast MR Databases
title Multi-Stage Harmonization for Robust AI across Breast MR Databases
title_full Multi-Stage Harmonization for Robust AI across Breast MR Databases
title_fullStr Multi-Stage Harmonization for Robust AI across Breast MR Databases
title_full_unstemmed Multi-Stage Harmonization for Robust AI across Breast MR Databases
title_short Multi-Stage Harmonization for Robust AI across Breast MR Databases
title_sort multi-stage harmonization for robust ai across breast mr databases
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8508003/
https://www.ncbi.nlm.nih.gov/pubmed/34638294
http://dx.doi.org/10.3390/cancers13194809
work_keys_str_mv AT whitneyheatherm multistageharmonizationforrobustaiacrossbreastmrdatabases
AT lihui multistageharmonizationforrobustaiacrossbreastmrdatabases
AT jiyu multistageharmonizationforrobustaiacrossbreastmrdatabases
AT liupeifang multistageharmonizationforrobustaiacrossbreastmrdatabases
AT gigermaryellenl multistageharmonizationforrobustaiacrossbreastmrdatabases