Multi-Stage Harmonization for Robust AI across Breast MR Databases

SIMPLE SUMMARY: Batch harmonization of radiomic features extracted from magnetic resonance images of breast lesions from two databases was applied to an artificial intelligence/machine learning classification workflow. Training and independent test sets from the two databases, as well as the combina...

Bibliographic Details
Main Authors:	Whitney, Heather M.; Li, Hui; Ji, Yu; Liu, Peifang; Giger, Maryellen L.
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8508003/ https://www.ncbi.nlm.nih.gov/pubmed/34638294 http://dx.doi.org/10.3390/cancers13194809

collection	PubMed
description	SIMPLE SUMMARY: Batch harmonization of radiomic features extracted from magnetic resonance images of breast lesions from two databases was applied to an artificial intelligence/machine learning classification workflow. Training and independent test sets from the two databases, as well as the combination of them, were used in pre-harmonization and post-harmonization forms to investigate the generalizability of performance in the task of distinguishing between malignant and benign lesions. Most training and independent test scenarios were statistically equivalent, demonstrating that batch harmonization with feature selection harmonization can potentially develop generalizable classification models.
ABSTRACT: Radiomic features extracted from medical images may demonstrate a batch effect when cases come from different sources. We investigated classification performance using training and independent test sets drawn from two sources using both pre-harmonization and post-harmonization features. In this retrospective study, a database of thirty-two radiomic features, extracted from DCE-MR images of breast lesions after fuzzy c-means segmentation, was collected. There were 944 unique lesions in Database A (208 benign lesions, 736 cancers) and 1986 unique lesions in Database B (481 benign lesions, 1505 cancers). The lesions from each database were divided by year of image acquisition into training and independent test sets, separately by database and in combination. ComBat batch harmonization was conducted on the combined training set to minimize the batch effect on eligible features by database. The empirical Bayes estimates from the feature harmonization were applied to the eligible features of the combined independent test set. The training sets (A, B, and combined) were then used in training linear discriminant analysis classifiers after stepwise feature selection. The classifiers were then run on the A, B, and combined independent test sets. Classification performance was compared using pre-harmonization features to post-harmonization features, including their corresponding feature selection, evaluated using the area under the receiver operating characteristic curve (AUC) as the figure of merit. Four out of five training and independent test scenarios demonstrated statistically equivalent classification performance when compared pre- and post-harmonization. These results demonstrate that translation of machine learning techniques with batch data harmonization can potentially yield generalizable models that maintain classification performance.
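The train-then-apply pattern described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and it is not ComBat: it swaps in a plain per-batch location and scale alignment for one feature, with no empirical Bayes shrinkage and no covariates, so that the key idea stays visible (parameters are estimated on the training set only and then reused on the independent test set). Stepwise feature selection and the linear discriminant analysis classifier are omitted. The function names `fit_batch_params`, `harmonize`, and `auc` and all numbers are hypothetical.

```python
from statistics import mean, pstdev


def fit_batch_params(train_values, train_batches):
    """Estimate pooled and per-batch mean/std from the TRAINING set only."""
    pooled_mean = mean(train_values)
    pooled_std = pstdev(train_values) or 1.0  # guard against zero spread
    params = {}
    for batch in set(train_batches):
        vals = [v for v, b in zip(train_values, train_batches) if b == batch]
        params[batch] = (mean(vals), pstdev(vals) or 1.0)
    return pooled_mean, pooled_std, params


def harmonize(values, batches, fitted):
    """Map each value onto the pooled scale using previously fitted parameters.

    Simplified stand-in for ComBat: location/scale only, no shrinkage.
    """
    pooled_mean, pooled_std, params = fitted
    out = []
    for v, b in zip(values, batches):
        batch_mean, batch_std = params[b]
        out.append((v - batch_mean) / batch_std * pooled_std + pooled_mean)
    return out


def auc(scores, labels):
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    Labels: 1 = malignant, 0 = benign. Tied scores count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical single-feature values from two databases, "A" and "B".
train_values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
train_batches = ["A", "A", "A", "B", "B", "B"]
fitted = fit_batch_params(train_values, train_batches)

train_harmonized = harmonize(train_values, train_batches, fitted)
# The independent test set reuses the training-set estimates; nothing is refit.
test_harmonized = harmonize([2.5, 11.5], ["A", "B"], fitted)
```

Fitting on the training set and only applying the result to the test set keeps the test set independent, which is the property the study design relies on when comparing AUC before and after harmonization.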
id	pubmed-8508003
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-8508003 2021-10-13 Multi-Stage Harmonization for Robust AI across Breast MR Databases Whitney, Heather M. Li, Hui Ji, Yu Liu, Peifang Giger, Maryellen L. Cancers (Basel) Article MDPI 2021-09-26 /pmc/articles/PMC8508003/ /pubmed/34638294 http://dx.doi.org/10.3390/cancers13194809 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
