A Comprehensive Machine Learning Benchmark Study for Radiomics-Based Survival Analysis of CT Imaging Data in Patients With Hepatic Metastases of CRC
OBJECTIVES: Optimizing a machine learning (ML) pipeline for radiomics analysis involves numerous choices in data set composition, preprocessing, and model selection. Objective identification of the optimal setup is complicated by correlated features, interdependency structures, and a multitude of av...
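Main authors: | Stüber, Anna Theresa; Coors, Stefan; Schachtner, Balthasar; Weber, Tobias; Rügamer, David; Bender, Andreas; Mittermeier, Andreas; Öcal, Osman; Seidensticker, Max; Ricke, Jens; Bischl, Bernd; Ingrisch, Michael |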
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Lippincott Williams & Wilkins, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10662603/ https://www.ncbi.nlm.nih.gov/pubmed/37504498 http://dx.doi.org/10.1097/RLI.0000000000001009 |
_version_ | 1785148572551872512 |
---|---|
author | Stüber, Anna Theresa Coors, Stefan Schachtner, Balthasar Weber, Tobias Rügamer, David Bender, Andreas Mittermeier, Andreas Öcal, Osman Seidensticker, Max Ricke, Jens Bischl, Bernd Ingrisch, Michael |
author_sort | Stüber, Anna Theresa |
collection | PubMed |
description | OBJECTIVES: Optimizing a machine learning (ML) pipeline for radiomics analysis involves numerous choices in data set composition, preprocessing, and model selection. Objective identification of the optimal setup is complicated by correlated features, interdependency structures, and a multitude of available ML algorithms. Therefore, we present a radiomics-based benchmarking framework to optimize a comprehensive ML pipeline for the prediction of overall survival. This study is conducted on an image set of patients with hepatic metastases of colorectal cancer, for which radiomics features of the whole liver and of metastases from computed tomography images were calculated. A mixed model approach was used to find the optimal pipeline configuration and to identify the added prognostic value of radiomics features. MATERIALS AND METHODS: In this study, a large-scale ML benchmark pipeline consisting of preprocessing, feature selection, dimensionality reduction, hyperparameter optimization, and training of different models was developed for radiomics-based survival analysis. Portal-venous computed tomography imaging data from a previous prospective randomized trial evaluating radioembolization of liver metastases of colorectal cancer were quantitatively accessible through a radiomics approach. One thousand two hundred eighteen radiomics features of hepatic metastases and the whole liver were calculated, and 19 clinical parameters (age, sex, laboratory values, and treatment) were available for each patient. Three ML algorithms—a regression model with elastic net regularization (glmnet), a random survival forest (RSF), and a gradient tree-boosting technique (xgboost)—were evaluated for 5 combinations of clinical data, tumor radiomics, and whole-liver features. Hyperparameter optimization and model evaluation were optimized toward the performance metric integrated Brier score via nested cross-validation. 
To address dependency structures in the benchmark setup, a mixed-model approach was developed to compare ML and data configurations and to identify the best-performing model. RESULTS: Within our radiomics-based benchmark experiment, 60 ML pipeline variations were evaluated on clinical data and radiomics features from 491 patients. Descriptive analysis of the benchmark results showed a preference for RSF-based pipelines, especially for the combination of clinical data with radiomics features. This observation was supported by the quantitative analysis via a linear mixed model approach, computed to differentiate the effect of data sets and pipeline configurations on the resulting performance. This revealed the RSF pipelines to consistently perform similar or better than glmnet and xgboost. Further, for the RSF, there was no significantly better-performing pipeline composition regarding the sort of preprocessing or hyperparameter optimization. CONCLUSIONS: Our study introduces a benchmark framework for radiomics-based survival analysis, aimed at identifying the optimal settings with respect to different radiomics data sources and various ML pipeline variations, including preprocessing techniques and learning algorithms. A suitable analysis tool for the benchmark results is provided via a mixed model approach, which showed for our study on patients with intrahepatic liver metastases, that radiomics features captured the patients' clinical situation in a manner comparable to the provided information solely from clinical parameters. However, we did not observe a relevant additional prognostic value obtained by these radiomics features. |
format | Online Article Text |
id | pubmed-10662603 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Lippincott Williams & Wilkins |
record_format | MEDLINE/PubMed |
spelling | pubmed-10662603 2023-11-21 A Comprehensive Machine Learning Benchmark Study for Radiomics-Based Survival Analysis of CT Imaging Data in Patients With Hepatic Metastases of CRC Stüber, Anna Theresa Coors, Stefan Schachtner, Balthasar Weber, Tobias Rügamer, David Bender, Andreas Mittermeier, Andreas Öcal, Osman Seidensticker, Max Ricke, Jens Bischl, Bernd Ingrisch, Michael Invest Radiol Original Article Lippincott Williams & Wilkins 2023-12 2023-07-28 /pmc/articles/PMC10662603/ /pubmed/37504498 http://dx.doi.org/10.1097/RLI.0000000000001009 Text en Copyright © 2023 The Author(s). Published by Wolters Kluwer Health, Inc. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal. |
title | A Comprehensive Machine Learning Benchmark Study for Radiomics-Based Survival Analysis of CT Imaging Data in Patients With Hepatic Metastases of CRC |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10662603/ https://www.ncbi.nlm.nih.gov/pubmed/37504498 http://dx.doi.org/10.1097/RLI.0000000000001009 |
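The description field in this record mentions that hyperparameter optimization and model evaluation were carried out via nested cross-validation against a Brier-type score. The following is a minimal sketch of that general idea only, not the authors' pipeline: it uses synthetic binary outcomes (alive or not at one fixed time horizon, with censoring ignored), a plain Brier score instead of the integrated Brier score, and a hypothetical toy model that predicts a single smoothed survival rate. All function names, fold counts, and candidate values are assumptions made for illustration.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed):
    """Split range(n) into k disjoint folds after a seeded shuffle."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit(y_train, smoothing):
    """Toy model (assumption): a smoothed event rate, ignoring all features."""
    return (sum(y_train) + smoothing) / (len(y_train) + 2 * smoothing)


def brier(prob, y_true):
    """Plain Brier score at one time horizon: mean squared error of the probability."""
    return mean((prob - yi) ** 2 for yi in y_true)


def nested_cv(y, candidates, outer_k=5, inner_k=3, seed=0):
    outer_folds = k_fold_indices(len(y), outer_k, seed)
    outer_scores = []
    for i, test_idx in enumerate(outer_folds):
        # Outer training set: every outer fold except the held-out one.
        train_idx = [j for f_i, fold in enumerate(outer_folds) if f_i != i for j in fold]
        # Inner folds are positions within train_idx, so tuning never sees test_idx.
        inner_folds = k_fold_indices(len(train_idx), inner_k, seed + 1 + i)

        def inner_score(param):
            scores = []
            for m, val_pos in enumerate(inner_folds):
                val = [y[train_idx[p]] for p in val_pos]
                tr = [y[train_idx[p]]
                      for f_m, fold in enumerate(inner_folds) if f_m != m
                      for p in fold]
                scores.append(brier(fit(tr, param), val))
            return mean(scores)

        # Pick the hyperparameter with the lowest inner-loop Brier score.
        best = min(candidates, key=inner_score)
        # Refit on the full outer training set and score on the untouched test fold.
        model = fit([y[j] for j in train_idx], best)
        outer_scores.append(brier(model, [y[j] for j in test_idx]))
    return outer_scores


# Synthetic outcomes: 1 = alive at the horizon, 0 = not (purely illustrative data).
rng = random.Random(1)
y = [1 if rng.random() < 0.6 else 0 for _ in range(40)]
scores = nested_cv(y, candidates=[0.0, 1.0, 5.0])
print(scores)
```

The point of the two loops is that the outer test fold is never used when choosing the hyperparameter, so the outer scores are not biased by tuning. A real survival benchmark would replace the toy model and the plain Brier score with censoring-aware counterparts.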