Performance variability of radiomics machine learning models for the detection of clinically significant prostate cancer in heterogeneous MRI datasets
BACKGROUND: Radiomics promises to enhance the discriminative performance for clinically significant prostate cancer (csPCa), but still lacks validation in real-life scenarios. This study investigates the classification performance and robustness of machine learning radiomics models in heterogeneous...
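Main Authors: | Gresser, Eva; Schachtner, Balthasar; Stüber, Anna Theresa; Solyanik, Olga; Schreier, Andrea; Huber, Thomas; Froelich, Matthias Frank; Magistro, Giuseppe; Kretschmer, Alexander; Stief, Christian; Ricke, Jens; Ingrisch, Michael; Nörenberg, Dominik |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | AME Publishing Company 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9622454/ https://www.ncbi.nlm.nih.gov/pubmed/36330197 http://dx.doi.org/10.21037/qims-22-265 |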
_version_ | 1784821772168724480 |
---|---|
author | Gresser, Eva Schachtner, Balthasar Stüber, Anna Theresa Solyanik, Olga Schreier, Andrea Huber, Thomas Froelich, Matthias Frank Magistro, Giuseppe Kretschmer, Alexander Stief, Christian Ricke, Jens Ingrisch, Michael Nörenberg, Dominik |
author_sort | Gresser, Eva |
collection | PubMed |
description | BACKGROUND: Radiomics promises to enhance the discriminative performance for clinically significant prostate cancer (csPCa), but still lacks validation in real-life scenarios. This study investigates the classification performance and robustness of machine learning radiomics models in heterogeneous MRI datasets to characterize suspicious prostate lesions for non-invasive prediction of prostate cancer (PCa) aggressiveness compared to conventional imaging biomarkers. METHODS: A total of 142 patients with clinical suspicion of PCa underwent 1.5T or 3T biparametric MRI (7 scanner types, 14 institutions) and exhibited suspicious lesions [Prostate Imaging Reporting and Data System (PI-RADS) score ≥3] in peripheral or transitional zones. Whole-gland and index-lesion segmentations were performed semi-automatically. A total of 1,482 quantitative morphologic, shape, texture, and intensity-based radiomics features were extracted from T2-weighted and apparent diffusion coefficient (ADC) images and assessed using random forest and logistic regression models. Five-fold cross-validation performance in terms of area under the ROC curve (AUC) was compared to mean ADC (mADC), PI-RADS and prostate-specific antigen density (PSAD). Bias mitigation techniques targeting the high-dimensional feature space and inherent class imbalance were applied, and robustness of results was systematically evaluated. RESULTS: Trained models showed mean AUCs ranging from 0.78 to 0.83 in csPCa classification. Despite the use of mitigation techniques, results showed high performance variability. On average, trained models achieved numerically higher classification performance than the clinical parameters PI-RADS (AUC = 0.78), mADC (AUC = 0.71) and PSAD (AUC = 0.63). CONCLUSIONS: Radiomics models' classification performance for csPCa was numerically but not significantly higher than that of PI-RADS scoring. Overall, clinical applicability in heterogeneous MRI datasets is limited because of high variability of results. Performance variability, robustness and reproducibility of radiomics-based measures should be addressed more transparently in future research to enable broad clinical application. |
format | Online Article Text |
id | pubmed-9622454 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | AME Publishing Company |
record_format | MEDLINE/PubMed |
spelling | pubmed-9622454 2022-11-02 Quant Imaging Med Surg Original Article AME Publishing Company 2022-11 /pmc/articles/PMC9622454/ /pubmed/36330197 http://dx.doi.org/10.21037/qims-22-265 Text en 2022 Quantitative Imaging in Medicine and Surgery. All rights reserved. Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/ |
title | Performance variability of radiomics machine learning models for the detection of clinically significant prostate cancer in heterogeneous MRI datasets |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9622454/ https://www.ncbi.nlm.nih.gov/pubmed/36330197 http://dx.doi.org/10.21037/qims-22-265 |
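
The description above reports AUCs from five-fold cross-validation and stresses how much they vary. The sketch below is not the authors' pipeline: it fits no radiomics model and uses synthetic data. It only illustrates, under made-up assumptions (50 positive cases out of 142, one Gaussian score standing in for a scalar biomarker such as mADC or PSAD), how per-fold AUCs of the same score can differ from fold to fold and from one random partition to the next. The function names `auc`, `stratified_folds` and `fold_aucs` are hypothetical helpers written for this example.

```python
import random
import statistics


def auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split indices into k folds with a roughly equal class ratio per fold."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def fold_aucs(scores, labels, k=5, seed=0):
    """AUC of a fixed score on each of k stratified folds for one partition."""
    rng = random.Random(seed)
    return [
        auc([scores[i] for i in fold], [labels[i] for i in fold])
        for fold in stratified_folds(labels, k, rng)
    ]


# Synthetic cohort: the size (142) comes from the abstract; the class balance
# and the score distribution are assumptions made purely for illustration.
data_rng = random.Random(42)
labels = [1] * 50 + [0] * 92
scores = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

overall = auc(scores, labels)
print("AUC on all cases:", round(overall, 3))
for seed in range(3):
    aucs = fold_aucs(scores, labels, k=5, seed=seed)
    print(
        "partition", seed,
        "fold AUCs", [round(a, 2) for a in aucs],
        "mean", round(statistics.mean(aucs), 3),
        "sd", round(statistics.stdev(aucs), 3),
    )
```

With only about 28 cases per fold, the fold-level AUCs typically spread noticeably around the all-cases value, which is one reason to report the spread across folds and partitions alongside the mean.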