
Performance variability of radiomics machine learning models for the detection of clinically significant prostate cancer in heterogeneous MRI datasets

BACKGROUND: Radiomics promises to enhance the discriminative performance for clinically significant prostate cancer (csPCa), but still lacks validation in real-life scenarios. This study investigates the classification performance and robustness of machine learning radiomics models in heterogeneous...


Bibliographic Details
Main Authors: Gresser, Eva, Schachtner, Balthasar, Stüber, Anna Theresa, Solyanik, Olga, Schreier, Andrea, Huber, Thomas, Froelich, Matthias Frank, Magistro, Giuseppe, Kretschmer, Alexander, Stief, Christian, Ricke, Jens, Ingrisch, Michael, Nörenberg, Dominik
Format: Online Article Text
Language: English
Published: AME Publishing Company 2022
Subjects: Original Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9622454/
https://www.ncbi.nlm.nih.gov/pubmed/36330197
http://dx.doi.org/10.21037/qims-22-265
_version_ 1784821772168724480
author Gresser, Eva
Schachtner, Balthasar
Stüber, Anna Theresa
Solyanik, Olga
Schreier, Andrea
Huber, Thomas
Froelich, Matthias Frank
Magistro, Giuseppe
Kretschmer, Alexander
Stief, Christian
Ricke, Jens
Ingrisch, Michael
Nörenberg, Dominik
collection PubMed
description BACKGROUND: Radiomics promises to improve the discrimination of clinically significant prostate cancer (csPCa) but still lacks validation in real-life scenarios. This study investigates the classification performance and robustness of machine learning radiomics models in heterogeneous MRI datasets to characterize suspicious prostate lesions for non-invasive prediction of prostate cancer (PCa) aggressiveness compared to conventional imaging biomarkers.
METHODS: A total of 142 patients with clinical suspicion of PCa underwent 1.5T or 3T biparametric MRI (7 scanner types, 14 institutions) and exhibited suspicious lesions [Prostate Imaging Reporting and Data System (PI-RADS) score ≥3] in peripheral or transitional zones. Whole-gland and index-lesion segmentations were performed semi-automatically. A total of 1,482 quantitative morphologic, shape, texture, and intensity-based radiomics features were extracted from T2-weighted and apparent diffusion coefficient (ADC) images and assessed using random forest and logistic regression models. Five-fold cross-validation performance, measured as the area under the receiver operating characteristic (ROC) curve, was compared to mean ADC (mADC), PI-RADS and prostate-specific antigen density (PSAD). Bias mitigation techniques targeting the high-dimensional feature space and inherent class imbalance were applied, and the robustness of results was systematically evaluated.
RESULTS: Trained models showed mean areas under the curve (AUCs) ranging from 0.78 to 0.83 in csPCa classification. Despite the mitigation techniques, results showed high performance variability. On average, the trained models achieved numerically higher classification performance than the clinical parameters PI-RADS (AUC = 0.78), mADC (AUC = 0.71) and PSAD (AUC = 0.63).
CONCLUSIONS: The radiomics models’ classification performance for csPCa was numerically, but not significantly, higher than that of PI-RADS scoring. Overall, clinical applicability in heterogeneous MRI datasets is limited because of the high variability of results. Performance variability, robustness and reproducibility of radiomics-based measures should be addressed more transparently in future research to enable broad clinical application.
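The description above reports AUCs from five-fold cross-validation and stresses how much those results vary. The following is a minimal sketch of why per-fold AUCs scatter in a cohort of this size; it is not the authors' pipeline and trains no model. It scores a single simulated biomarker on synthetic data. The cohort size of 142 comes from the description, while the number of positive cases (`n_positive = 50`), the Gaussian score distributions, and all function names are assumptions made for illustration only.

```python
import random
import statistics


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split case indices into k folds with a roughly equal class ratio in each fold."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def fold_aucs(labels, scores, k=5, seed=0):
    """AUC of the score within each fold of one stratified k-fold split."""
    rng = random.Random(seed)
    return [
        roc_auc([labels[i] for i in fold], [scores[i] for i in fold])
        for fold in stratified_folds(labels, k, rng)
    ]


# Synthetic cohort: 142 cases as in the description; 50 positives is a hypothetical value.
data_rng = random.Random(42)
n_patients, n_positive = 142, 50
labels = [1] * n_positive + [0] * (n_patients - n_positive)
# Hypothetical biomarker: positives are shifted by one standard deviation.
scores = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

overall = roc_auc(labels, scores)
# Repeat the five-fold split with ten different seeds to expose split-to-split variability.
all_folds = [auc for seed in range(10) for auc in fold_aucs(labels, scores, seed=seed)]

print(f"whole-cohort AUC: {overall:.2f}")
print(f"per-fold AUC: min {min(all_folds):.2f}, max {max(all_folds):.2f}, "
      f"sd {statistics.stdev(all_folds):.2f}")
```

Each fold here holds fewer than 30 cases, so a single fold's AUC can sit well away from the whole-cohort value even though the underlying data never change. That is one reason a mean AUC is more informative when reported together with its spread across folds and repetitions.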
format Online
Article
Text
id pubmed-9622454
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher AME Publishing Company
record_format MEDLINE/PubMed
spelling pubmed-9622454 2022-11-02 Performance variability of radiomics machine learning models for the detection of clinically significant prostate cancer in heterogeneous MRI datasets. Gresser, Eva; Schachtner, Balthasar; Stüber, Anna Theresa; Solyanik, Olga; Schreier, Andrea; Huber, Thomas; Froelich, Matthias Frank; Magistro, Giuseppe; Kretschmer, Alexander; Stief, Christian; Ricke, Jens; Ingrisch, Michael; Nörenberg, Dominik. Quant Imaging Med Surg. Original Article. AME Publishing Company 2022-11. /pmc/articles/PMC9622454/ /pubmed/36330197 http://dx.doi.org/10.21037/qims-22-265 Text en. 2022 Quantitative Imaging in Medicine and Surgery. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/
title Performance variability of radiomics machine learning models for the detection of clinically significant prostate cancer in heterogeneous MRI datasets
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9622454/
https://www.ncbi.nlm.nih.gov/pubmed/36330197
http://dx.doi.org/10.21037/qims-22-265