Stratification by Tumor Grade Groups in a Holistic Evaluation of Machine Learning for Brain Tumor Segmentation
Accurate and consistent segmentation plays an important role in the diagnosis, treatment planning, and monitoring of both High Grade Glioma (HGG), including Glioblastoma Multiforme (GBM), and Low Grade Glioma (LGG). Accuracy of segmentation can be affected by the imaging presentation of glioma, whic...
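Main authors: | Prabhudesai, Snehal; Wang, Nicholas Chandler; Ahluwalia, Vinayak; Huan, Xun; Bapuraj, Jayapalli Rajiv; Banovic, Nikola; Rao, Arvind |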
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Neuroscience |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8526730/ https://www.ncbi.nlm.nih.gov/pubmed/34690680 http://dx.doi.org/10.3389/fnins.2021.740353 |
_version_ | 1784585923876356096 |
---|---|
author | Prabhudesai, Snehal Wang, Nicholas Chandler Ahluwalia, Vinayak Huan, Xun Bapuraj, Jayapalli Rajiv Banovic, Nikola Rao, Arvind |
author_facet | Prabhudesai, Snehal Wang, Nicholas Chandler Ahluwalia, Vinayak Huan, Xun Bapuraj, Jayapalli Rajiv Banovic, Nikola Rao, Arvind |
author_sort | Prabhudesai, Snehal |
collection | PubMed |
description | Accurate and consistent segmentation plays an important role in the diagnosis, treatment planning, and monitoring of both High Grade Glioma (HGG), including Glioblastoma Multiforme (GBM), and Low Grade Glioma (LGG). Segmentation accuracy can be affected by the imaging presentation of glioma, which varies greatly between the two tumor grade groups. In recent years, researchers have used Machine Learning (ML) to segment tumors more rapidly and consistently than manual segmentation allows. However, existing ML validation relies heavily on computing summary statistics and rarely tests the generalizability of an algorithm on clinically heterogeneous data. In this work, our goal is to investigate how to holistically evaluate the performance of ML algorithms on a brain tumor segmentation task. We address the need for rigorous evaluation of ML algorithms and present four axes of model evaluation: diagnostic performance, model confidence, robustness, and data quality. We perform a comprehensive evaluation of a glioma segmentation ML algorithm by stratifying data by specific tumor grade groups (GBM and LGG) and evaluating the algorithm on each of the four axes. The main takeaways of our work are: (1) ML algorithms need to be evaluated on out-of-distribution data, reflective of tumor heterogeneity, to assess generalizability. (2) Segmentation metrics alone are limited in their ability to evaluate the errors made by ML algorithms and to describe their consequences. (3) Adopting tools from other domains, such as robustness (adversarial attacks) and model uncertainty (prediction intervals), leads to a more comprehensive performance evaluation. Such a holistic evaluation framework could shed light on an algorithm's clinical utility and help it evolve into a more clinically valuable tool. |
format | Online Article Text |
id | pubmed-8526730 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8526730 2021-10-21 Stratification by Tumor Grade Groups in a Holistic Evaluation of Machine Learning for Brain Tumor Segmentation Prabhudesai, Snehal Wang, Nicholas Chandler Ahluwalia, Vinayak Huan, Xun Bapuraj, Jayapalli Rajiv Banovic, Nikola Rao, Arvind Front Neurosci Neuroscience Accurate and consistent segmentation plays an important role in the diagnosis, treatment planning, and monitoring of both High Grade Glioma (HGG), including Glioblastoma Multiforme (GBM), and Low Grade Glioma (LGG). Segmentation accuracy can be affected by the imaging presentation of glioma, which varies greatly between the two tumor grade groups. In recent years, researchers have used Machine Learning (ML) to segment tumors more rapidly and consistently than manual segmentation allows. However, existing ML validation relies heavily on computing summary statistics and rarely tests the generalizability of an algorithm on clinically heterogeneous data. In this work, our goal is to investigate how to holistically evaluate the performance of ML algorithms on a brain tumor segmentation task. We address the need for rigorous evaluation of ML algorithms and present four axes of model evaluation: diagnostic performance, model confidence, robustness, and data quality. We perform a comprehensive evaluation of a glioma segmentation ML algorithm by stratifying data by specific tumor grade groups (GBM and LGG) and evaluating the algorithm on each of the four axes. The main takeaways of our work are: (1) ML algorithms need to be evaluated on out-of-distribution data, reflective of tumor heterogeneity, to assess generalizability. (2) Segmentation metrics alone are limited in their ability to evaluate the errors made by ML algorithms and to describe their consequences. (3) Adopting tools from other domains, such as robustness (adversarial attacks) and model uncertainty (prediction intervals), leads to a more comprehensive performance evaluation. Such a holistic evaluation framework could shed light on an algorithm's clinical utility and help it evolve into a more clinically valuable tool. Frontiers Media S.A. 2021-10-06 /pmc/articles/PMC8526730/ /pubmed/34690680 http://dx.doi.org/10.3389/fnins.2021.740353 Text en Copyright © 2021 Prabhudesai, Wang, Ahluwalia, Huan, Bapuraj, Banovic and Rao. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Prabhudesai, Snehal Wang, Nicholas Chandler Ahluwalia, Vinayak Huan, Xun Bapuraj, Jayapalli Rajiv Banovic, Nikola Rao, Arvind Stratification by Tumor Grade Groups in a Holistic Evaluation of Machine Learning for Brain Tumor Segmentation |
title | Stratification by Tumor Grade Groups in a Holistic Evaluation of Machine Learning for Brain Tumor Segmentation |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8526730/ https://www.ncbi.nlm.nih.gov/pubmed/34690680 http://dx.doi.org/10.3389/fnins.2021.740353 |