
Stratification by Tumor Grade Groups in a Holistic Evaluation of Machine Learning for Brain Tumor Segmentation

Accurate and consistent segmentation plays an important role in the diagnosis, treatment planning, and monitoring of both High Grade Glioma (HGG), including Glioblastoma Multiforme (GBM), and Low Grade Glioma (LGG). Accuracy of segmentation can be affected by the imaging presentation of glioma, whic...


Bibliographic Details
Main authors: Prabhudesai, Snehal; Wang, Nicholas Chandler; Ahluwalia, Vinayak; Huan, Xun; Bapuraj, Jayapalli Rajiv; Banovic, Nikola; Rao, Arvind
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Neuroscience
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8526730/
https://www.ncbi.nlm.nih.gov/pubmed/34690680
http://dx.doi.org/10.3389/fnins.2021.740353
author Prabhudesai, Snehal
Wang, Nicholas Chandler
Ahluwalia, Vinayak
Huan, Xun
Bapuraj, Jayapalli Rajiv
Banovic, Nikola
Rao, Arvind
author_sort Prabhudesai, Snehal
collection PubMed
description Accurate and consistent segmentation plays an important role in the diagnosis, treatment planning, and monitoring of both High Grade Glioma (HGG), including Glioblastoma Multiforme (GBM), and Low Grade Glioma (LGG). Accuracy of segmentation can be affected by the imaging presentation of glioma, which varies greatly between the two tumor grade groups. In recent years, researchers have used Machine Learning (ML) to segment tumors more rapidly and consistently than manual segmentation allows. However, existing ML validation relies heavily on computing summary statistics and rarely tests the generalizability of an algorithm on clinically heterogeneous data. In this work, our goal is to investigate how to holistically evaluate the performance of ML algorithms on a brain tumor segmentation task. We address the need for rigorous evaluation of ML algorithms and present four axes of model evaluation: diagnostic performance, model confidence, robustness, and data quality. We perform a comprehensive evaluation of a glioma segmentation ML algorithm by stratifying data by specific tumor grade groups (GBM and LGG) and evaluating the algorithm on each of the four axes. The main takeaways of our work are: (1) ML algorithms need to be evaluated on out-of-distribution data, reflective of tumor heterogeneity, to assess generalizability. (2) Segmentation metrics alone are limited in their ability to evaluate the errors made by ML algorithms and to describe their consequences. (3) Adoption of tools from other domains, such as robustness (adversarial attacks) and model uncertainty (prediction intervals), leads to a more comprehensive performance evaluation. Such a holistic evaluation framework could shed light on an algorithm's clinical utility and help it evolve into a more clinically valuable tool.
format Online
Article
Text
id pubmed-8526730
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8526730 2021-10-21 Front Neurosci Neuroscience Frontiers Media S.A. 2021-10-06 /pmc/articles/PMC8526730/ /pubmed/34690680 http://dx.doi.org/10.3389/fnins.2021.740353 Text en Copyright © 2021 Prabhudesai, Wang, Ahluwalia, Huan, Bapuraj, Banovic and Rao. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Stratification by Tumor Grade Groups in a Holistic Evaluation of Machine Learning for Brain Tumor Segmentation
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8526730/
https://www.ncbi.nlm.nih.gov/pubmed/34690680
http://dx.doi.org/10.3389/fnins.2021.740353