
Towards a guideline for evaluation metrics in medical image segmentation

In the last decade, research on artificial intelligence has seen rapid growth with deep learning models, especially in the field of medical image segmentation. Various studies demonstrated that these models have powerful prediction capabilities and achieved similar results as clinicians. However, re...

Bibliographic Details
Main Authors: Müller, Dominik; Soto-Rey, Iñaki; Kramer, Frank
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9208116/
https://www.ncbi.nlm.nih.gov/pubmed/35725483
http://dx.doi.org/10.1186/s13104-022-06096-y
_version_ 1784729671875690496
author Müller, Dominik
Soto-Rey, Iñaki
Kramer, Frank
author_facet Müller, Dominik
Soto-Rey, Iñaki
Kramer, Frank
author_sort Müller, Dominik
collection PubMed
description In the last decade, research on artificial intelligence has seen rapid growth with deep learning models, especially in the field of medical image segmentation. Various studies demonstrated that these models have powerful prediction capabilities and achieved similar results as clinicians. However, recent studies revealed that the evaluation in image segmentation studies lacks reliable model performance assessment and showed statistical bias by incorrect metric implementation or usage. Thus, this work provides an overview and interpretation guide on the following metrics for medical image segmentation evaluation in binary as well as multi-class problems: Dice similarity coefficient, Jaccard, Sensitivity, Specificity, Rand index, ROC curves, Cohen’s Kappa, and Hausdorff distance. Furthermore, common issues like class imbalance and statistical as well as interpretation biases in evaluation are discussed. As a summary, we propose a guideline for standardized medical image segmentation evaluation to improve evaluation quality, reproducibility, and comparability in the research field.
format Online
Article
Text
id pubmed-9208116
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9208116 2022-06-21 Towards a guideline for evaluation metrics in medical image segmentation Müller, Dominik Soto-Rey, Iñaki Kramer, Frank BMC Res Notes Commentary In the last decade, research on artificial intelligence has seen rapid growth with deep learning models, especially in the field of medical image segmentation. Various studies demonstrated that these models have powerful prediction capabilities and achieved similar results as clinicians. However, recent studies revealed that the evaluation in image segmentation studies lacks reliable model performance assessment and showed statistical bias by incorrect metric implementation or usage. Thus, this work provides an overview and interpretation guide on the following metrics for medical image segmentation evaluation in binary as well as multi-class problems: Dice similarity coefficient, Jaccard, Sensitivity, Specificity, Rand index, ROC curves, Cohen’s Kappa, and Hausdorff distance. Furthermore, common issues like class imbalance and statistical as well as interpretation biases in evaluation are discussed. As a summary, we propose a guideline for standardized medical image segmentation evaluation to improve evaluation quality, reproducibility, and comparability in the research field. BioMed Central 2022-06-20 /pmc/articles/PMC9208116/ /pubmed/35725483 http://dx.doi.org/10.1186/s13104-022-06096-y Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Commentary
Müller, Dominik
Soto-Rey, Iñaki
Kramer, Frank
Towards a guideline for evaluation metrics in medical image segmentation
title Towards a guideline for evaluation metrics in medical image segmentation
title_full Towards a guideline for evaluation metrics in medical image segmentation
title_fullStr Towards a guideline for evaluation metrics in medical image segmentation
title_full_unstemmed Towards a guideline for evaluation metrics in medical image segmentation
title_short Towards a guideline for evaluation metrics in medical image segmentation
title_sort towards a guideline for evaluation metrics in medical image segmentation
topic Commentary
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9208116/
https://www.ncbi.nlm.nih.gov/pubmed/35725483
http://dx.doi.org/10.1186/s13104-022-06096-y
work_keys_str_mv AT mullerdominik towardsaguidelineforevaluationmetricsinmedicalimagesegmentation
AT sotoreyinaki towardsaguidelineforevaluationmetricsinmedicalimagesegmentation
AT kramerfrank towardsaguidelineforevaluationmetricsinmedicalimagesegmentation