
Learning from multiple annotators for medical image segmentation

Supervised machine learning methods have been widely developed for segmentation tasks in recent years. However, the quality of labels has a high impact on the predictive performance of these algorithms. This issue is particularly acute in the medical image domain, where both the cost of annotation and the inter-observer variability are high.


Bibliographic Details
Main Authors: Zhang, Le, Tanno, Ryutaro, Xu, Moucheng, Huang, Yawen, Bronik, Kevin, Jin, Chen, Jacob, Joseph, Zheng, Yefeng, Shao, Ling, Ciccarelli, Olga, Barkhof, Frederik, Alexander, Daniel C.
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10533416/
https://www.ncbi.nlm.nih.gov/pubmed/37781685
http://dx.doi.org/10.1016/j.patcog.2023.109400
_version_ 1785112187585429504
author Zhang, Le
Tanno, Ryutaro
Xu, Moucheng
Huang, Yawen
Bronik, Kevin
Jin, Chen
Jacob, Joseph
Zheng, Yefeng
Shao, Ling
Ciccarelli, Olga
Barkhof, Frederik
Alexander, Daniel C.
author_facet Zhang, Le
Tanno, Ryutaro
Xu, Moucheng
Huang, Yawen
Bronik, Kevin
Jin, Chen
Jacob, Joseph
Zheng, Yefeng
Shao, Ling
Ciccarelli, Olga
Barkhof, Frederik
Alexander, Daniel C.
author_sort Zhang, Le
collection PubMed
description Supervised machine learning methods have been widely developed for segmentation tasks in recent years. However, the quality of labels has a high impact on the predictive performance of these algorithms. This issue is particularly acute in the medical image domain, where both the cost of annotation and the inter-observer variability are high. In a typical label acquisition process, different human experts contribute estimates of the “actual” segmentation labels, influenced by their personal biases and competency levels. The performance of automatic segmentation algorithms is limited when these noisy labels are used as the expert consensus label. In this work, we use two coupled CNNs to jointly learn, from purely noisy observations alone, the reliability of individual annotators and the expert consensus label distributions. The separation of the two is achieved by maximally describing the annotator’s “unreliable behavior” (we call it “maximally unreliable”) while achieving high fidelity with the noisy training data. We first create a toy segmentation dataset using MNIST and investigate the properties of the proposed algorithm. We then use three public medical imaging segmentation datasets to demonstrate our method’s efficacy, including both simulated (where necessary) and real-world annotations: 1) ISBI2015 (multiple-sclerosis lesions); 2) BraTS (brain tumors); 3) LIDC-IDRI (lung abnormalities). Finally, we create a real-world multiple sclerosis lesion dataset (QSMSC at UCL: Queen Square Multiple Sclerosis Center at UCL, UK) with manual segmentations from 4 different annotators (3 radiologists with different skill levels and 1 expert to generate the expert consensus label). In all datasets, our method consistently outperforms competing methods and relevant baselines, especially when the number of annotations is small and the amount of disagreement is large. The studies also reveal that the system is capable of capturing the complicated spatial characteristics of annotators’ mistakes.
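The coupled-CNN idea in the description can be sketched per pixel as follows. This is a minimal, hypothetical NumPy sketch, not the paper's code: one network is assumed to output consensus class probabilities `p`, the other a per-annotator confusion matrix `A_r`, and the trace penalty stands in for the "maximally unreliable" separation term the description mentions.

```python
import numpy as np

# Hypothetical per-pixel loss for one annotator r (illustrative names):
#   p   : (C,)  consensus class probabilities from the segmentation CNN
#   A_r : (C,C) column-stochastic confusion matrix from the annotator CNN,
#         A_r[i, j] = P(annotator r labels class i | true class j)
#   y_r : annotator r's observed (noisy) label at this pixel
# The predicted noisy-label distribution is q_r = A_r @ p; training
# minimizes cross-entropy of q_r against y_r plus a trace penalty that
# pushes A_r away from the identity ("maximally unreliable") unless the
# data demands fidelity.
def annotator_loss(p, A_r, y_r, lam=0.01):
    q_r = A_r @ p                        # noisy-label distribution
    ce = -np.log(q_r[y_r] + 1e-12)      # cross-entropy with noisy label
    return ce + lam * np.trace(A_r)     # trace term drives the separation

# Toy check: a perfectly reliable annotator has an identity confusion matrix.
p = np.array([0.9, 0.1])
A_identity = np.eye(2)
loss = annotator_loss(p, A_identity, y_r=0)
```

Under this sketch, a label that agrees with the consensus yields a small cross-entropy, while the trace term alone discourages confusion matrices from collapsing to the identity; the balance between the two is what lets annotator reliability and the consensus distribution be learned jointly.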
format Online
Article
Text
id pubmed-10533416
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-105334162023-09-29 Learning from multiple annotators for medical image segmentation Zhang, Le Tanno, Ryutaro Xu, Moucheng Huang, Yawen Bronik, Kevin Jin, Chen Jacob, Joseph Zheng, Yefeng Shao, Ling Ciccarelli, Olga Barkhof, Frederik Alexander, Daniel C. Pattern Recognit Article Supervised machine learning methods have been widely developed for segmentation tasks in recent years. However, the quality of labels has a high impact on the predictive performance of these algorithms. This issue is particularly acute in the medical image domain, where both the cost of annotation and the inter-observer variability are high. In a typical label acquisition process, different human experts contribute estimates of the “actual” segmentation labels, influenced by their personal biases and competency levels. The performance of automatic segmentation algorithms is limited when these noisy labels are used as the expert consensus label. In this work, we use two coupled CNNs to jointly learn, from purely noisy observations alone, the reliability of individual annotators and the expert consensus label distributions. The separation of the two is achieved by maximally describing the annotator’s “unreliable behavior” (we call it “maximally unreliable”) while achieving high fidelity with the noisy training data. We first create a toy segmentation dataset using MNIST and investigate the properties of the proposed algorithm. We then use three public medical imaging segmentation datasets to demonstrate our method’s efficacy, including both simulated (where necessary) and real-world annotations: 1) ISBI2015 (multiple-sclerosis lesions); 2) BraTS (brain tumors); 3) LIDC-IDRI (lung abnormalities). Finally, we create a real-world multiple sclerosis lesion dataset (QSMSC at UCL: Queen Square Multiple Sclerosis Center at UCL, UK) with manual segmentations from 4 different annotators (3 radiologists with different skill levels and 1 expert to generate the expert consensus label). In all datasets, our method consistently outperforms competing methods and relevant baselines, especially when the number of annotations is small and the amount of disagreement is large. The studies also reveal that the system is capable of capturing the complicated spatial characteristics of annotators’ mistakes. Elsevier 2023-06 /pmc/articles/PMC10533416/ /pubmed/37781685 http://dx.doi.org/10.1016/j.patcog.2023.109400 Text en © 2023 The Authors. Published by Elsevier Ltd. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Zhang, Le
Tanno, Ryutaro
Xu, Moucheng
Huang, Yawen
Bronik, Kevin
Jin, Chen
Jacob, Joseph
Zheng, Yefeng
Shao, Ling
Ciccarelli, Olga
Barkhof, Frederik
Alexander, Daniel C.
Learning from multiple annotators for medical image segmentation
title Learning from multiple annotators for medical image segmentation
title_full Learning from multiple annotators for medical image segmentation
title_fullStr Learning from multiple annotators for medical image segmentation
title_full_unstemmed Learning from multiple annotators for medical image segmentation
title_short Learning from multiple annotators for medical image segmentation
title_sort learning from multiple annotators for medical image segmentation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10533416/
https://www.ncbi.nlm.nih.gov/pubmed/37781685
http://dx.doi.org/10.1016/j.patcog.2023.109400
work_keys_str_mv AT zhangle learningfrommultipleannotatorsformedicalimagesegmentation
AT tannoryutaro learningfrommultipleannotatorsformedicalimagesegmentation
AT xumoucheng learningfrommultipleannotatorsformedicalimagesegmentation
AT huangyawen learningfrommultipleannotatorsformedicalimagesegmentation
AT bronikkevin learningfrommultipleannotatorsformedicalimagesegmentation
AT jinchen learningfrommultipleannotatorsformedicalimagesegmentation
AT jacobjoseph learningfrommultipleannotatorsformedicalimagesegmentation
AT zhengyefeng learningfrommultipleannotatorsformedicalimagesegmentation
AT shaoling learningfrommultipleannotatorsformedicalimagesegmentation
AT ciccarelliolga learningfrommultipleannotatorsformedicalimagesegmentation
AT barkhoffrederik learningfrommultipleannotatorsformedicalimagesegmentation
AT alexanderdanielc learningfrommultipleannotatorsformedicalimagesegmentation