Learning from crowds in digital pathology using scalable variational Gaussian processes

The volume of labeled data is often the primary determinant of success in developing machine learning algorithms. This has increased interest in methods for leveraging crowds to scale data labeling efforts, and methods to learn from noisy crowd-sourced labels. The need to scale labeling is acute but...

Bibliographic Details
Main authors: López-Pérez, Miguel, Amgad, Mohamed, Morales-Álvarez, Pablo, Ruiz, Pablo, Cooper, Lee A. D., Molina, Rafael, Katsaggelos, Aggelos K.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8172863/
https://www.ncbi.nlm.nih.gov/pubmed/34078955
http://dx.doi.org/10.1038/s41598-021-90821-3
author López-Pérez, Miguel
Amgad, Mohamed
Morales-Álvarez, Pablo
Ruiz, Pablo
Cooper, Lee A. D.
Molina, Rafael
Katsaggelos, Aggelos K.
author_sort López-Pérez, Miguel
collection PubMed
description The volume of labeled data is often the primary determinant of success in developing machine learning algorithms. This has increased interest in methods for leveraging crowds to scale data labeling efforts, and methods to learn from noisy crowd-sourced labels. The need to scale labeling is acute but particularly challenging in medical applications like pathology, due to the expertise required to generate quality labels and the limited availability of qualified experts. In this paper we investigate the application of Scalable Variational Gaussian Processes for Crowdsourcing (SVGPCR) in digital pathology. We compare SVGPCR with other crowdsourcing methods using a large multi-rater dataset where pathologists, pathology residents, and medical students annotated tissue regions in breast cancer. Our study shows that SVGPCR is competitive with equivalent methods trained using gold-standard pathologist-generated labels, and that SVGPCR meets or exceeds the performance of other crowdsourcing methods based on deep learning. We also show how SVGPCR can effectively learn the class-conditional reliabilities of individual annotators and demonstrate that Gaussian-process classifiers have comparable performance to similar deep learning methods. These results suggest that SVGPCR can meaningfully engage non-experts in pathology labeling tasks, and that the class-conditional reliabilities estimated by SVGPCR may assist in matching annotators to tasks where they perform well.
format Online
Article
Text
id pubmed-8172863
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8172863 2021-06-03 Learning from crowds in digital pathology using scalable variational Gaussian processes López-Pérez, Miguel Amgad, Mohamed Morales-Álvarez, Pablo Ruiz, Pablo Cooper, Lee A. D. Molina, Rafael Katsaggelos, Aggelos K. Sci Rep Article The volume of labeled data is often the primary determinant of success in developing machine learning algorithms. This has increased interest in methods for leveraging crowds to scale data labeling efforts, and methods to learn from noisy crowd-sourced labels. The need to scale labeling is acute but particularly challenging in medical applications like pathology, due to the expertise required to generate quality labels and the limited availability of qualified experts. In this paper we investigate the application of Scalable Variational Gaussian Processes for Crowdsourcing (SVGPCR) in digital pathology. We compare SVGPCR with other crowdsourcing methods using a large multi-rater dataset where pathologists, pathology residents, and medical students annotated tissue regions in breast cancer. Our study shows that SVGPCR is competitive with equivalent methods trained using gold-standard pathologist-generated labels, and that SVGPCR meets or exceeds the performance of other crowdsourcing methods based on deep learning. We also show how SVGPCR can effectively learn the class-conditional reliabilities of individual annotators and demonstrate that Gaussian-process classifiers have comparable performance to similar deep learning methods. These results suggest that SVGPCR can meaningfully engage non-experts in pathology labeling tasks, and that the class-conditional reliabilities estimated by SVGPCR may assist in matching annotators to tasks where they perform well.
Nature Publishing Group UK 2021-06-02 /pmc/articles/PMC8172863/ /pubmed/34078955 http://dx.doi.org/10.1038/s41598-021-90821-3 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
López-Pérez, Miguel
Amgad, Mohamed
Morales-Álvarez, Pablo
Ruiz, Pablo
Cooper, Lee A. D.
Molina, Rafael
Katsaggelos, Aggelos K.
Learning from crowds in digital pathology using scalable variational Gaussian processes
title Learning from crowds in digital pathology using scalable variational Gaussian processes
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8172863/
https://www.ncbi.nlm.nih.gov/pubmed/34078955
http://dx.doi.org/10.1038/s41598-021-90821-3