NuCLS: A scalable crowdsourcing approach and dataset for nucleus classification and segmentation in breast cancer

BACKGROUND: Deep learning enables accurate high-resolution mapping of cells and tissue structures that can serve as the foundation of interpretable machine-learning models for computational pathology. However, generating adequate labels for these structures is a critical barrier, given the time and...

Bibliographic Details
Main authors: Amgad, Mohamed, Atteya, Lamees A, Hussein, Hagar, Mohammed, Kareem Hosny, Hafiz, Ehab, Elsebaie, Maha A T, Alhusseiny, Ahmed M, AlMoslemany, Mohamed Atef, Elmatboly, Abdelmagid M, Pappalardo, Philip A, Sakr, Rokia Adel, Mobadersany, Pooya, Rachid, Ahmad, Saad, Anas M, Alkashash, Ahmad M, Ruhban, Inas A, Alrefai, Anas, Elgazar, Nada M, Abdulkarim, Ali, Farag, Abo-Alela, Etman, Amira, Elsaeed, Ahmed G, Alagha, Yahya, Amer, Yomna A, Raslan, Ahmed M, Nadim, Menatalla K, Elsebaie, Mai A T, Ayad, Ahmed, Hanna, Liza E, Gadallah, Ahmed, Elkady, Mohamed, Drumheller, Bradley, Jaye, David, Manthey, David, Gutman, David A, Elfandy, Habiba, Cooper, Lee A D
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9112766/
https://www.ncbi.nlm.nih.gov/pubmed/35579553
http://dx.doi.org/10.1093/gigascience/giac037
author Amgad, Mohamed
Atteya, Lamees A
Hussein, Hagar
Mohammed, Kareem Hosny
Hafiz, Ehab
Elsebaie, Maha A T
Alhusseiny, Ahmed M
AlMoslemany, Mohamed Atef
Elmatboly, Abdelmagid M
Pappalardo, Philip A
Sakr, Rokia Adel
Mobadersany, Pooya
Rachid, Ahmad
Saad, Anas M
Alkashash, Ahmad M
Ruhban, Inas A
Alrefai, Anas
Elgazar, Nada M
Abdulkarim, Ali
Farag, Abo-Alela
Etman, Amira
Elsaeed, Ahmed G
Alagha, Yahya
Amer, Yomna A
Raslan, Ahmed M
Nadim, Menatalla K
Elsebaie, Mai A T
Ayad, Ahmed
Hanna, Liza E
Gadallah, Ahmed
Elkady, Mohamed
Drumheller, Bradley
Jaye, David
Manthey, David
Gutman, David A
Elfandy, Habiba
Cooper, Lee A D
author_sort Amgad, Mohamed
collection PubMed
description BACKGROUND: Deep learning enables accurate high-resolution mapping of cells and tissue structures that can serve as the foundation of interpretable machine-learning models for computational pathology. However, generating adequate labels for these structures is a critical barrier, given the time and effort required from pathologists. RESULTS: This article describes a novel collaborative framework for engaging crowds of medical students and pathologists to produce quality labels for cell nuclei. We used this approach to produce the NuCLS dataset, containing >220,000 annotations of cell nuclei in breast cancers. This builds on prior work labeling tissue regions to produce an integrated tissue region- and cell-level annotation dataset for training that is the largest such resource for multi-scale analysis of breast cancer histology. This article presents data and analysis results for single and multi-rater annotations from both non-experts and pathologists. We present a novel workflow that uses algorithmic suggestions to collect accurate segmentation data without the need for laborious manual tracing of nuclei. Our results indicate that even noisy algorithmic suggestions do not adversely affect pathologist accuracy and can help non-experts improve annotation quality. We also present a new approach for inferring truth from multiple raters and show that non-experts can produce accurate annotations for visually distinctive classes. CONCLUSIONS: This study is the most extensive systematic exploration of the large-scale use of wisdom-of-the-crowd approaches to generate data for computational pathology applications.
format Online
Article
Text
id pubmed-9112766
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9112766 2022-05-18 Gigascience Research Oxford University Press 2022-05-17 /pmc/articles/PMC9112766/ /pubmed/35579553 http://dx.doi.org/10.1093/gigascience/giac037 Text en © The Author(s) 2022. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title NuCLS: A scalable crowdsourcing approach and dataset for nucleus classification and segmentation in breast cancer
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9112766/
https://www.ncbi.nlm.nih.gov/pubmed/35579553
http://dx.doi.org/10.1093/gigascience/giac037