Pilot study to evaluate tools to collect pathologist annotations for validating machine learning algorithms

Bibliographic Details
Main authors: Elfer, Katherine; Dudgeon, Sarah; Garcia, Victor; Blenman, Kim; Hytopoulos, Evangelos; Wen, Si; Li, Xiaoxian; Ly, Amy; Werness, Bruce; Sheth, Manasi S.; Amgad, Mohamed; Gupta, Rajarsi; Saltz, Joel; Hanna, Matthew G.; Ehinger, Anna; Peeters, Dieter; Salgado, Roberto; Gallas, Brandon D.
Format: Online Article Text
Language: English
Published: Society of Photo-Optical Instrumentation Engineers 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9326105/
https://www.ncbi.nlm.nih.gov/pubmed/35911208
http://dx.doi.org/10.1117/1.JMI.9.4.047501
author Elfer, Katherine
Dudgeon, Sarah
Garcia, Victor
Blenman, Kim
Hytopoulos, Evangelos
Wen, Si
Li, Xiaoxian
Ly, Amy
Werness, Bruce
Sheth, Manasi S.
Amgad, Mohamed
Gupta, Rajarsi
Saltz, Joel
Hanna, Matthew G.
Ehinger, Anna
Peeters, Dieter
Salgado, Roberto
Gallas, Brandon D.
author_sort Elfer, Katherine
collection PubMed
description PURPOSE: Validation of artificial intelligence (AI) algorithms in digital pathology with a reference standard is necessary before widespread clinical use, but few examples focus on creating a reference standard based on pathologist annotations. This work assesses the results of a pilot study that collects density estimates of stromal tumor-infiltrating lymphocytes (sTILs) in breast cancer biopsy specimens. This work will inform the creation of a validation dataset for the evaluation of AI algorithms fit for a regulatory purpose. APPROACH: Collaborators and crowdsourced pathologists contributed glass slides, digital images, and annotations. Here, “annotations” refer to any marks, segmentations, measurements, or labels a pathologist adds to a report, image, region of interest (ROI), or biological feature. Pathologists estimated sTILs density in 640 ROIs from hematoxylin and eosin stained slides of 64 patients via two modalities: an optical light microscope and two digital image viewing platforms. RESULTS: The pilot study generated 7373 sTILs density estimates from 29 pathologists. Analysis of annotations found the variability of density estimates per ROI increases with the mean; the root mean square differences were 4.46, 14.25, and 26.25 as the mean density ranged from 0% to 10%, 11% to 40%, and 41% to 100%, respectively. The pilot study informs three areas of improvement for future work: technical workflows, annotation platforms, and agreement analysis methods. Upgrades to the workflows and platforms will improve operability and increase annotation speed and consistency. CONCLUSIONS: Exploratory data analysis demonstrates the need to develop new statistical approaches for agreement. The pilot study dataset and analysis methods are publicly available to allow community feedback. The development and results of the validation dataset will be publicly available to serve as an instructive tool that can be replicated by developers and researchers.
format Online
Article
Text
id pubmed-9326105
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Society of Photo-Optical Instrumentation Engineers
record_format MEDLINE/PubMed
spelling pubmed-9326105 2023-07-27 J Med Imaging (Bellingham) Digital Pathology Society of Photo-Optical Instrumentation Engineers 2022-07-27 2022-07 /pmc/articles/PMC9326105/ /pubmed/35911208 http://dx.doi.org/10.1117/1.JMI.9.4.047501 Text en © 2022 The Authors https://creativecommons.org/licenses/by/4.0/ Published by SPIE under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
title Pilot study to evaluate tools to collect pathologist annotations for validating machine learning algorithms
topic Digital Pathology