A Pathologist-Annotated Dataset for Validating Artificial Intelligence: A Project Description and Pilot Study

Bibliographic Details
Main authors: Dudgeon, Sarah N., Wen, Si, Hanna, Matthew G., Gupta, Rajarsi, Amgad, Mohamed, Sheth, Manasi, Marble, Hetal, Huang, Richard, Herrmann, Markus D., Szu, Clifford H., Tong, Darick, Werness, Bruce, Szu, Evan, Larsimont, Denis, Madabhushi, Anant, Hytopoulos, Evangelos, Chen, Weijie, Singh, Rajendra, Hart, Steven N., Sharma, Ashish, Saltz, Joel, Salgado, Roberto, Gallas, Brandon D.
Format: Online Article Text
Language: English
Published: Wolters Kluwer - Medknow 2021
Subjects: Technical Note
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8609287/
https://www.ncbi.nlm.nih.gov/pubmed/34881099
http://dx.doi.org/10.4103/jpi.jpi_83_20
author Dudgeon, Sarah N.
Wen, Si
Hanna, Matthew G.
Gupta, Rajarsi
Amgad, Mohamed
Sheth, Manasi
Marble, Hetal
Huang, Richard
Herrmann, Markus D.
Szu, Clifford H.
Tong, Darick
Werness, Bruce
Szu, Evan
Larsimont, Denis
Madabhushi, Anant
Hytopoulos, Evangelos
Chen, Weijie
Singh, Rajendra
Hart, Steven N.
Sharma, Ashish
Saltz, Joel
Salgado, Roberto
Gallas, Brandon D.
author_sort Dudgeon, Sarah N.
collection PubMed
description PURPOSE: Validating artificial intelligence algorithms for clinical use in medical images is a challenging endeavor due to a lack of standard reference data (ground truth). This topic typically occupies a small portion of the discussion in research papers since most of the efforts are focused on developing novel algorithms. In this work, we present a collaboration to create a validation dataset of pathologist annotations for algorithms that process whole slide images. We focus on data collection and evaluation of algorithm performance in the context of estimating the density of stromal tumor-infiltrating lymphocytes (sTILs) in breast cancer.
METHODS: We digitized 64 glass slides of hematoxylin- and eosin-stained invasive ductal carcinoma core biopsies prepared at a single clinical site. A collaborating pathologist selected 10 regions of interest (ROIs) per slide for evaluation. We created training materials and workflows to crowdsource pathologist image annotations on two modes: an optical microscope and two digital platforms. The microscope platform allows the same ROIs to be evaluated in both modes. The workflows collect the ROI type, a decision on whether the ROI is appropriate for estimating the density of sTILs, and if appropriate, the sTIL density value for that ROI.
RESULTS: In total, 19 pathologists made 1645 ROI evaluations during a data collection event and the following 2 weeks. The pilot study yielded an abundant number of cases with nominal sTIL infiltration. Furthermore, we found that the sTIL densities are correlated within a case, and there is notable pathologist variability. Consequently, we outline plans to improve our ROI and case sampling methods. We also outline statistical methods to account for ROI correlations within a case and pathologist variability when validating an algorithm.
CONCLUSION: We have built workflows for efficient data collection and tested them in a pilot study. As we prepare for pivotal studies, we will investigate methods to use the dataset as an external validation tool for algorithms. We will also consider what it will take for the dataset to be fit for a regulatory purpose: study size, patient population, and pathologist training and qualifications. To this end, we will elicit feedback from the Food and Drug Administration via the Medical Device Development Tool program and from the broader digital pathology and AI community. Ultimately, we intend to share the dataset, statistical methods, and lessons learned.
format Online
Article
Text
id pubmed-8609287
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Wolters Kluwer - Medknow
record_format MEDLINE/PubMed
spelling pubmed-8609287 2021-12-07 A Pathologist-Annotated Dataset for Validating Artificial Intelligence: A Project Description and Pilot Study J Pathol Inform Technical Note Wolters Kluwer - Medknow 2021-11-15 /pmc/articles/PMC8609287/ /pubmed/34881099 http://dx.doi.org/10.4103/jpi.jpi_83_20 Text en Copyright: © 2021 Journal of Pathology Informatics https://creativecommons.org/licenses/by-nc-sa/4.0/ This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.
title A Pathologist-Annotated Dataset for Validating Artificial Intelligence: A Project Description and Pilot Study
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8609287/
https://www.ncbi.nlm.nih.gov/pubmed/34881099
http://dx.doi.org/10.4103/jpi.jpi_83_20