
Stress Testing Pathology Models with Generated Artifacts


Bibliographic Details
Main Authors: Wang, Nicholas Chandler; Kaplan, Jeremy; Lee, Joonsang; Hodgin, Jeffrey; Udager, Aaron; Rao, Arvind
Format: Online Article Text
Language: English
Published: Wolters Kluwer - Medknow 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8721870/
https://www.ncbi.nlm.nih.gov/pubmed/35070483
http://dx.doi.org/10.4103/jpi.jpi_6_21
Description
Summary: BACKGROUND: Machine learning models provide significant opportunities for improvement in health care, but their “black-box” nature poses many risks. METHODS: We built a custom Python module as part of a framework for generating artifacts that are meant to be tunable and describable, to support future testing needs. We analyzed a previously published digital pathology classification model and an internally developed kidney tissue segmentation model, applying a variety of generated artifacts and testing their effects. The artifacts simulated were bubbles, tissue folds, uneven illumination, marker lines, uneven sectioning, altered staining, and tissue tears. RESULTS: We found that there is some performance degradation on the tiles with artifacts, particularly with altered stains but also with marker lines, tissue folds, and uneven sectioning. We also found that the response of deep learning models to artifacts could be nonlinear. CONCLUSIONS: Generated artifacts can provide a useful tool for testing and building trust in machine learning models by understanding where these models might fail.
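
The idea of a tunable, describable artifact can be illustrated with a minimal sketch. This is not the authors' module: the function names `add_uneven_illumination` and `sweep_strength`, the linear-gradient model of uneven illumination, and the `strength` and `angle_deg` parameters are all assumptions made for illustration, using only NumPy on an RGB tile stored as a `uint8` array.

```python
import numpy as np


def add_uneven_illumination(tile, strength=0.5, angle_deg=0.0):
    """Darken an RGB tile along a linear gradient (hypothetical artifact model).

    strength: 0.0 leaves the tile unchanged; 1.0 fades the far edge to black.
    angle_deg: direction of the gradient, 0 = left to right.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    h, w = tile.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    theta = np.deg2rad(angle_deg)
    # Project each pixel position onto the gradient direction.
    proj = xs * np.cos(theta) + ys * np.sin(theta)
    span = proj.max() - proj.min()
    ramp = (proj - proj.min()) / span if span > 0 else np.zeros_like(proj)
    # Gain is 1.0 on the bright side and (1 - strength) on the dark side.
    gain = 1.0 - strength * ramp
    out = tile.astype(np.float64) * gain[..., None]
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


def sweep_strength(tile, score_fn, strengths):
    """Apply the artifact at each strength and record score_fn's output.

    score_fn stands in for a model under test; any callable that maps a
    tile to a number works here.
    """
    return [(s, score_fn(add_uneven_illumination(tile, strength=s)))
            for s in strengths]


if __name__ == "__main__":
    # Synthetic flat tile; a real test would load a pathology tile instead.
    tile = np.full((4, 5, 3), 200, dtype=np.uint8)
    # Mean intensity is only a placeholder for a model's performance metric.
    for s, score in sweep_strength(tile, lambda t: float(t.mean()),
                                   [0.0, 0.25, 0.5]):
        print(s, score)
```

Sweeping a single severity parameter in this way is one way to look for the nonlinear responses the summary mentions: replace the placeholder `score_fn` with a model's metric and check whether the score changes in proportion to `strength`.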