
A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons


Bibliographic Details
Main Authors: Kiyasseh, Dani, Laca, Jasper, Haque, Taseen F., Miles, Brian J., Wagner, Christian, Donoho, Daniel A., Anandkumar, Animashree, Hung, Andrew J.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10063640/
https://www.ncbi.nlm.nih.gov/pubmed/36997578
http://dx.doi.org/10.1038/s43856-023-00263-3
_version_ 1785017748522270720
author Kiyasseh, Dani
Laca, Jasper
Haque, Taseen F.
Miles, Brian J.
Wagner, Christian
Donoho, Daniel A.
Anandkumar, Animashree
Hung, Andrew J.
author_facet Kiyasseh, Dani
Laca, Jasper
Haque, Taseen F.
Miles, Brian J.
Wagner, Christian
Donoho, Daniel A.
Anandkumar, Animashree
Hung, Andrew J.
author_sort Kiyasseh, Dani
collection PubMed
description BACKGROUND: Surgeons who receive reliable feedback on their performance quickly master the skills necessary for surgery. Such performance-based feedback can be provided by a recently developed artificial intelligence (AI) system that assesses a surgeon’s skills based on a surgical video while simultaneously highlighting aspects of the video most pertinent to the assessment. However, it remains an open question whether these highlights, or explanations, are equally reliable for all surgeons. METHODS: Here, we systematically quantify the reliability of AI-based explanations on surgical videos from three hospitals across two continents by comparing them to explanations generated by human experts. To improve the reliability of AI-based explanations, we propose the strategy of training with explanations (TWIX), which uses human explanations as supervision to explicitly teach an AI system to highlight important video frames. RESULTS: We show that while AI-based explanations often align with human explanations, they are not equally reliable for different sub-cohorts of surgeons (e.g., novices vs. experts), a phenomenon we refer to as an explanation bias. We also show that TWIX enhances the reliability of AI-based explanations, mitigates the explanation bias, and improves the performance of AI systems across hospitals. These findings extend to a training environment where medical students can be provided with feedback today. CONCLUSIONS: Our study informs the impending implementation of AI-augmented surgical training and surgeon credentialing programs, and contributes to the safe and fair democratization of surgery.
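The TWIX strategy in the abstract uses human explanations as supervision to teach the model which video frames matter. One common way to realize such explanation supervision is an auxiliary loss on per-frame importance scores added to the main task loss; the sketch below illustrates that idea only. It is a minimal assumption-laden illustration, not the authors' implementation: the function names, the binary highlight labels, and the specific loss form are all assumptions.

```python
import math

def sigmoid(x):
    """Logistic function mapping a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def bce(p, y, eps=1e-9):
    """Binary cross-entropy between predicted probability p and label y."""
    return -(y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps))

def twix_loss(skill_logit, skill_label, frame_logits, human_highlights, weight=1.0):
    """Combine the skill-assessment loss with an explanation loss that pushes
    the model's per-frame importance scores toward human-annotated highlights."""
    task_loss = bce(sigmoid(skill_logit), skill_label)
    expl_loss = sum(
        bce(sigmoid(logit), label)
        for logit, label in zip(frame_logits, human_highlights)
    ) / len(frame_logits)
    return task_loss + weight * expl_loss
```

Under this sketch, per-frame predictions that agree with the human highlights yield a lower total loss than predictions that disagree, which is the supervision signal the abstract describes.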
format Online
Article
Text
id pubmed-10063640
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-100636402023-04-01 A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons Kiyasseh, Dani Laca, Jasper Haque, Taseen F. Miles, Brian J. Wagner, Christian Donoho, Daniel A. Anandkumar, Animashree Hung, Andrew J. Commun Med (Lond) Article BACKGROUND: Surgeons who receive reliable feedback on their performance quickly master the skills necessary for surgery. Such performance-based feedback can be provided by a recently developed artificial intelligence (AI) system that assesses a surgeon’s skills based on a surgical video while simultaneously highlighting aspects of the video most pertinent to the assessment. However, it remains an open question whether these highlights, or explanations, are equally reliable for all surgeons. METHODS: Here, we systematically quantify the reliability of AI-based explanations on surgical videos from three hospitals across two continents by comparing them to explanations generated by human experts. To improve the reliability of AI-based explanations, we propose the strategy of training with explanations (TWIX), which uses human explanations as supervision to explicitly teach an AI system to highlight important video frames. RESULTS: We show that while AI-based explanations often align with human explanations, they are not equally reliable for different sub-cohorts of surgeons (e.g., novices vs. experts), a phenomenon we refer to as an explanation bias. We also show that TWIX enhances the reliability of AI-based explanations, mitigates the explanation bias, and improves the performance of AI systems across hospitals. These findings extend to a training environment where medical students can be provided with feedback today. CONCLUSIONS: Our study informs the impending implementation of AI-augmented surgical training and surgeon credentialing programs, and contributes to the safe and fair democratization of surgery.
Nature Publishing Group UK 2023-03-30 /pmc/articles/PMC10063640/ /pubmed/36997578 http://dx.doi.org/10.1038/s43856-023-00263-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Kiyasseh, Dani
Laca, Jasper
Haque, Taseen F.
Miles, Brian J.
Wagner, Christian
Donoho, Daniel A.
Anandkumar, Animashree
Hung, Andrew J.
A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons
title A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons
title_full A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons
title_fullStr A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons
title_full_unstemmed A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons
title_short A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons
title_sort multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10063640/
https://www.ncbi.nlm.nih.gov/pubmed/36997578
http://dx.doi.org/10.1038/s43856-023-00263-3
work_keys_str_mv AT kiyassehdani amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT lacajasper amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT haquetaseenf amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT milesbrianj amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT wagnerchristian amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT donohodaniela amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT anandkumaranimashree amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT hungandrewj amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT kiyassehdani multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT lacajasper multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT haquetaseenf multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT milesbrianj multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT wagnerchristian multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT donohodaniela multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT anandkumaranimashree multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons
AT hungandrewj multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons