A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons
BACKGROUND: Surgeons who receive reliable feedback on their performance quickly master the skills necessary for surgery. Such performance-based feedback can be provided by a recently developed artificial intelligence (AI) system that assesses a surgeon’s skills based on a surgical video while simultaneously highlighting aspects of the video most pertinent to the assessment…
Main Authors: | Kiyasseh, Dani, Laca, Jasper, Haque, Taseen F., Miles, Brian J., Wagner, Christian, Donoho, Daniel A., Anandkumar, Animashree, Hung, Andrew J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10063640/ https://www.ncbi.nlm.nih.gov/pubmed/36997578 http://dx.doi.org/10.1038/s43856-023-00263-3 |
_version_ | 1785017748522270720 |
---|---|
author | Kiyasseh, Dani Laca, Jasper Haque, Taseen F. Miles, Brian J. Wagner, Christian Donoho, Daniel A. Anandkumar, Animashree Hung, Andrew J. |
author_facet | Kiyasseh, Dani Laca, Jasper Haque, Taseen F. Miles, Brian J. Wagner, Christian Donoho, Daniel A. Anandkumar, Animashree Hung, Andrew J. |
author_sort | Kiyasseh, Dani |
collection | PubMed |
description | BACKGROUND: Surgeons who receive reliable feedback on their performance quickly master the skills necessary for surgery. Such performance-based feedback can be provided by a recently developed artificial intelligence (AI) system that assesses a surgeon’s skills based on a surgical video while simultaneously highlighting aspects of the video most pertinent to the assessment. However, it remains an open question whether these highlights, or explanations, are equally reliable for all surgeons. METHODS: Here, we systematically quantify the reliability of AI-based explanations on surgical videos from three hospitals across two continents by comparing them to explanations generated by human experts. To improve the reliability of AI-based explanations, we propose the strategy of training with explanations (TWIX), which uses human explanations as supervision to explicitly teach an AI system to highlight important video frames. RESULTS: We show that while AI-based explanations often align with human explanations, they are not equally reliable for different sub-cohorts of surgeons (e.g., novices vs. experts), a phenomenon we refer to as an explanation bias. We also show that TWIX enhances the reliability of AI-based explanations, mitigates the explanation bias, and improves the performance of AI systems across hospitals. These findings extend to a training environment where medical students can be provided with feedback today. CONCLUSIONS: Our study informs the impending implementation of AI-augmented surgical training and surgeon credentialing programs, and contributes to the safe and fair democratization of surgery. |
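Note: the abstract describes TWIX only at a high level (human explanations used as supervision so the model learns to highlight important video frames). The sketch below is a minimal, hypothetical illustration of that idea, not the authors' implementation; the module names, dimensions, and loss weighting are assumptions.

```python
# Hypothetical sketch of "training with explanations" (TWIX) as described in the abstract:
# a skill-assessment objective is combined with an auxiliary loss that supervises
# per-frame importance scores with human-annotated explanations.
import torch
import torch.nn as nn

class SkillAssessorWithExplanations(nn.Module):
    def __init__(self, feat_dim=512, hidden_dim=128, num_classes=2):
        super().__init__()
        # Per-frame feature encoder (stands in for a video backbone).
        self.encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        # Scores how important each frame is to the assessment (the "explanation").
        self.importance_head = nn.Linear(hidden_dim, 1)
        # Predicts the surgeon's skill level from attention-pooled features.
        self.skill_head = nn.Linear(hidden_dim, num_classes)

    def forward(self, frame_feats):                          # (B, T, feat_dim)
        h, _ = self.encoder(frame_feats)                     # (B, T, hidden_dim)
        frame_scores = self.importance_head(h).squeeze(-1)   # (B, T) importance logits
        weights = torch.softmax(frame_scores, dim=1)
        pooled = (weights.unsqueeze(-1) * h).sum(dim=1)      # attention-weighted pooling
        skill_logits = self.skill_head(pooled)               # (B, num_classes)
        return skill_logits, frame_scores

def twix_loss(skill_logits, frame_scores, skill_labels, human_frame_labels, lam=1.0):
    # Standard skill-assessment objective.
    skill_loss = nn.functional.cross_entropy(skill_logits, skill_labels)
    # Explanation supervision: align predicted frame importance with
    # human-annotated frame labels (1 = frame cited by the human expert).
    expl_loss = nn.functional.binary_cross_entropy_with_logits(
        frame_scores, human_frame_labels.float())
    return skill_loss + lam * expl_loss
```

In this sketch, the explanation loss is what distinguishes TWIX-style training from ordinary attention-based skill assessment: the frame-importance scores are pushed toward the frames humans actually cite, rather than being learned implicitly.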
format | Online Article Text |
id | pubmed-10063640 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10063640 2023-04-01 A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons Kiyasseh, Dani Laca, Jasper Haque, Taseen F. Miles, Brian J. Wagner, Christian Donoho, Daniel A. Anandkumar, Animashree Hung, Andrew J. Commun Med (Lond) Article BACKGROUND: Surgeons who receive reliable feedback on their performance quickly master the skills necessary for surgery. Such performance-based feedback can be provided by a recently developed artificial intelligence (AI) system that assesses a surgeon’s skills based on a surgical video while simultaneously highlighting aspects of the video most pertinent to the assessment. However, it remains an open question whether these highlights, or explanations, are equally reliable for all surgeons. METHODS: Here, we systematically quantify the reliability of AI-based explanations on surgical videos from three hospitals across two continents by comparing them to explanations generated by human experts. To improve the reliability of AI-based explanations, we propose the strategy of training with explanations (TWIX), which uses human explanations as supervision to explicitly teach an AI system to highlight important video frames. RESULTS: We show that while AI-based explanations often align with human explanations, they are not equally reliable for different sub-cohorts of surgeons (e.g., novices vs. experts), a phenomenon we refer to as an explanation bias. We also show that TWIX enhances the reliability of AI-based explanations, mitigates the explanation bias, and improves the performance of AI systems across hospitals. These findings extend to a training environment where medical students can be provided with feedback today. CONCLUSIONS: Our study informs the impending implementation of AI-augmented surgical training and surgeon credentialing programs, and contributes to the safe and fair democratization of surgery. Nature Publishing Group UK 2023-03-30 /pmc/articles/PMC10063640/ /pubmed/36997578 http://dx.doi.org/10.1038/s43856-023-00263-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . |
spellingShingle | Article Kiyasseh, Dani Laca, Jasper Haque, Taseen F. Miles, Brian J. Wagner, Christian Donoho, Daniel A. Anandkumar, Animashree Hung, Andrew J. A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons |
title | A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons |
title_full | A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons |
title_fullStr | A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons |
title_full_unstemmed | A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons |
title_short | A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons |
title_sort | multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10063640/ https://www.ncbi.nlm.nih.gov/pubmed/36997578 http://dx.doi.org/10.1038/s43856-023-00263-3 |
work_keys_str_mv | AT kiyassehdani amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT lacajasper amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT haquetaseenf amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT milesbrianj amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT wagnerchristian amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT donohodaniela amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT anandkumaranimashree amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT hungandrewj amultiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT kiyassehdani multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT lacajasper multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT haquetaseenf multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT milesbrianj multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT wagnerchristian multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT donohodaniela multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT anandkumaranimashree multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons AT hungandrewj multiinstitutionalstudyusingartificialintelligencetoprovidereliableandfairfeedbacktosurgeons |