
Considerations in the reliability and fairness audits of predictive models for advance care planning

Multiple reporting guidelines for artificial intelligence (AI) models in healthcare recommend that models be audited for reliability and fairness. However, there is a gap in operational guidance for performing reliability and fairness audits in practice. Following guideline recommendations, we conducted a reliability audit of two models based on model performance and calibration, as well as a fairness audit based on summary statistics, subgroup performance, and subgroup calibration. We assessed the Epic End-of-Life (EOL) Index model and an internally developed Stanford Hospital Medicine (HM) Advance Care Planning (ACP) model in three practice settings: Primary Care, Inpatient Oncology, and Hospital Medicine, using clinicians' answers to the surprise question ("Would you be surprised if [patient X] passed away in [Y years]?") as a surrogate outcome. For performance, the models had positive predictive value (PPV) at or above 0.76 in all settings. In Hospital Medicine and Inpatient Oncology, the Stanford HM ACP model had higher sensitivity (0.69 and 0.89, respectively) than the EOL model (0.20 and 0.27) and better calibration (O/E 1.5 and 1.7) than the EOL model (O/E 2.5 and 3.0). The Epic EOL model flagged fewer patients (11% and 21%, respectively) than the Stanford HM ACP model (38% and 75%). There were no differences in performance or calibration by sex. Both models had lower sensitivity in Hispanic/Latino male patients with race listed as "Other." Ten clinicians were surveyed after a presentation summarizing the audit. All 10 reported that summary statistics, overall performance, and subgroup performance would affect their decision to use the model to guide care; 9 of 10 said the same for overall and subgroup calibration. The most commonly identified barriers to routinely conducting such reliability and fairness audits were poor demographic data quality and lack of data access. This audit required 115 person-hours across 8–10 months. Our recommendations for performing reliability and fairness audits include verifying data validity, analyzing model performance on intersectional subgroups, and collecting the clinician-patient linkages needed for clinicians to generate labels. Those responsible for AI models should require such audits before model deployment and mediate between model auditors and impacted stakeholders.
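
The audit metrics named in the abstract (PPV, sensitivity, the observed/expected calibration ratio, and the fraction of patients flagged, computed overall and on intersectional subgroups) can be illustrated with a minimal sketch. This is not the authors' audit code; it assumes a pandas DataFrame with hypothetical columns "flagged", "label", and "risk", plus demographic columns such as "sex", "ethnicity", and "race".

    # Minimal sketch of the reliability/fairness audit metrics described above.
    # Column names are hypothetical, not from the published audit.
    import pandas as pd

    def audit_metrics(df: pd.DataFrame) -> dict:
        """PPV, sensitivity, O/E calibration ratio, and flag rate.

        Expects columns:
          flagged -- 1 if the model flagged the patient, else 0
          label   -- 1 if the clinician would NOT be surprised by the
                     patient's death (surprise-question surrogate outcome)
          risk    -- the model's predicted probability of the outcome
        """
        tp = ((df.flagged == 1) & (df.label == 1)).sum()
        fp = ((df.flagged == 1) & (df.label == 0)).sum()
        fn = ((df.flagged == 0) & (df.label == 1)).sum()
        ppv = tp / (tp + fp) if (tp + fp) else float("nan")
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        # O/E: observed event rate over mean predicted risk; values above 1
        # (e.g., the EOL model's O/E of 2.5-3.0) indicate under-prediction.
        expected = df.risk.mean()
        oe = df.label.mean() / expected if expected else float("nan")
        return {"PPV": ppv, "sensitivity": sensitivity, "O/E": oe,
                "flagged_pct": df.flagged.mean()}

    def subgroup_audit(df: pd.DataFrame, cols: list) -> pd.DataFrame:
        """Fairness slice: recompute the metrics per intersectional subgroup."""
        return (df.groupby(cols, dropna=False)
                  .apply(lambda g: pd.Series(audit_metrics(g))))

    # Example: slice by sex, ethnicity, and race, as in the audit's finding
    # about Hispanic/Latino male patients with race listed as "Other":
    # subgroup_audit(df, ["sex", "ethnicity", "race"])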


Bibliographic Details
Main Authors: Lu, Jonathan; Sattler, Amelia; Wang, Samantha; Khaki, Ali Raza; Callahan, Alison; Fleming, Scott; Fong, Rebecca; Ehlert, Benjamin; Li, Ron C.; Shieh, Lisa; Ramchandran, Kavitha; Gensheimer, Michael F.; Chobot, Sarah; Pfohl, Stephen; Li, Siyun; Shum, Kenny; Parikh, Nitin; Desai, Priya; Seevaratnam, Briththa; Hanson, Melanie; Smith, Margaret; Xu, Yizhe; Gokhale, Arjun; Lin, Steven; Pfeffer, Michael A.; Teuteberg, Winifred; Shah, Nigam H.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A., 2022-09-12
Subjects: Digital Health
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9634737/
https://www.ncbi.nlm.nih.gov/pubmed/36339512
http://dx.doi.org/10.3389/fdgth.2022.943768

© 2022 Lu, Sattler, Wang, Khaki, Callahan, Fleming, Fong, Ehlert, Li, Shieh, Ramchandran, Gensheimer, Chobot, Pfohl, Li, Shum, Parikh, Desai, Seevaratnam, Hanson, Smith, Xu, Gokhale, Lin, Pfeffer, Teuteberg and Shah. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) (https://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.