
An objective framework for evaluating unrecognized bias in medical AI models predicting COVID-19 outcomes


Bibliographic Details
Main Authors: Estiri, Hossein, Strasser, Zachary H, Rashidian, Sina, Klann, Jeffrey G, Wagholikar, Kavishwar B, McCoy, Thomas H, Murphy, Shawn N
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9277645/
https://www.ncbi.nlm.nih.gov/pubmed/35511151
http://dx.doi.org/10.1093/jamia/ocac070
collection PubMed
description
OBJECTIVE: The increasing translation of artificial intelligence (AI)/machine learning (ML) models into clinical practice brings an increased risk of direct harm from modeling bias; however, bias remains incompletely measured in many medical AI applications. This article aims to provide a framework for objective evaluation of medical AI from multiple aspects, focusing on binary classification models.
MATERIALS AND METHODS: Using data from over 56 000 Mass General Brigham (MGB) patients with confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), we evaluate unrecognized bias in 4 AI models developed during the early months of the pandemic in Boston, Massachusetts, that predict risks of hospital admission, ICU admission, mechanical ventilation, and death after a SARS-CoV-2 infection purely based on patients' pre-infection longitudinal medical records. Models were evaluated both retrospectively and prospectively using model-level metrics of discrimination, accuracy, and reliability, and a novel individual-level metric for error.
RESULTS: We found inconsistent instances of model-level bias in the prediction models. From an individual-level aspect, however, we found almost all models performing with slightly higher error rates for older patients.
DISCUSSION: While a model can be biased against certain protected groups (ie, perform worse) in certain tasks, it can at the same time be biased towards another protected group (ie, perform better). As such, current bias evaluation studies may lack a full depiction of the variable effects of a model on its subpopulations.
CONCLUSION: Only a holistic evaluation, a diligent search for unrecognized bias, can provide enough information for an unbiased judgment of AI bias and invigorate follow-up investigations into identifying the underlying roots of bias and ultimately making a change.
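The record does not reproduce the paper's exact metrics, but the methods it describes follow a general pattern: disaggregate a binary classifier's performance by protected subgroup and compare discrimination and accuracy/calibration across groups. The following is a minimal, self-contained Python sketch of that pattern; the function names and toy data are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: per-subgroup evaluation of a binary classifier.
# Computes AUROC (discrimination) and the Brier score (accuracy/calibration)
# for each protected group; gaps between groups are candidate signals of bias.

from collections import defaultdict


def auroc(y_true, y_score):
    """AUROC via the rank-sum (Mann-Whitney U) formulation."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        return float("nan")  # undefined when a group has only one class
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier(y_true, y_score):
    """Mean squared error of predicted probabilities."""
    return sum((s - y) ** 2 for y, s in zip(y_true, y_score)) / len(y_true)


def subgroup_report(y_true, y_score, groups):
    """Return {group: {n, auroc, brier}} for each protected group."""
    by_group = defaultdict(list)
    for y, s, g in zip(y_true, y_score, groups):
        by_group[g].append((y, s))
    report = {}
    for g, pairs in by_group.items():
        ys = [y for y, _ in pairs]
        ss = [s for _, s in pairs]
        report[g] = {"n": len(pairs), "auroc": auroc(ys, ss), "brier": brier(ys, ss)}
    return report


# Toy example: the model separates outcomes well for group A, poorly for group B.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_score = [0.9, 0.1, 0.8, 0.2, 0.6, 0.5, 0.4, 0.7]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(subgroup_report(y_true, y_score, groups))
```

Note that, as the DISCUSSION above observes, a single aggregate metric can hide this kind of asymmetry: a model can be "biased against" one group on one metric or task while being "biased towards" another, which is why a disaggregated report per group and per metric is needed.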
format Online
Article
Text
id pubmed-9277645
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9277645 2022-07-18. J Am Med Inform Assoc, Research and Applications. Oxford University Press 2022-05-12. /pmc/articles/PMC9277645/ /pubmed/35511151 http://dx.doi.org/10.1093/jamia/ocac070. Text en. © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. Open Access under the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/): non-commercial re-use, distribution, and reproduction in any medium are permitted, provided the original work is properly cited; for commercial re-use, contact journals.permissions@oup.com.
title An objective framework for evaluating unrecognized bias in medical AI models predicting COVID-19 outcomes
topic Research and Applications