Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review
Main Authors: | Celi, Leo Anthony; Cellini, Jacqueline; Charpignon, Marie-Laure; Dee, Edward Christopher; Dernoncourt, Franck; Eber, Rene; Mitchell, William Greig; Moukheiber, Lama; Schirmer, Julian; Situ, Julia; Paguio, Joseph; Park, Joel; Wawira, Judy Gichoya; Yao, Seth
---|---
Format: | Online Article Text
Language: | English
Published: | Public Library of Science, 2022
Subjects: | Research Article
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9931338/ https://www.ncbi.nlm.nih.gov/pubmed/36812532 http://dx.doi.org/10.1371/journal.pdig.0000022
_version_ | 1784889227588141056 |
author | Celi, Leo Anthony; Cellini, Jacqueline; Charpignon, Marie-Laure; Dee, Edward Christopher; Dernoncourt, Franck; Eber, Rene; Mitchell, William Greig; Moukheiber, Lama; Schirmer, Julian; Situ, Julia; Paguio, Joseph; Park, Joel; Wawira, Judy Gichoya; Yao, Seth
author_facet | Celi, Leo Anthony; Cellini, Jacqueline; Charpignon, Marie-Laure; Dee, Edward Christopher; Dernoncourt, Franck; Eber, Rene; Mitchell, William Greig; Moukheiber, Lama; Schirmer, Julian; Situ, Julia; Paguio, Joseph; Park, Joel; Wawira, Judy Gichoya; Yao, Seth
author_sort | Celi, Leo Anthony |
collection | PubMed |
description | BACKGROUND: While artificial intelligence (AI) offers possibilities of advanced clinical prediction and decision-making in healthcare, models trained on relatively homogeneous datasets and on populations that poorly represent underlying diversity limit generalisability and risk biased AI-based decisions. Here, we describe the landscape of AI in clinical medicine to delineate population and data-source disparities. METHODS: We performed a scoping review of clinical papers that used AI techniques, published in PubMed in 2019. We assessed differences in dataset country source, clinical specialty, and author nationality, sex, and expertise. A manually tagged subsample of PubMed articles was used to train a model, leveraging transfer-learning techniques (building upon an existing BioBERT model) to predict eligibility for inclusion (original, human, clinical AI literature). For all eligible articles, database country source and clinical specialty were manually labelled. A BioBERT-based model predicted first/last author expertise. Author nationality was determined from the corresponding affiliated institution information, retrieved using Entrez Direct. First/last author sex was evaluated using the Gendarize.io API. RESULTS: Our search yielded 30,576 articles, of which 7,314 (23.9%) were eligible for further analysis. Most databases came from the US (40.8%) and China (13.7%). Radiology was the most represented clinical specialty (40.4%), followed by pathology (9.1%). Authors were primarily from either China (24.0%) or the US (18.4%). First and last authors were predominantly data experts (i.e., statisticians) (59.6% and 53.9%, respectively) rather than clinicians. The majority of first/last authors were male (74.1%). INTERPRETATION: U.S. and Chinese datasets and authors were disproportionately overrepresented in clinical AI, and almost all of the top 10 databases and author nationalities were from high-income countries (HICs). AI techniques were most commonly employed for image-rich specialties, and authors were predominantly male, with non-clinical backgrounds. Development of technological infrastructure in data-poor regions, and diligence in external validation and model recalibration prior to clinical implementation, are crucial in the short term to ensure that clinical AI is meaningful for broader populations and to avoid perpetuating global health inequity.
format | Online Article Text |
id | pubmed-9931338 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9931338 2023-02-16 Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review Celi, Leo Anthony; Cellini, Jacqueline; Charpignon, Marie-Laure; Dee, Edward Christopher; Dernoncourt, Franck; Eber, Rene; Mitchell, William Greig; Moukheiber, Lama; Schirmer, Julian; Situ, Julia; Paguio, Joseph; Park, Joel; Wawira, Judy Gichoya; Yao, Seth PLOS Digit Health Research Article BACKGROUND: While artificial intelligence (AI) offers possibilities of advanced clinical prediction and decision-making in healthcare, models trained on relatively homogeneous datasets and on populations that poorly represent underlying diversity limit generalisability and risk biased AI-based decisions. Here, we describe the landscape of AI in clinical medicine to delineate population and data-source disparities. METHODS: We performed a scoping review of clinical papers that used AI techniques, published in PubMed in 2019. We assessed differences in dataset country source, clinical specialty, and author nationality, sex, and expertise. A manually tagged subsample of PubMed articles was used to train a model, leveraging transfer-learning techniques (building upon an existing BioBERT model) to predict eligibility for inclusion (original, human, clinical AI literature). For all eligible articles, database country source and clinical specialty were manually labelled. A BioBERT-based model predicted first/last author expertise. Author nationality was determined from the corresponding affiliated institution information, retrieved using Entrez Direct. First/last author sex was evaluated using the Gendarize.io API. RESULTS: Our search yielded 30,576 articles, of which 7,314 (23.9%) were eligible for further analysis. Most databases came from the US (40.8%) and China (13.7%). Radiology was the most represented clinical specialty (40.4%), followed by pathology (9.1%). Authors were primarily from either China (24.0%) or the US (18.4%). First and last authors were predominantly data experts (i.e., statisticians) (59.6% and 53.9%, respectively) rather than clinicians. The majority of first/last authors were male (74.1%). INTERPRETATION: U.S. and Chinese datasets and authors were disproportionately overrepresented in clinical AI, and almost all of the top 10 databases and author nationalities were from high-income countries (HICs). AI techniques were most commonly employed for image-rich specialties, and authors were predominantly male, with non-clinical backgrounds. Development of technological infrastructure in data-poor regions, and diligence in external validation and model recalibration prior to clinical implementation, are crucial in the short term to ensure that clinical AI is meaningful for broader populations and to avoid perpetuating global health inequity. Public Library of Science 2022-03-31 /pmc/articles/PMC9931338/ /pubmed/36812532 http://dx.doi.org/10.1371/journal.pdig.0000022 Text en © 2022 Celi et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle | Research Article; Celi, Leo Anthony; Cellini, Jacqueline; Charpignon, Marie-Laure; Dee, Edward Christopher; Dernoncourt, Franck; Eber, Rene; Mitchell, William Greig; Moukheiber, Lama; Schirmer, Julian; Situ, Julia; Paguio, Joseph; Park, Joel; Wawira, Judy Gichoya; Yao, Seth; Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review
title | Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review |
title_full | Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review |
title_fullStr | Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review |
title_full_unstemmed | Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review |
title_short | Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review |
title_sort | sources of bias in artificial intelligence that perpetuate healthcare disparities—a global review |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9931338/ https://www.ncbi.nlm.nih.gov/pubmed/36812532 http://dx.doi.org/10.1371/journal.pdig.0000022 |
work_keys_str_mv | AT celileoanthony sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT cellinijacqueline sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT charpignonmarielaure sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT deeedwardchristopher sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT dernoncourtfranck sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT eberrene sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT mitchellwilliamgreig sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT moukheiberlama sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT schirmerjulian sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT situjulia sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT paguiojoseph sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT parkjoel sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT wawirajudygichoya sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT yaoseth sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview AT sourcesofbiasinartificialintelligencethatperpetuatehealthcaredisparitiesaglobalreview |
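The METHODS field above describes fine-tuning an existing BioBERT model on a manually tagged subsample to classify articles as eligible (original, human, clinical AI literature). Below is a minimal sketch of such a transfer-learning step, assuming the publicly available dmis-lab/biobert-base-cased-v1.1 checkpoint and the Hugging Face transformers Trainer; the record does not state which checkpoint or training framework the authors used, so both choices, and the two toy abstracts standing in for the real labelled subsample, are assumptions.

```python
# Sketch of the eligibility classifier: fine-tuning a BioBERT checkpoint
# for binary inclusion/exclusion, as in the transfer-learning step in METHODS.
# Assumptions: the dmis-lab/biobert-base-cased-v1.1 checkpoint and the
# Hugging Face transformers Trainer; the record specifies neither.
import torch
from torch.utils.data import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

CHECKPOINT = "dmis-lab/biobert-base-cased-v1.1"  # assumed BioBERT variant

class AbstractDataset(Dataset):
    """Tokenised abstracts paired with 0/1 eligibility labels."""
    def __init__(self, texts, labels, tokenizer):
        self.enc = tokenizer(texts, truncation=True, padding=True, max_length=512)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)

# Toy stand-ins for the manually tagged subsample described in METHODS.
train_ds = AbstractDataset(
    ["Deep learning model predicts sepsis onset from ICU vital signs.",
     "A narrative review of hospital administration practices."],
    [1, 0],
    tokenizer,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="eligibility-model", num_train_epochs=3),
    train_dataset=train_ds,
)
trainer.train()
```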
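The two metadata-lookup steps in METHODS (author nationality from affiliations via Entrez Direct; first/last author sex via the Gendarize.io API) can be sketched the same way. In the sketch below, the NCBI E-utilities efetch HTTP endpoint stands in for the Entrez Direct command-line tools the authors cite, and api.genderize.io stands in for the Gendarize.io interface named in the abstract; both substitutions, and the helper names first_author_affiliation and predict_sex, are illustrative assumptions rather than the authors' exact pipeline.

```python
# Sketch of the metadata-lookup steps described in METHODS.
# Assumptions: NCBI E-utilities efetch (HTTP) replaces the Entrez Direct
# CLI, and api.genderize.io replaces the Gendarize.io API named in the
# abstract. Neither is confirmed as the authors' actual tooling.
import requests
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def first_author_affiliation(pmid: str) -> str | None:
    """Fetch a PubMed record and return the first author's affiliation text,
    from which a country/nationality could then be parsed."""
    resp = requests.get(
        EUTILS,
        params={"db": "pubmed", "id": pmid, "retmode": "xml"},
        timeout=30,
    )
    resp.raise_for_status()
    root = ET.fromstring(resp.text)
    # First <Author> in document order; <Affiliation> may be absent.
    aff = root.find(".//AuthorList/Author/AffiliationInfo/Affiliation")
    return aff.text if aff is not None else None

def predict_sex(first_name: str) -> dict:
    """Predict sex from a first name, genderize.io-style (a stand-in for
    the Gendarize.io API named in the abstract)."""
    resp = requests.get(
        "https://api.genderize.io", params={"name": first_name}, timeout=30
    )
    resp.raise_for_status()
    return resp.json()  # e.g. {"name": ..., "gender": ..., "probability": ...}

if __name__ == "__main__":
    # PMID of this very article, taken from the record above.
    print(first_author_affiliation("36812532"))
    print(predict_sex("Leo"))
```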