Assessing Racial and Ethnic Bias in Text Generation for Healthcare-Related Tasks by ChatGPT(1)
| Main Authors: | Hanna, John J.; Wakene, Abdi D.; Lehmann, Christoph U.; Medford, Richard J. |
|---|---|
| Format: | Online Article Text |
| Language: | English |
| Published: | Cold Spring Harbor Laboratory, 2023 |
| Subjects: | Article |
| Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10491360/ https://www.ncbi.nlm.nih.gov/pubmed/37693388 http://dx.doi.org/10.1101/2023.08.28.23294730 |
_version_ | 1785104044644106240 |
author | Hanna, John J.; Wakene, Abdi D.; Lehmann, Christoph U.; Medford, Richard J. |
author_facet | Hanna, John J.; Wakene, Abdi D.; Lehmann, Christoph U.; Medford, Richard J. |
author_sort | Hanna, John J. |
collection | PubMed |
description | Large Language Models (LLM) are AI tools that can respond human-like to voice or free-text commands without training on specific tasks. However, concerns have been raised about their potential racial bias in healthcare tasks. In this study, ChatGPT was used to generate healthcare-related text for patients with HIV, analyzing data from 100 deidentified electronic health record encounters. Each patient’s data were fed four times with all information remaining the same except for race/ethnicity (African American, Asian, Hispanic White, Non-Hispanic White). The text output was analyzed for sentiment, subjectivity, reading ease, and most used words by race/ethnicity and insurance type. Results showed that instructions for African American, Asian, Hispanic White, and Non-Hispanic White patients had an average polarity of 0.14, 0.14, 0.15, and 0.14, respectively, with an average subjectivity of 0.46 for all races/ethnicities. The differences in polarity and subjectivity across races/ethnicities were not statistically significant. However, there was a statistically significant difference in word frequency across races/ethnicities and a statistically significant difference in subjectivity across insurance types with commercial insurance eliciting the most subjective responses and Medicare and other payer types the lowest. The study suggests that ChatGPT is relatively invariant to race/ethnicity and insurance type in terms of linguistic and readability measures. Further studies are needed to validate these results and assess their implications. |
format | Online Article Text |
id | pubmed-10491360 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-104913602023-09-09 Assessing Racial and Ethnic Bias in Text Generation for Healthcare-Related Tasks by ChatGPT(1) Hanna, John J. Wakene, Abdi D. Lehmann, Christoph U. Medford, Richard J. medRxiv Article Large Language Models (LLM) are AI tools that can respond human-like to voice or free-text commands without training on specific tasks. However, concerns have been raised about their potential racial bias in healthcare tasks. In this study, ChatGPT was used to generate healthcare-related text for patients with HIV, analyzing data from 100 deidentified electronic health record encounters. Each patient’s data were fed four times with all information remaining the same except for race/ethnicity (African American, Asian, Hispanic White, Non-Hispanic White). The text output was analyzed for sentiment, subjectivity, reading ease, and most used words by race/ethnicity and insurance type. Results showed that instructions for African American, Asian, Hispanic White, and Non-Hispanic White patients had an average polarity of 0.14, 0.14, 0.15, and 0.14, respectively, with an average subjectivity of 0.46 for all races/ethnicities. The differences in polarity and subjectivity across races/ethnicities were not statistically significant. However, there was a statistically significant difference in word frequency across races/ethnicities and a statistically significant difference in subjectivity across insurance types with commercial insurance eliciting the most subjective responses and Medicare and other payer types the lowest. The study suggests that ChatGPT is relatively invariant to race/ethnicity and insurance type in terms of linguistic and readability measures. Further studies are needed to validate these results and assess their implications. Cold Spring Harbor Laboratory 2023-08-28 /pmc/articles/PMC10491360/ /pubmed/37693388 http://dx.doi.org/10.1101/2023.08.28.23294730 Text en https://creativecommons.org/licenses/by-nd/4.0/This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nd/4.0/) , which allows reusers to copy and distribute the material in any medium or format in unadapted form only, and only so long as attribution is given to the creator. The license allows for commercial use. |
spellingShingle | Article Hanna, John J. Wakene, Abdi D. Lehmann, Christoph U. Medford, Richard J. Assessing Racial and Ethnic Bias in Text Generation for Healthcare-Related Tasks by ChatGPT(1) |
title | Assessing Racial and Ethnic Bias in Text Generation for Healthcare-Related Tasks by ChatGPT(1) |
title_full | Assessing Racial and Ethnic Bias in Text Generation for Healthcare-Related Tasks by ChatGPT(1) |
title_fullStr | Assessing Racial and Ethnic Bias in Text Generation for Healthcare-Related Tasks by ChatGPT(1) |
title_full_unstemmed | Assessing Racial and Ethnic Bias in Text Generation for Healthcare-Related Tasks by ChatGPT(1) |
title_short | Assessing Racial and Ethnic Bias in Text Generation for Healthcare-Related Tasks by ChatGPT(1) |
title_sort | assessing racial and ethnic bias in text generation for healthcare-related tasks by chatgpt(1) |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10491360/ https://www.ncbi.nlm.nih.gov/pubmed/37693388 http://dx.doi.org/10.1101/2023.08.28.23294730 |
work_keys_str_mv | AT hannajohnj assessingracialandethnicbiasintextgenerationforhealthcarerelatedtasksbychatgpt1 AT wakeneabdid assessingracialandethnicbiasintextgenerationforhealthcarerelatedtasksbychatgpt1 AT lehmannchristophu assessingracialandethnicbiasintextgenerationforhealthcarerelatedtasksbychatgpt1 AT medfordrichardj assessingracialandethnicbiasintextgenerationforhealthcarerelatedtasksbychatgpt1 |
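The description field above outlines the analysis pipeline: each set of ChatGPT-generated patient instructions was scored for sentiment polarity, subjectivity, and reading ease, then compared across race/ethnicity and insurance groups. Below is a minimal sketch of that kind of scoring, assuming TextBlob for polarity/subjectivity and textstat for Flesch reading ease; the preprint does not name its tooling, and the sample texts and group labels are purely illustrative.

```python
# Illustrative sketch only; the preprint does not disclose its exact tooling.
# Assumes TextBlob for polarity/subjectivity and textstat for reading ease:
#   pip install textblob textstat
from collections import defaultdict
from statistics import mean

from textblob import TextBlob
import textstat

# Hypothetical generated patient instructions, keyed by race/ethnicity label
# (the study used 100 real, deidentified encounters; these strings are made up).
outputs = {
    "African American": ["Take your antiretroviral medication daily and keep your follow-up visit."],
    "Asian": ["Take your antiretroviral medication daily and keep your follow-up visit."],
    "Hispanic White": ["Continue your current regimen and schedule labs in three months."],
    "Non-Hispanic White": ["Continue your current regimen and schedule labs in three months."],
}

scores = defaultdict(list)
for group, texts in outputs.items():
    for text in texts:
        blob = TextBlob(text)
        scores[group].append({
            "polarity": blob.sentiment.polarity,          # -1 (negative) .. +1 (positive)
            "subjectivity": blob.sentiment.subjectivity,  # 0 (objective) .. 1 (subjective)
            "reading_ease": textstat.flesch_reading_ease(text),  # higher = easier to read
        })

for group, rows in scores.items():
    print(
        f"{group}: "
        f"mean polarity={mean(r['polarity'] for r in rows):.2f}, "
        f"mean subjectivity={mean(r['subjectivity'] for r in rows):.2f}, "
        f"mean reading ease={mean(r['reading_ease'] for r in rows):.1f}"
    )
```

Group-level comparisons like those reported in the abstract (word frequency across races/ethnicities, subjectivity across insurance types) would then be evaluated with a standard significance test such as chi-square or one-way ANOVA; the abstract does not state which tests were used, so none is shown here.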