
Avoiding bias when inferring race using name-based approaches

Racial disparity in academia is a widely acknowledged problem. The quantitative understanding of racial-based systemic inequalities is an important step towards a more equitable research system. However, because of the lack of robust information on authors’ race, few large-scale analyses have been p...

Bibliographic Details
Main Authors: Kozlowski, Diego; Murray, Dakota S.; Bell, Alexis; Hulsey, Will; Larivière, Vincent; Monroe-White, Thema; Sugimoto, Cassidy R.
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8887775/
https://www.ncbi.nlm.nih.gov/pubmed/35231059
http://dx.doi.org/10.1371/journal.pone.0264270
_version_ 1784660980339310592
author Kozlowski, Diego
Murray, Dakota S.
Bell, Alexis
Hulsey, Will
Larivière, Vincent
Monroe-White, Thema
Sugimoto, Cassidy R.
author_sort Kozlowski, Diego
collection PubMed
description Racial disparity in academia is a widely acknowledged problem. The quantitative understanding of racial-based systemic inequalities is an important step towards a more equitable research system. However, because of the lack of robust information on authors’ race, few large-scale analyses have been performed on this topic. Algorithmic approaches offer one solution, using known information about authors, such as their names, to infer their perceived race. As with any other algorithm, the process of racial inference can generate biases if it is not carefully considered. The goal of this article is to assess the extent to which algorithmic bias is introduced using different approaches for name-based racial inference. We use information from the U.S. Census and mortgage applications to infer the race of U.S. affiliated authors in the Web of Science. We estimate the effects of using given and family names, thresholds or continuous distributions, and imputation. Our results demonstrate that the validity of name-based inference varies by race/ethnicity and that threshold approaches underestimate Black authors and overestimate White authors. We conclude with recommendations to avoid potential biases. This article lays the foundation for more systematic and less-biased investigations into racial disparities in science.
format Online
Article
Text
id pubmed-8887775
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8887775 2022-03-02 Avoiding bias when inferring race using name-based approaches Kozlowski, Diego Murray, Dakota S. Bell, Alexis Hulsey, Will Larivière, Vincent Monroe-White, Thema Sugimoto, Cassidy R. PLoS One Research Article Racial disparity in academia is a widely acknowledged problem. The quantitative understanding of racial-based systemic inequalities is an important step towards a more equitable research system. However, because of the lack of robust information on authors’ race, few large-scale analyses have been performed on this topic. Algorithmic approaches offer one solution, using known information about authors, such as their names, to infer their perceived race. As with any other algorithm, the process of racial inference can generate biases if it is not carefully considered. The goal of this article is to assess the extent to which algorithmic bias is introduced using different approaches for name-based racial inference. We use information from the U.S. Census and mortgage applications to infer the race of U.S. affiliated authors in the Web of Science. We estimate the effects of using given and family names, thresholds or continuous distributions, and imputation. Our results demonstrate that the validity of name-based inference varies by race/ethnicity and that threshold approaches underestimate Black authors and overestimate White authors. We conclude with recommendations to avoid potential biases. This article lays the foundation for more systematic and less-biased investigations into racial disparities in science.
Public Library of Science 2022-03-01 /pmc/articles/PMC8887775/ /pubmed/35231059 http://dx.doi.org/10.1371/journal.pone.0264270 Text en © 2022 Kozlowski et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Kozlowski, Diego
Murray, Dakota S.
Bell, Alexis
Hulsey, Will
Larivière, Vincent
Monroe-White, Thema
Sugimoto, Cassidy R.
Avoiding bias when inferring race using name-based approaches
title Avoiding bias when inferring race using name-based approaches
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8887775/
https://www.ncbi.nlm.nih.gov/pubmed/35231059
http://dx.doi.org/10.1371/journal.pone.0264270