
Gender Bias in the News: A Scalable Topic Modelling and Visualization Framework

We present a topic modelling and data visualization methodology to examine gender-based disparities in news articles by topic. Existing research in topic modelling is largely focused on the text mining of closed corpora, i.e., those that include a fixed collection of composite texts. We showcase a m...


Bibliographic Details
Main Authors: Rao, Prashanth; Taboada, Maite
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Artificial Intelligence
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8242240/
https://www.ncbi.nlm.nih.gov/pubmed/34222857
http://dx.doi.org/10.3389/frai.2021.664737
_version_ 1783715591069106176
author Rao, Prashanth
Taboada, Maite
author_facet Rao, Prashanth
Taboada, Maite
author_sort Rao, Prashanth
collection PubMed
description We present a topic modelling and data visualization methodology to examine gender-based disparities in news articles by topic. Existing research in topic modelling is largely focused on the text mining of closed corpora, i.e., those that include a fixed collection of composite texts. We showcase a methodology to discover topics via Latent Dirichlet Allocation, which can reliably produce human-interpretable topics over an open news corpus that continually grows with time. Our system generates topics, or distributions of keywords, for news articles on a monthly basis, to consistently detect key events and trends aligned with events in the real world. Findings from 2 years' worth of news articles in mainstream English-language Canadian media indicate that certain topics feature either women or men more prominently and exhibit different types of language. Perhaps unsurprisingly, topics such as lifestyle, entertainment, and healthcare tend to be prominent in articles that quote more women than men. Topics such as sports, politics, and business are characteristic of articles that quote more men than women. The data shows a self-reinforcing gendered division of duties and representation in society. Quoting female sources more frequently in a caregiving role and quoting male sources more frequently in political and business roles enshrines women’s status as caregivers and men’s status as leaders and breadwinners. Our results can help journalists and policy makers better understand the unequal gender representation of those quoted in the news and facilitate news organizations’ efforts to achieve gender parity in their sources. The proposed methodology is robust, reproducible, and scalable to very large corpora, and can be used for similar studies involving unsupervised topic modelling and language analyses.
format Online
Article
Text
id pubmed-8242240
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-82422402021-07-01 Gender Bias in the News: A Scalable Topic Modelling and Visualization Framework Rao, Prashanth Taboada, Maite Front Artif Intell Artificial Intelligence We present a topic modelling and data visualization methodology to examine gender-based disparities in news articles by topic. Existing research in topic modelling is largely focused on the text mining of closed corpora, i.e., those that include a fixed collection of composite texts. We showcase a methodology to discover topics via Latent Dirichlet Allocation, which can reliably produce human-interpretable topics over an open news corpus that continually grows with time. Our system generates topics, or distributions of keywords, for news articles on a monthly basis, to consistently detect key events and trends aligned with events in the real world. Findings from 2 years worth of news articles in mainstream English-language Canadian media indicate that certain topics feature either women or men more prominently and exhibit different types of language. Perhaps unsurprisingly, topics such as lifestyle, entertainment, and healthcare tend to be prominent in articles that quote more women than men. Topics such as sports, politics, and business are characteristic of articles that quote more men than women. The data shows a self-reinforcing gendered division of duties and representation in society. Quoting female sources more frequently in a caregiving role and quoting male sources more frequently in political and business roles enshrines women’s status as caregivers and men’s status as leaders and breadwinners. Our results can help journalists and policy makers better understand the unequal gender representation of those quoted in the news and facilitate news organizations’ efforts to achieve gender parity in their sources. 
The proposed methodology is robust, reproducible, and scalable to very large corpora, and can be used for similar studies involving unsupervised topic modelling and language analyses. Frontiers Media S.A. 2021-06-16 /pmc/articles/PMC8242240/ /pubmed/34222857 http://dx.doi.org/10.3389/frai.2021.664737 Text en Copyright © 2021 Rao and Taboada. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Artificial Intelligence
Rao, Prashanth
Taboada, Maite
Gender Bias in the News: A Scalable Topic Modelling and Visualization Framework
title Gender Bias in the News: A Scalable Topic Modelling and Visualization Framework
title_full Gender Bias in the News: A Scalable Topic Modelling and Visualization Framework
title_fullStr Gender Bias in the News: A Scalable Topic Modelling and Visualization Framework
title_full_unstemmed Gender Bias in the News: A Scalable Topic Modelling and Visualization Framework
title_short Gender Bias in the News: A Scalable Topic Modelling and Visualization Framework
title_sort gender bias in the news: a scalable topic modelling and visualization framework
topic Artificial Intelligence
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8242240/
https://www.ncbi.nlm.nih.gov/pubmed/34222857
http://dx.doi.org/10.3389/frai.2021.664737
work_keys_str_mv AT raoprashanth genderbiasinthenewsascalabletopicmodellingandvisualizationframework
AT taboadamaite genderbiasinthenewsascalabletopicmodellingandvisualizationframework