“Broadcast your gender.” A comparison of four text-based classification methods of German YouTube channels

Social media platforms provide a large array of behavioral data relevant to social scientific research. However, key information such as the sociodemographic characteristics of agents is often missing. This paper aims to compare four methods of classifying social attributes from text. Specifically, we...

Bibliographic Details
Main authors: Seewann, Lena, Verwiebe, Roland, Buder, Claudia, Fritsch, Nina-Sophie
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9515904/
https://www.ncbi.nlm.nih.gov/pubmed/36188727
http://dx.doi.org/10.3389/fdata.2022.908636
author Seewann, Lena
Verwiebe, Roland
Buder, Claudia
Fritsch, Nina-Sophie
collection PubMed
description Social media platforms provide a large array of behavioral data relevant to social scientific research. However, key information such as the sociodemographic characteristics of agents is often missing. This paper aims to compare four methods of classifying social attributes from text. Specifically, we are interested in estimating the gender of German social media creators. Using a random sample of 200 YouTube channels as an example, we compare four classification methods, namely (1) a survey among university staff, (2) a name dictionary method with the World Gender Name Dictionary as a reference list, (3) an algorithmic approach using the website gender-api.com, and (4) a Multinomial Naïve Bayes (MNB) machine learning technique. These methods identify gender attributes based on YouTube channel names and descriptions in German but are adaptable to other languages. Our contribution evaluates the share of identifiable channels, the accuracy and meaningfulness of classification, and the limits and benefits of each approach. We aim to address the methodological challenges of classifying gender attributes for YouTube channels, as well as concerns about reinforcing stereotypes and the ethical implications.
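To illustrate what a name dictionary method of the kind listed as (2) might look like, the following is a minimal sketch in Python. It is not the authors' implementation: the reference list `NAME_GENDER` is a tiny hypothetical stand-in for a full name dictionary, and the function names `classify_channel` and `identifiable_share`, the tokenization rule, and the choice to return "unknown" when names of more than one gender match are all assumptions made for this example.

```python
import re

# Hypothetical toy reference list (assumption); a real run would load a
# full name dictionary with many thousands of entries instead.
NAME_GENDER = {
    "anna": "female",
    "lena": "female",
    "julia": "female",
    "lukas": "male",
    "felix": "male",
    "jonas": "male",
}


def classify_channel(text):
    """Return 'female', 'male', or 'unknown' for a channel name or description."""
    # Split into runs of letters; digits, underscores and punctuation act as
    # separators. The pattern is Unicode-aware, so umlauts stay inside tokens.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    hits = {NAME_GENDER[t] for t in tokens if t in NAME_GENDER}
    # Classify only when the matched names agree on a single gender.
    return hits.pop() if len(hits) == 1 else "unknown"


def identifiable_share(texts):
    """Share of texts that receive a label other than 'unknown'."""
    if not texts:
        return 0.0
    labels = [classify_channel(t) for t in texts]
    return sum(label != "unknown" for label in labels) / len(labels)
```

A lookup like this only fires when a first name appears as a separate token, so a channel called "FelixGaming" would stay unclassified. That is one reason the share of identifiable channels is worth reporting alongside accuracy.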
format Online
Article
Text
id pubmed-9515904
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9515904 2022-09-29 Front Big Data Big Data Frontiers Media S.A. 2022-09-14 /pmc/articles/PMC9515904/ /pubmed/36188727 http://dx.doi.org/10.3389/fdata.2022.908636 Text en Copyright © 2022 Seewann, Verwiebe, Buder and Fritsch. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title “Broadcast your gender.” A comparison of four text-based classification methods of German YouTube channels
topic Big Data