
Addressing religious hate online: from taxonomy creation to automated detection

Abusive language in online social media is a pervasive and harmful phenomenon which calls for automatic computational approaches to be successfully contained. Previous studies have introduced corpora and natural language processing approaches for specific kinds of online abuse, mainly focusing on mi...

Bibliographic Details
Main authors:	Ramponi, Alan; Testa, Benedetta; Tonelli, Sara; Jezek, Elisabetta
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2022
Subjects:	Artificial Intelligence
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280248/ https://www.ncbi.nlm.nih.gov/pubmed/37346317 http://dx.doi.org/10.7717/peerj-cs.1128

author	Ramponi, Alan; Testa, Benedetta; Tonelli, Sara; Jezek, Elisabetta
collection	PubMed
description	Abusive language in online social media is a pervasive and harmful phenomenon which calls for automatic computational approaches to be successfully contained. Previous studies have introduced corpora and natural language processing approaches for specific kinds of online abuse, mainly focusing on misogyny and racism. A current underexplored area in this context is religious hate, for which efforts in data and methods to date have been rather scattered. This is exacerbated by different annotation schemes that available datasets use, which inevitably lead to poor repurposing of data in wider contexts. Furthermore, religious hate is very much dependent on country-specific factors, including the presence and visibility of religious minorities, societal issues, historical background, and current political decisions. Motivated by the lack of annotated data specifically tailoring religion and the poor interoperability of current datasets, in this article we propose a fine-grained labeling scheme for religious hate speech detection. Such scheme lies on a wider and highly-interoperable taxonomy of abusive language, and covers the three main monotheistic religions: Judaism, Christianity and Islam. Moreover, we introduce a Twitter dataset in two languages—English and Italian—that has been annotated following the proposed annotation scheme. We experiment with several classification algorithms on the annotated dataset, from traditional machine learning classifiers to recent transformer-based language models, assessing the difficulty of two tasks: abusive language detection and religious hate speech detection. Finally, we investigate the cross-lingual transferability of multilingual models on the tasks, shedding light on the viability of repurposing our dataset for religious hate speech detection on low-resource languages. 
We release the annotated data and publicly distribute the code for our classification experiments at https://github.com/dhfbk/religious-hate-speech.
format	Online Article Text
id	pubmed-10280248
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-10280248 2023-06-21 PeerJ Comput Sci PeerJ Inc. 2022-12-15 /pmc/articles/PMC10280248/ /pubmed/37346317 http://dx.doi.org/10.7717/peerj-cs.1128 Text en ©2022 Ramponi et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title	Addressing religious hate online: from taxonomy creation to automated detection
topic	Artificial Intelligence
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280248/ https://www.ncbi.nlm.nih.gov/pubmed/37346317 http://dx.doi.org/10.7717/peerj-cs.1128
