Syntax and prejudice: ethically-charged biases of a syntax-based hate speech recognizer unveiled

Hate speech recognizers (HSRs) can be the panacea for containing hate in social media, or they can become the greatest form of prejudice-based censorship, hindering people from expressing their true selves. In this paper, we hypothesized that massive use of syntax can reduce the prejudice effect in HSRs. To explore this hypothesis, we propose the Unintended-bias Visualizer based on Kermit modeling (KERM-HATE): a syntax-based HSR endowed with syntax heat parse trees used as post-hoc explanations of classifications. KERM-HATE significantly outperforms BERT-based, RoBERTa-based and XLNet-based HSRs on standard datasets. Surprisingly, this result is not sufficient: post-hoc analysis on novel datasets covering recent divisive topics shows that even KERM-HATE carries the prejudice distilled from the initial corpus. Therefore, although tests on standard datasets may show higher performance, syntax alone cannot drive the “attention” of HSRs toward ethically-unbiased features.
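To make the abstract's core idea concrete, the following is a minimal, hypothetical sketch (not the authors' code; the actual KERMIT encoder differs) of how a parse tree can be turned into a fixed-size vector that a syntax-based HSR could combine with a text embedding, and whose per-node contributions could later be visualized as a "syntax heat parse tree". All names and the composition scheme here are illustrative assumptions.

```python
# Minimal sketch of encoding a constituency parse tree into a vector,
# in the spirit of syntax-based hate speech recognizers like KERM-HATE.
# This is NOT the authors' implementation; it is a toy composition scheme.
import hashlib
import math

def label_vector(label, dim=64):
    """Deterministic pseudo-random unit vector for a node label."""
    seed = hashlib.sha256(label.encode()).digest()
    vals = [((seed[i % len(seed)] + 7 * i) % 255) / 127.0 - 1.0 for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in vals)) or 1.0
    return [v / norm for v in vals]

def encode_tree(tree, dim=64):
    """Recursively encode a parse tree given as (label, [children])."""
    label, children = tree
    vec = label_vector(label, dim)
    for child in children:
        child_vec = encode_tree(child, dim)
        # toy composition: element-wise average of node and child encodings
        vec = [(a + b) / 2.0 for a, b in zip(vec, child_vec)]
    return vec

# Toy parse of "you people are awful": (S (NP you people) (VP are (ADJP awful)))
tree = ("S", [("NP", [("you", []), ("people", [])]),
              ("VP", [("are", []), ("ADJP", [("awful", [])])])])
syntax_vec = encode_tree(tree)
print(len(syntax_vec))  # prints 64: a fixed-size syntax encoding
```

In a full system, a vector like `syntax_vec` would be concatenated with a transformer sentence embedding before classification, so that syntactic structure, not just lexical content, influences the decision.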

Bibliographic Details
Main Authors: Mastromattei, Michele, Ranaldi, Leonardo, Fallucchi, Francesca, Zanzotto, Fabio Massimo
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects: Artificial Intelligence
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9044272/
https://www.ncbi.nlm.nih.gov/pubmed/35494863
http://dx.doi.org/10.7717/peerj-cs.859
Record ID: pubmed-9044272
Collection: PubMed (National Center for Biotechnology Information)
Record Format: MEDLINE/PubMed
Journal Section: PeerJ Comput Sci, Artificial Intelligence
Published Online: 2022-02-03
License: ©2022 Mastromattei et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose, provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.