Syntax and prejudice: ethically-charged biases of a syntax-based hate speech recognizer unveiled
Hate speech recognizers (HSRs) can be the panacea for containing hate in social media, or they can become the biggest form of prejudice-based censorship, hindering people from expressing their true selves. In this paper, we hypothesized that massive use of syntax can reduce the prejudice effect in HSRs. To explore this hypothesis, we propose the Unintended-bias Visualizer based on Kermit modeling (KERM-HATE): a syntax-based HSR endowed with syntax heat parse trees used as a post-hoc explanation of classifications. KERM-HATE significantly outperforms BERT-based, RoBERTa-based and XLNet-based HSRs on standard datasets. Surprisingly, this result is not sufficient: post-hoc analysis on novel datasets covering recent divisive topics shows that even KERM-HATE carries the prejudice distilled from the initial corpus. Therefore, although tests on standard datasets may show higher performance, syntax alone cannot drive the “attention” of HSRs to ethically-unbiased features.
Main Authors: Mastromattei, Michele; Ranaldi, Leonardo; Fallucchi, Francesca; Zanzotto, Fabio Massimo
Format: Online Article Text
Language: English
Published: PeerJ Inc., 2022
Subjects: Artificial Intelligence
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9044272/ https://www.ncbi.nlm.nih.gov/pubmed/35494863 http://dx.doi.org/10.7717/peerj-cs.859
_version_ | 1784695070113398784 |
author | Mastromattei, Michele Ranaldi, Leonardo Fallucchi, Francesca Zanzotto, Fabio Massimo |
author_facet | Mastromattei, Michele Ranaldi, Leonardo Fallucchi, Francesca Zanzotto, Fabio Massimo |
author_sort | Mastromattei, Michele |
collection | PubMed |
description | Hate speech recognizers (HSRs) can be the panacea for containing hate in social media, or they can become the biggest form of prejudice-based censorship, hindering people from expressing their true selves. In this paper, we hypothesized that massive use of syntax can reduce the prejudice effect in HSRs. To explore this hypothesis, we propose the Unintended-bias Visualizer based on Kermit modeling (KERM-HATE): a syntax-based HSR endowed with syntax heat parse trees used as a post-hoc explanation of classifications. KERM-HATE significantly outperforms BERT-based, RoBERTa-based and XLNet-based HSRs on standard datasets. Surprisingly, this result is not sufficient: post-hoc analysis on novel datasets covering recent divisive topics shows that even KERM-HATE carries the prejudice distilled from the initial corpus. Therefore, although tests on standard datasets may show higher performance, syntax alone cannot drive the “attention” of HSRs to ethically-unbiased features. |
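The abstract describes “syntax heat parse trees” as post-hoc explanations: each node of a sentence's parse tree is assigned a relevance score for the classifier's decision. The sketch below illustrates one generic way such per-node heat could be computed — occlusion, i.e., measuring how much the classifier's score drops when a constituent's span is removed. All names here (`Node`, `toy_score`, `occlusion_heat`) and the occlusion approach itself are illustrative assumptions, not the actual KERMIT/KERM-HATE implementation.

```python
# Illustrative sketch (not the paper's method): per-node "heat" on a
# constituency parse tree via occlusion of each node's leaf span.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str                     # constituent label ("NP") or, at leaves, the word itself
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Return the words spanned by this node."""
        if not self.children:
            return [self.label]
        out: List[str] = []
        for child in self.children:
            out.extend(child.leaves())
        return out

def toy_score(tokens: List[str]) -> float:
    """Stand-in hate score: fraction of tokens found in a toy lexicon.
    A real HSR would be a trained neural classifier."""
    lexicon = {"stupid", "hate"}
    return sum(t in lexicon for t in tokens) / max(len(tokens), 1)

def occlusion_heat(root: Node, all_tokens: List[str]):
    """Heat of a node = score drop when the node's span is occluded.
    Positive heat means the constituent pushed the score toward 'hate'."""
    base = toy_score(all_tokens)
    heat = {}
    def visit(node: Node) -> None:
        span = set(node.leaves())
        reduced = [t for t in all_tokens if t not in span]
        heat[id(node)] = base - toy_score(reduced)
        for child in node.children:
            visit(child)
    visit(root)
    return base, heat
```

For the toy sentence “you are stupid”, the node covering “stupid” receives the highest heat, mirroring how a heat parse tree highlights the constituents driving a hate classification — and how inspecting that heat on divisive-topic data can expose bias inherited from the training corpus.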
format | Online Article Text |
id | pubmed-9044272 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-90442722022-04-28 Syntax and prejudice: ethically-charged biases of a syntax-based hate speech recognizer unveiled Mastromattei, Michele Ranaldi, Leonardo Fallucchi, Francesca Zanzotto, Fabio Massimo PeerJ Comput Sci Artificial Intelligence Hate speech recognizers (HSRs) can be the panacea for containing hate in social media, or they can become the biggest form of prejudice-based censorship, hindering people from expressing their true selves. In this paper, we hypothesized that massive use of syntax can reduce the prejudice effect in HSRs. To explore this hypothesis, we propose the Unintended-bias Visualizer based on Kermit modeling (KERM-HATE): a syntax-based HSR endowed with syntax heat parse trees used as a post-hoc explanation of classifications. KERM-HATE significantly outperforms BERT-based, RoBERTa-based and XLNet-based HSRs on standard datasets. Surprisingly, this result is not sufficient: post-hoc analysis on novel datasets covering recent divisive topics shows that even KERM-HATE carries the prejudice distilled from the initial corpus. Therefore, although tests on standard datasets may show higher performance, syntax alone cannot drive the “attention” of HSRs to ethically-unbiased features. PeerJ Inc. 2022-02-03 /pmc/articles/PMC9044272/ /pubmed/35494863 http://dx.doi.org/10.7717/peerj-cs.859 Text en ©2022 Mastromattei et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose, provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Artificial Intelligence Mastromattei, Michele Ranaldi, Leonardo Fallucchi, Francesca Zanzotto, Fabio Massimo Syntax and prejudice: ethically-charged biases of a syntax-based hate speech recognizer unveiled |
title | Syntax and prejudice: ethically-charged biases of a syntax-based hate speech recognizer unveiled |
title_full | Syntax and prejudice: ethically-charged biases of a syntax-based hate speech recognizer unveiled |
title_fullStr | Syntax and prejudice: ethically-charged biases of a syntax-based hate speech recognizer unveiled |
title_full_unstemmed | Syntax and prejudice: ethically-charged biases of a syntax-based hate speech recognizer unveiled |
title_short | Syntax and prejudice: ethically-charged biases of a syntax-based hate speech recognizer unveiled |
title_sort | syntax and prejudice: ethically-charged biases of a syntax-based hate speech recognizer unveiled |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9044272/ https://www.ncbi.nlm.nih.gov/pubmed/35494863 http://dx.doi.org/10.7717/peerj-cs.859 |
work_keys_str_mv | AT mastromatteimichele syntaxandprejudiceethicallychargedbiasesofasyntaxbasedhatespeechrecognizerunveiled AT ranaldileonardo syntaxandprejudiceethicallychargedbiasesofasyntaxbasedhatespeechrecognizerunveiled AT fallucchifrancesca syntaxandprejudiceethicallychargedbiasesofasyntaxbasedhatespeechrecognizerunveiled AT zanzottofabiomassimo syntaxandprejudiceethicallychargedbiasesofasyntaxbasedhatespeechrecognizerunveiled |