Fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach.

Many people have begun to use social media platforms due to the increased use of the Internet over the previous decade. It has a lot of benefits, but it also comes with a lot of risks and drawbacks, such as Hate speech. People in multilingual societies, such as India, frequently mix their native lan...

Bibliographic Details
Main authors: Biradar, Shankar; Saumya, Sunil; Chauhan, Arun
Format: Online Article Text
Language: English
Published: Springer Vienna 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9308896/
https://www.ncbi.nlm.nih.gov/pubmed/35911486
http://dx.doi.org/10.1007/s13278-022-00920-w
author Biradar, Shankar
Saumya, Sunil
Chauhan, Arun
collection PubMed
description Over the past decade, growing Internet access has brought many people onto social media platforms. This has many benefits, but it also carries risks and drawbacks, such as hate speech. People in multilingual societies such as India frequently mix their native language with English, so detecting hateful content in such bilingual code-mixed data has drawn growing interest from the research community. Most previous work focuses on high-resource languages such as English; very few researchers have concentrated on mixed bilingual data such as Hinglish. In this study, we investigated the performance of transformer models such as IndicBERT and multilingual Bidirectional Encoder Representations from Transformers (mBERT), as well as transfer learning from pre-trained language models such as ULMFiT and Bidirectional Encoder Representations from Transformers (BERT), for finding hateful content in Hinglish. We also propose a Transformer-based Interpreter and Feature extraction model on Deep Neural Network (TIF-DNN). The experimental results show that the proposed model outperforms existing state-of-the-art methods for hate speech identification in Hinglish, with an accuracy of 73%.
format Online
Article
Text
id pubmed-9308896
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer Vienna
record_format MEDLINE/PubMed
spelling pubmed-9308896 2022-07-25 Fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach. Biradar, Shankar; Saumya, Sunil; Chauhan, Arun. Soc Netw Anal Min. Original Article. Springer Vienna 2022-07-24 2022 /pmc/articles/PMC9308896/ /pubmed/35911486 http://dx.doi.org/10.1007/s13278-022-00920-w Text en © The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature 2022. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach.
topic Original Article