Fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach.
Many people have begun to use social media platforms as Internet use has grown over the past decade. Social media has many benefits, but it also carries risks and drawbacks, such as hate speech. People in multilingual societies, such as India, frequently mix their native lan...
Main authors: | Biradar, Shankar; Saumya, Sunil; Chauhan, Arun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Vienna, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9308896/ https://www.ncbi.nlm.nih.gov/pubmed/35911486 http://dx.doi.org/10.1007/s13278-022-00920-w |
_version_ | 1784753041598054400 |
---|---|
author | Biradar, Shankar; Saumya, Sunil; Chauhan, Arun |
author_facet | Biradar, Shankar; Saumya, Sunil; Chauhan, Arun |
author_sort | Biradar, Shankar |
collection | PubMed |
description | Many people have begun to use social media platforms as Internet use has grown over the past decade. Social media has many benefits, but it also carries risks and drawbacks, such as hate speech. People in multilingual societies, such as India, frequently mix their native language with English while speaking, so detecting hate content in such bilingual code-mixed data has drawn growing interest from the research community. Most previous work focuses on high-resource languages such as English, and very few researchers have concentrated on mixed bilingual data such as Hinglish. In this study, we investigated the performance of transformer models such as IndicBERT and multilingual Bidirectional Encoder Representations from Transformers (mBERT), as well as transfer learning from pre-trained language models such as ULMFiT and Bidirectional Encoder Representations from Transformers (BERT), to find hateful content in Hinglish. We also propose a Transformer-based Interpreter and Feature extraction model on Deep Neural Network (TIF-DNN). The experimental results show that the proposed model outperforms existing state-of-the-art methods for hate speech identification in Hinglish, with an accuracy of 73%. |
format | Online Article Text |
id | pubmed-9308896 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer Vienna |
record_format | MEDLINE/PubMed |
spelling | pubmed-93088962022-07-25 Fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach. Biradar, Shankar Saumya, Sunil chauhan, Arun Soc Netw Anal Min Original Article Many people have begun to use social media platforms due to the increased use of the Internet over the previous decade. It has a lot of benefits, but it also comes with a lot of risks and drawbacks, such as Hate speech. People in multilingual societies, such as India, frequently mix their native language with English while speaking, so detecting hate content in such bilingual code-mixed data has drawn the larger interest of the research community. The majority of previous work focuses on high-resource language such as English, but very few researchers have concentrated on the mixed bilingual data like Hinglish. In this study, we investigated the performance of transformer models like IndicBERT and multilingual Bidirectional Encoder Representation(mBERT), as well as transfer learning from pre-trained language models like ULMFiT and Bidirectional encoder Representation(BERT), to find hateful content in Hinglish. Also, Transformer-based Interpreter and Feature extraction model on Deep Neural Network (TIF-DNN), is proposed in this work. The experimental results found that our proposed model outperforms existing state-of-art methods for Hate speech identification in Hinglish language with an accuracy of 73%. Springer Vienna 2022-07-24 2022 /pmc/articles/PMC9308896/ /pubmed/35911486 http://dx.doi.org/10.1007/s13278-022-00920-w Text en © The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. 
These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Original Article Biradar, Shankar Saumya, Sunil chauhan, Arun Fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach. |
title | Fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach. |
title_full | Fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach. |
title_fullStr | Fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach. |
title_full_unstemmed | Fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach. |
title_short | Fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach. |
title_sort | fighting hate speech from bilingual hinglish speaker’s perspective, a transformer- and translation-based approach. |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9308896/ https://www.ncbi.nlm.nih.gov/pubmed/35911486 http://dx.doi.org/10.1007/s13278-022-00920-w |
work_keys_str_mv | AT biradarshankar fightinghatespeechfrombilingualhinglishspeakersperspectiveatransformerandtranslationbasedapproach AT saumyasunil fightinghatespeechfrombilingualhinglishspeakersperspectiveatransformerandtranslationbasedapproach AT chauhanarun fightinghatespeechfrombilingualhinglishspeakersperspectiveatransformerandtranslationbasedapproach |