COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic

The World Health Organization (WHO) has declared the novel coronavirus a global pandemic due to the exponential spread of COVID-19 in the past months, reaching over 100 million cases and resulting in approximately 3 million deaths worldwide. Amid this pandemic, identification of cyberbullyi...

Bibliographic Details
Main authors: Paul, Sayanta, Saha, Sriparna, Singh, Jyoti Prakash
Format: Online Article Text
Language: English
Published: Springer US 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8742666/
https://www.ncbi.nlm.nih.gov/pubmed/35035263
http://dx.doi.org/10.1007/s11042-021-11601-9
_version_ 1784629765307629568
author Paul, Sayanta
Saha, Sriparna
Singh, Jyoti Prakash
author_facet Paul, Sayanta
Saha, Sriparna
Singh, Jyoti Prakash
author_sort Paul, Sayanta
collection PubMed
description The World Health Organization (WHO) has declared the novel coronavirus a global pandemic due to the exponential spread of COVID-19 in the past months, reaching over 100 million cases and resulting in approximately 3 million deaths worldwide. Amid this pandemic, identifying cyberbullying in posts and comments on social media platforms has become an evolving area of research. In multilingual societies like India, code-switched texts comprise the majority of the Internet. Identifying online bullying of code-switched users is somewhat more challenging than in monolingual cases. As a first step towards enabling the development of approaches for cyberbullying detection, we developed a new code-switched dataset, collected from Twitter utterances annotated with binary labels. To demonstrate the utility of the proposed dataset, we build different machine learning (Support Vector Machine and Logistic Regression) and deep learning (Multilayer Perceptron, Convolutional Neural Network, BiLSTM, BERT) models to detect cyberbullying in English-Hindi (En-Hi) code-switched text. Our proposed model integrates different hand-crafted features and is enriched by sequential and semantic patterns generated by different state-of-the-art deep neural network models. Initial experimental results of the proposed deep ensemble model on our code-switched data reveal that our approach yields state-of-the-art results, i.e., 0.93 in terms of macro-averaged F1 score. The dataset and code of the present study will be made publicly available in the paper’s companion repository [https://github.com/95sayanta/COVID-19-and-Cyberbullying].
format Online
Article
Text
id pubmed-8742666
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-87426662022-01-10 COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic Paul, Sayanta Saha, Sriparna Singh, Jyoti Prakash Multimed Tools Appl 1209: Recent Advances on Social Media Analytics and Multimedia Systems: Issues and Challenges The World Health Organization (WHO) has declared the novel coronavirus a global pandemic due to the exponential spread of COVID-19 in the past months, reaching over 100 million cases and resulting in approximately 3 million deaths worldwide. Amid this pandemic, identifying cyberbullying in posts and comments on social media platforms has become an evolving area of research. In multilingual societies like India, code-switched texts comprise the majority of the Internet. Identifying online bullying of code-switched users is somewhat more challenging than in monolingual cases. As a first step towards enabling the development of approaches for cyberbullying detection, we developed a new code-switched dataset, collected from Twitter utterances annotated with binary labels. To demonstrate the utility of the proposed dataset, we build different machine learning (Support Vector Machine and Logistic Regression) and deep learning (Multilayer Perceptron, Convolutional Neural Network, BiLSTM, BERT) models to detect cyberbullying in English-Hindi (En-Hi) code-switched text. Our proposed model integrates different hand-crafted features and is enriched by sequential and semantic patterns generated by different state-of-the-art deep neural network models. Initial experimental results of the proposed deep ensemble model on our code-switched data reveal that our approach yields state-of-the-art results, i.e., 0.93 in terms of macro-averaged F1 score. The dataset and code of the present study will be made publicly available in the paper’s companion repository [https://github.com/95sayanta/COVID-19-and-Cyberbullying].
Springer US 2022-01-08 2023 /pmc/articles/PMC8742666/ /pubmed/35035263 http://dx.doi.org/10.1007/s11042-021-11601-9 Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle 1209: Recent Advances on Social Media Analytics and Multimedia Systems: Issues and Challenges
Paul, Sayanta
Saha, Sriparna
Singh, Jyoti Prakash
COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic
title COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic
title_full COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic
title_fullStr COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic
title_full_unstemmed COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic
title_short COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic
title_sort covid-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic
topic 1209: Recent Advances on Social Media Analytics and Multimedia Systems: Issues and Challenges
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8742666/
https://www.ncbi.nlm.nih.gov/pubmed/35035263
http://dx.doi.org/10.1007/s11042-021-11601-9
work_keys_str_mv AT paulsayanta covid19andcyberbullyingdeepensemblemodeltoidentifycyberbullyingfromcodeswitchedlanguagesduringthepandemic
AT sahasriparna covid19andcyberbullyingdeepensemblemodeltoidentifycyberbullyingfromcodeswitchedlanguagesduringthepandemic
AT singhjyotiprakash covid19andcyberbullyingdeepensemblemodeltoidentifycyberbullyingfromcodeswitchedlanguagesduringthepandemic