COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic
The World Health Organization (WHO) has declared the novel coronavirus a global pandemic, following an exponential spread of COVID-19 in recent months that has reached over 100 million cases and resulted in approximately 3 million deaths worldwide. Amid this pandemic, identification of cyberbullyi...
Main authors: | Paul, Sayanta; Saha, Sriparna; Singh, Jyoti Prakash |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US, 2022 |
Subjects: | 1209: Recent Advances on Social Media Analytics and Multimedia Systems: Issues and Challenges |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8742666/ https://www.ncbi.nlm.nih.gov/pubmed/35035263 http://dx.doi.org/10.1007/s11042-021-11601-9 |
_version_ | 1784629765307629568 |
---|---|
author | Paul, Sayanta; Saha, Sriparna; Singh, Jyoti Prakash |
author_facet | Paul, Sayanta; Saha, Sriparna; Singh, Jyoti Prakash |
author_sort | Paul, Sayanta |
collection | PubMed |
description | The World Health Organization (WHO) has declared the novel coronavirus a global pandemic, following an exponential spread of COVID-19 in recent months that has reached over 100 million cases and resulted in approximately 3 million deaths worldwide. Amid this pandemic, identifying cyberbullying in posts and comments on social media platforms has become an increasingly active area of research. In multilingual societies like India, code-switched text makes up the majority of Internet content. Identifying online bullying in code-switched text is more challenging than in monolingual cases. As a first step towards enabling the development of approaches for cyberbullying detection, we developed a new code-switched dataset of Twitter utterances annotated with binary labels. To demonstrate the utility of the proposed dataset, we build several machine learning (Support Vector Machine and Logistic Regression) and deep learning (Multilayer Perceptron, Convolutional Neural Network, BiLSTM, BERT) models to detect cyberbullying in English-Hindi (En-Hi) code-switched text. Our proposed model integrates different hand-crafted features and is enriched by sequential and semantic patterns generated by different state-of-the-art deep neural network models. Initial experimental results of the proposed deep ensemble model on our code-switched data reveal that our approach yields state-of-the-art results, i.e., a macro-averaged F1 score of 0.93. The dataset and code of the present study will be made publicly available on the paper’s companion repository [https://github.com/95sayanta/COVID-19-and-Cyberbullying]. |
format | Online Article Text |
id | pubmed-8742666 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-8742666 2022-01-10 COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic. Paul, Sayanta; Saha, Sriparna; Singh, Jyoti Prakash. Multimed Tools Appl. 1209: Recent Advances on Social Media Analytics and Multimedia Systems: Issues and Challenges. Springer US 2022-01-08 2023 /pmc/articles/PMC8742666/ /pubmed/35035263 http://dx.doi.org/10.1007/s11042-021-11601-9 Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | 1209: Recent Advances on Social Media Analytics and Multimedia Systems: Issues and Challenges Paul, Sayanta Saha, Sriparna Singh, Jyoti Prakash COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic |
title | COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic |
title_full | COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic |
title_fullStr | COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic |
title_full_unstemmed | COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic |
title_short | COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic |
title_sort | covid-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic |
topic | 1209: Recent Advances on Social Media Analytics and Multimedia Systems: Issues and Challenges |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8742666/ https://www.ncbi.nlm.nih.gov/pubmed/35035263 http://dx.doi.org/10.1007/s11042-021-11601-9 |
work_keys_str_mv | AT paulsayanta covid19andcyberbullyingdeepensemblemodeltoidentifycyberbullyingfromcodeswitchedlanguagesduringthepandemic AT sahasriparna covid19andcyberbullyingdeepensemblemodeltoidentifycyberbullyingfromcodeswitchedlanguagesduringthepandemic AT singhjyotiprakash covid19andcyberbullyingdeepensemblemodeltoidentifycyberbullyingfromcodeswitchedlanguagesduringthepandemic |
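
The description in this record reports results as a macro-averaged F1 score over binary labels. As a minimal sketch of how that metric is computed (not the authors' evaluation code), the example below takes the per-class F1 for each label and averages them with equal weight. The function names and the toy labels are assumptions made for illustration; they do not come from the paper or its dataset.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], cls: int) -> float:
    """F1 score when `cls` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)  # true positives
    fp = sum(1 for t, p in pairs if t != cls and p == cls)  # false positives
    fn = sum(1 for t, p in pairs if t == cls and p != cls)  # false negatives
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 over every label that appears."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical toy labels: 1 = bullying, 0 = non-bullying.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

score = macro_f1(y_true, y_pred)
print(round(score, 2))
```

Because each class contributes equally to the average, a model cannot reach a high macro-averaged F1 by doing well only on the more frequent label, which is why the metric is often preferred when the two classes are imbalanced.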