COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic
Main authors: | , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8742666/ https://www.ncbi.nlm.nih.gov/pubmed/35035263 http://dx.doi.org/10.1007/s11042-021-11601-9 |
Summary: | The World Health Organization (WHO) has declared the novel coronavirus a global pandemic owing to the exponential spread of COVID-19 in the past months, which has reached over 100 million cases and resulted in approximately 3 million deaths worldwide. Amid this pandemic, identifying cyberbullying in posts and comments on social media platforms has become an increasingly active area of research. In multilingual societies like India, code-switched texts make up the majority of Internet content. Identifying online bullying in code-switched text is more challenging than in monolingual cases. As a first step towards enabling the development of approaches for cyberbullying detection, we developed a new code-switched dataset of Twitter utterances annotated with binary labels. To demonstrate the utility of the proposed dataset, we build different machine learning (Support Vector Machine and Logistic Regression) and deep learning (Multilayer Perceptron, Convolutional Neural Network, BiLSTM, BERT) models to detect cyberbullying in English-Hindi (En-Hi) code-switched text. Our proposed model integrates different hand-crafted features and is enriched by sequential and semantic patterns generated by different state-of-the-art deep neural network models. Initial experimental results of the proposed deep ensemble model on our code-switched data show that our approach yields state-of-the-art results, i.e., a macro-averaged F1 score of 0.93. The dataset and code of the present study will be made publicly available in the paper's companion repository [https://github.com/95sayanta/COVID-19-and-Cyberbullying]. |
---|
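
The summary reports its headline result as a macro-averaged F1 score. As a minimal sketch of how that metric is computed for binary labels, the snippet below takes the unweighted mean of the per-class F1 scores. The function name `macro_f1` and the toy labels are assumptions made for illustration only; they are not taken from the paper's dataset or its companion repository.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels (1 = bullying, 0 = non-bullying); not real data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
score = macro_f1(y_true, y_pred)
print(f"macro-averaged F1: {score:.2f}")
```

Because each class contributes equally regardless of its size, macro averaging does not let a majority class dominate the score, which matters when bullying and non-bullying posts are unevenly represented.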