Combating hate speech using an adaptive ensemble learning model with a case study on COVID-19

Bibliographic Details
Main authors: Agarwal, Shivang; Chowdary, C. Ravindranath
Format: Online Article Text
Language: English
Published: Elsevier Ltd. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9759712/
https://www.ncbi.nlm.nih.gov/pubmed/36567759
http://dx.doi.org/10.1016/j.eswa.2021.115632
Description
Summary: Social media platforms generate an enormous amount of data every day, and millions of users engage with the posts circulated on them. Despite the regulations and protocols these platforms impose, it is difficult to restrict objectionable posts carrying hateful content. Automatic hate speech detection on social media is an essential task that has not been solved efficiently despite multiple attempts by various researchers. It is challenging because it involves identifying hateful content in social media posts, which may express hate overtly or in ways that are subjective to a user or a community. Relying on manual inspection delays the process, and the hateful content may remain available online for a long time. Current state-of-the-art methods for tackling hate speech perform well when tested on the same dataset they were trained on but fail badly in cross-dataset evaluation. Therefore, we propose an ensemble learning-based adaptive model for automatic hate speech detection that improves cross-dataset generalization. The proposed expert model works towards overcoming the strong user bias present in the available annotated datasets. We conduct experiments under various setups and demonstrate the proposed model's efficacy on recent issues such as COVID-19 and the US presidential elections. In particular, the loss in performance observed under cross-dataset evaluation is the smallest among all the models compared. Also, restricting the maximum number of tweets per user incurs no drop in performance.