Not All Swear Words Are Used Equal: Attention over Word n-grams for Abusive Language Identification

Bibliographic Details
Main authors: Jarquín-Vásquez, Horacio Jesús; Montes-y-Gómez, Manuel; Villaseñor-Pineda, Luis
Format: Online Article Text
Language: English
Published: 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7297587/
http://dx.doi.org/10.1007/978-3-030-49076-8_27
Description
Summary: The increasing propagation of abusive language in social media is a major concern for supplier companies and governments because of its negative social impact. A large number of methods have been developed for its automatic identification, ranging from dictionary-based methods to sophisticated deep learning approaches. A common problem in all these methods is distinguishing the offensive use of swear words from their everyday and humorous usage. To tackle this particular issue, we propose an attention-based neural network architecture that captures the importance of word n-grams according to their context. The results obtained on four standard collections from Twitter and Facebook are encouraging: they outperform the F1 scores of state-of-the-art methods and allow identifying a set of inherently offensive swear words, as well as others whose interpretation depends on context.