Not All Swear Words Are Used Equal: Attention over Word n-grams for Abusive Language Identification
The increasing propagation of abusive language in social media is a major concern for supplier companies and governments because of its negative social impact. A large number of methods have been developed for its automatic identification, ranging from dictionary-based methods to sophisticated deep...
Main Authors: | Jarquín-Vásquez, Horacio Jesús; Montes-y-Gómez, Manuel; Villaseñor-Pineda, Luis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7297587/ http://dx.doi.org/10.1007/978-3-030-49076-8_27 |
_version_ | 1783547037774512128 |
---|---|
author | Jarquín-Vásquez, Horacio Jesús Montes-y-Gómez, Manuel Villaseñor-Pineda, Luis |
author_sort | Jarquín-Vásquez, Horacio Jesús |
collection | PubMed |
description | The increasing propagation of abusive language in social media is a major concern for supplier companies and governments because of its negative social impact. A large number of methods have been developed for its automatic identification, ranging from dictionary-based methods to sophisticated deep learning approaches. A common problem in all these methods is distinguishing the offensive use of swear words from their everyday and humorous usage. To tackle this issue, we propose an attention-based neural network architecture that captures the importance of word n-grams according to their context. The results obtained on four standard collections from Twitter and Facebook are encouraging: they outperform the [Formula: see text] scores of state-of-the-art methods and make it possible to identify a set of inherently offensive swear words, as well as others whose interpretation depends on context. |
format | Online Article Text |
id | pubmed-7297587 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-72975872020-06-17 Not All Swear Words Are Used Equal: Attention over Word n-grams for Abusive Language Identification Jarquín-Vásquez, Horacio Jesús Montes-y-Gómez, Manuel Villaseñor-Pineda, Luis Pattern Recognition Article The increasing propagation of abusive language in social media is a major concern for supplier companies and governments because of its negative social impact. A large number of methods have been developed for its automatic identification, ranging from dictionary-based methods to sophisticated deep learning approaches. A common problem in all these methods is to distinguish the offensive use of swear words from their everyday and humorous usage. To tackle this particular issue we propose an attention-based neural network architecture that captures the word n-grams importance according to their context. The obtained results in four standard collections from Twitter and Facebook are encouraging, they outperform the [Formula: see text] scores from state-of-the-art methods and allow identifying a set of inherently offensive swear words, and others in which its interpretation depends on its context. 2020-04-29 /pmc/articles/PMC7297587/ http://dx.doi.org/10.1007/978-3-030-49076-8_27 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
title | Not All Swear Words Are Used Equal: Attention over Word n-grams for Abusive Language Identification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7297587/ http://dx.doi.org/10.1007/978-3-030-49076-8_27 |