Automatic detection of cyberbullying in social media text

While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online. Recent studies report that cyberbullying constitutes a growing problem among youngsters. Successful prevention depends on the adequate detection of poten...

Bibliographic Details
Main authors: Van Hee, Cynthia; Jacobs, Gilles; Emmery, Chris; Desmet, Bart; Lefever, Els; Verhoeven, Ben; De Pauw, Guy; Daelemans, Walter; Hoste, Véronique
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6175271/
https://www.ncbi.nlm.nih.gov/pubmed/30296299
http://dx.doi.org/10.1371/journal.pone.0203794
author Van Hee, Cynthia
Jacobs, Gilles
Emmery, Chris
Desmet, Bart
Lefever, Els
Verhoeven, Ben
De Pauw, Guy
Daelemans, Walter
Hoste, Véronique
collection PubMed
description While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online. Recent studies report that cyberbullying constitutes a growing problem among youngsters. Successful prevention depends on the adequate detection of potentially harmful messages, and the information overload on the Web requires intelligent systems to identify potential risks automatically. The focus of this paper is on automatic cyberbullying detection in social media text by modelling posts written by bullies, victims, and bystanders of online bullying. We describe the collection and fine-grained annotation of a cyberbullying corpus for English and Dutch and perform a series of binary classification experiments to determine the feasibility of automatic cyberbullying detection. We make use of linear support vector machines exploiting a rich feature set and investigate which information sources contribute the most to the task. Experiments on a hold-out test set reveal promising results for the detection of cyberbullying-related posts. After optimisation of the hyperparameters, the classifier yields an F1 score of 64% and 61% for English and Dutch respectively, and considerably outperforms baseline systems.
format Online
Article
Text
id pubmed-6175271
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6175271 2018-10-19 Automatic detection of cyberbullying in social media text PLoS One Research Article
Public Library of Science 2018-10-08 /pmc/articles/PMC6175271/ /pubmed/30296299 http://dx.doi.org/10.1371/journal.pone.0203794 Text en © 2018 Van Hee et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Automatic detection of cyberbullying in social media text
topic Research Article