Natural language model for automatic identification of Intimate Partner Violence reports from Twitter

Intimate partner violence (IPV) is a preventable public health problem that affects millions of people worldwide. Approximately one in four women are estimated to be or have been victims of severe violence at some point in their lives, irrespective of age, ethnicity, and economic status. Victims oft...

Bibliographic Details
Main Authors: Al-Garadi, Mohammed Ali, Kim, Sangmi, Guo, Yuting, Warren, Elise, Yang, Yuan-Chi, Lakamana, Sahithi, Sarker, Abeed
Format: Online Article Text
Language: English
Published: 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10065459/
https://www.ncbi.nlm.nih.gov/pubmed/37006948
http://dx.doi.org/10.1016/j.array.2022.100217
_version_ 1785018116923719680
author Al-Garadi, Mohammed Ali
Kim, Sangmi
Guo, Yuting
Warren, Elise
Yang, Yuan-Chi
Lakamana, Sahithi
Sarker, Abeed
author_sort Al-Garadi, Mohammed Ali
collection PubMed
description Intimate partner violence (IPV) is a preventable public health problem that affects millions of people worldwide. Approximately one in four women are estimated to be or have been victims of severe violence at some point in their lives, irrespective of age, ethnicity, and economic status. Victims often report IPV experiences on social media, and automatic detection of such reports via machine learning may enable improved surveillance and targeted distribution of support and/or interventions for those in need. However, no artificial intelligence systems for automatic detection currently exist, and we attempted to address this research gap. We collected posts from Twitter using a list of IPV-related keywords, manually reviewed subsets of retrieved posts, and prepared annotation guidelines to categorize tweets into IPV-report or non-IPV-report. We annotated 6,348 tweets in total, with an inter-annotator agreement (IAA) of 0.86 (Cohen’s kappa) among 1,834 double-annotated tweets. The class distribution in the annotated dataset was highly imbalanced, with only 668 posts (~11%) labeled as IPV-report. We then developed an effective natural language processing model to identify IPV-reporting tweets automatically. The developed model achieved classification F1-scores of 0.76 for the IPV-report class and 0.97 for the non-IPV-report class. We conducted post-classification analyses to determine the causes of system errors and to ensure that the system did not exhibit biases in its decision making, particularly with respect to race and gender. Our automatic model can be an essential component for a proactive social media-based intervention and support framework, while also aiding population-level surveillance and large-scale cohort studies.
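The abstract reports per-class F1-scores and Cohen's kappa rather than plain accuracy. The sketch below is illustrative only and is not the authors' code: it uses the class counts given in the abstract (668 IPV-report tweets out of 6,348) to show how a hypothetical baseline that always predicts non-IPV-report would score. The function names and label strings are assumptions made for this example.

```python
from collections import Counter

def f1_for_class(gold, pred, label):
    """F1-score for one class, computed from true/false positives and false negatives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    count_a, count_b = Counter(a), Counter(b)
    labels = count_a.keys() | count_b.keys()
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both label sequences use one identical label throughout
    return (observed - expected) / (1 - expected)

# Class counts taken from the abstract; the label strings are hypothetical.
gold = ["IPV"] * 668 + ["non-IPV"] * (6348 - 668)
# Hypothetical baseline that always predicts the majority class.
majority = ["non-IPV"] * len(gold)

accuracy = sum(g == p for g, p in zip(gold, majority)) / len(gold)
f1_ipv = f1_for_class(gold, majority, "IPV")
f1_non_ipv = f1_for_class(gold, majority, "non-IPV")
kappa = cohen_kappa(gold, majority)

print(f"accuracy={accuracy:.3f} F1(IPV)={f1_ipv:.3f} "
      f"F1(non-IPV)={f1_non_ipv:.3f} kappa={kappa:.3f}")
```

Under these assumptions the baseline reaches roughly 89% accuracy while finding no IPV reports at all, which is why the F1-score for the minority class is the more informative figure on a dataset this imbalanced.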
format Online
Article
Text
id pubmed-10065459
institution National Center for Biotechnology Information
language English
publishDate 2022
record_format MEDLINE/PubMed
spelling pubmed-10065459 2023-03-31 Natural language model for automatic identification of Intimate Partner Violence reports from Twitter Al-Garadi, Mohammed Ali Kim, Sangmi Guo, Yuting Warren, Elise Yang, Yuan-Chi Lakamana, Sahithi Sarker, Abeed Array (N Y) Article 2022-09 2022-07-20 /pmc/articles/PMC10065459/ /pubmed/37006948 http://dx.doi.org/10.1016/j.array.2022.100217 Text en This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Natural language model for automatic identification of Intimate Partner Violence reports from Twitter
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10065459/
https://www.ncbi.nlm.nih.gov/pubmed/37006948
http://dx.doi.org/10.1016/j.array.2022.100217