Utilizing Text Mining, Data Linkage and Deep Learning in Police and Health Records to Predict Future Offenses in Family and Domestic Violence

Family and Domestic violence (FDV) is a global problem with significant social, economic, and health consequences for victims including increased health care costs, mental trauma, and social stigmatization. In Australia, the estimated annual cost of FDV is $22 billion, with one woman being murdered...

Bibliographic Details
Main authors: Karystianis, George; Cabral, Rina Carines; Han, Soyeon Caren; Poon, Josiah; Butler, Tony
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Digital Health
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8521947/
https://www.ncbi.nlm.nih.gov/pubmed/34713088
http://dx.doi.org/10.3389/fdgth.2021.602683
author Karystianis, George
Cabral, Rina Carines
Han, Soyeon Caren
Poon, Josiah
Butler, Tony
collection PubMed
description Family and domestic violence (FDV) is a global problem with significant social, economic, and health consequences for victims, including increased health care costs, mental trauma, and social stigmatization. In Australia, the estimated annual cost of FDV is $22 billion, with one woman murdered by a current or former partner every week. Despite this, tools that can predict future FDV based on the features of the person of interest (POI) and the victim are lacking. The New South Wales (NSW) Police Force attends thousands of FDV events each year and records details as fixed fields (e.g., demographic information for individuals involved in the event) and as text narratives that describe abuse types, victim injuries, threats, and the mental health status of POIs and victims. The information within the narratives is mostly untapped for research and reporting purposes. After applying a text mining methodology to extract information from 492,393 FDV event narratives (abuse types, victim injuries, mental illness mentions), we linked these characteristics with the respective fixed fields and with actual mental health diagnoses obtained from the NSW Ministry of Health for the same cohort to form a comprehensive FDV dataset. These data were input into five deep learning models (MLP, LSTM, Bi-LSTM, Bi-GRU, BERT) to predict three FDV offense types (“hands-on,” “hands-off,” and “Apprehended Domestic Violence Order (ADVO) breach”). The transformer model with BERT embeddings returned the best performance (69.00% accuracy; 66.76% ROC) for “ADVO breach” in a multilabel classification setup, while the binary classification setup generated similar results. “Hands-off” offenses proved the hardest offense type to predict (60.72% accuracy; 57.86% ROC using BERT) but showed potential to improve with fine-tuning of binary classification setups. “Hands-on” offenses benefitted least from the contextual information gained through BERT embeddings: MLP with categorical embeddings outperformed BERT in three out of four metrics (65.95% accuracy; 78.03% F1-score; 70.00% precision). The encouraging results indicate that future FDV offenses can be predicted using deep learning on a large corpus of police and health data. Incorporating additional data sources will likely increase performance, which can assist those working on FDV and law enforcement to improve outcomes and better manage FDV events.
format Online
Article
Text
id pubmed-8521947
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8521947 2021-10-27 Front Digit Health Digital Health Frontiers Media S.A. 2021-02-17 /pmc/articles/PMC8521947/ /pubmed/34713088 http://dx.doi.org/10.3389/fdgth.2021.602683 Text en Copyright © 2021 Karystianis, Cabral, Han, Poon and Butler. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Utilizing Text Mining, Data Linkage and Deep Learning in Police and Health Records to Predict Future Offenses in Family and Domestic Violence
topic Digital Health
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8521947/
https://www.ncbi.nlm.nih.gov/pubmed/34713088
http://dx.doi.org/10.3389/fdgth.2021.602683