
Hierarchical approaches to Text-based Offense Classification

Researchers working with administrative crime data often must classify offense narratives into a common scheme for analysis purposes. No comprehensive standard currently exists, nor is there a mapping tool to transform raw descriptions into offense types. This paper introduces a new schema, the Unif...


Bibliographic Details
Main Authors: Choi, Jay; Kilmer, David; Mueller-Smith, Michael; Taheri, Sema A.
Format: Online Article Text
Language: English
Published: American Association for the Advancement of Science 2023
Subjects: Social and Interdisciplinary Sciences
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9984170/
https://www.ncbi.nlm.nih.gov/pubmed/36867702
http://dx.doi.org/10.1126/sciadv.abq8123
_version_ 1784900693779283968
author Choi, Jay
Kilmer, David
Mueller-Smith, Michael
Taheri, Sema A.
author_facet Choi, Jay
Kilmer, David
Mueller-Smith, Michael
Taheri, Sema A.
author_sort Choi, Jay
collection PubMed
description Researchers working with administrative crime data often must classify offense narratives into a common scheme for analysis purposes. No comprehensive standard currently exists, nor is there a mapping tool to transform raw descriptions into offense types. This paper introduces a new schema, the Uniform Crime Classification Standard (UCCS), and the Text-based Offense Classification (TOC) tool to address these shortcomings. The UCCS schema draws from existing efforts, aiming to better reflect offense severity and improve type disambiguation. The TOC tool is a machine learning algorithm that uses a hierarchical, multilayer perceptron classification framework, built on 313,209 hand-coded offense descriptions from 24 states, to translate raw descriptions into UCCS codes. We test how variations in data processing and modeling approaches affect recall, precision, and F1 scores to assess their relative influence on model performance. The code scheme and classification tool are collaborations between Measures for Justice and the Criminal Justice Administrative Records System.
format Online
Article
Text
id pubmed-9984170
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Association for the Advancement of Science
record_format MEDLINE/PubMed
spelling pubmed-9984170 2023-03-04 Hierarchical approaches to Text-based Offense Classification Choi, Jay Kilmer, David Mueller-Smith, Michael Taheri, Sema A. Sci Adv Social and Interdisciplinary Sciences Researchers working with administrative crime data often must classify offense narratives into a common scheme for analysis purposes. No comprehensive standard currently exists, nor is there a mapping tool to transform raw descriptions into offense types. This paper introduces a new schema, the Uniform Crime Classification Standard (UCCS), and the Text-based Offense Classification (TOC) tool to address these shortcomings. The UCCS schema draws from existing efforts, aiming to better reflect offense severity and improve type disambiguation. The TOC tool is a machine learning algorithm that uses a hierarchical, multilayer perceptron classification framework, built on 313,209 hand-coded offense descriptions from 24 states, to translate raw descriptions into UCCS codes. We test how variations in data processing and modeling approaches affect recall, precision, and F1 scores to assess their relative influence on model performance. The code scheme and classification tool are collaborations between Measures for Justice and the Criminal Justice Administrative Records System. American Association for the Advancement of Science 2023-03-03 /pmc/articles/PMC9984170/ /pubmed/36867702 http://dx.doi.org/10.1126/sciadv.abq8123 Text en Copyright © 2023 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution License 4.0 (CC BY).
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Social and Interdisciplinary Sciences
Choi, Jay
Kilmer, David
Mueller-Smith, Michael
Taheri, Sema A.
Hierarchical approaches to Text-based Offense Classification
title Hierarchical approaches to Text-based Offense Classification
title_full Hierarchical approaches to Text-based Offense Classification
title_fullStr Hierarchical approaches to Text-based Offense Classification
title_full_unstemmed Hierarchical approaches to Text-based Offense Classification
title_short Hierarchical approaches to Text-based Offense Classification
title_sort hierarchical approaches to text-based offense classification
topic Social and Interdisciplinary Sciences
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9984170/
https://www.ncbi.nlm.nih.gov/pubmed/36867702
http://dx.doi.org/10.1126/sciadv.abq8123
work_keys_str_mv AT choijay hierarchicalapproachestotextbasedoffenseclassification
AT kilmerdavid hierarchicalapproachestotextbasedoffenseclassification
AT muellersmithmichael hierarchicalapproachestotextbasedoffenseclassification
AT taherisemaa hierarchicalapproachestotextbasedoffenseclassification