Hierarchical approaches to Text-based Offense Classification
Researchers working with administrative crime data often must classify offense narratives into a common scheme for analysis purposes. No comprehensive standard currently exists, nor is there a mapping tool to transform raw descriptions into offense types. This paper introduces a new schema, the Unif...
Main authors: | Choi, Jay; Kilmer, David; Mueller-Smith, Michael; Taheri, Sema A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Association for the Advancement of Science, 2023 |
Subjects: | Social and Interdisciplinary Sciences |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9984170/ https://www.ncbi.nlm.nih.gov/pubmed/36867702 http://dx.doi.org/10.1126/sciadv.abq8123 |
_version_ | 1784900693779283968 |
---|---|
author | Choi, Jay; Kilmer, David; Mueller-Smith, Michael; Taheri, Sema A. |
author_facet | Choi, Jay; Kilmer, David; Mueller-Smith, Michael; Taheri, Sema A. |
author_sort | Choi, Jay |
collection | PubMed |
description | Researchers working with administrative crime data often must classify offense narratives into a common scheme for analysis purposes. No comprehensive standard currently exists, nor is there a mapping tool to transform raw descriptions into offense types. This paper introduces a new schema, the Uniform Crime Classification Standard (UCCS), and the Text-based Offense Classification (TOC) tool to address these shortcomings. The UCCS schema draws from existing efforts, aiming to better reflect offense severity and improve type disambiguation. The TOC tool is a machine learning algorithm that uses a hierarchical, multilayer perceptron classification framework, built on 313,209 hand-coded offense descriptions from 24 states, to translate raw descriptions into UCCS codes. We test how variations in data processing and modeling approaches affect recall, precision, and F1 scores to assess their relative influence on model performance. The code scheme and classification tool are collaborations between Measures for Justice and the Criminal Justice Administrative Records System. |
format | Online Article Text |
id | pubmed-9984170 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Association for the Advancement of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-99841702023-03-04 Hierarchical approaches to Text-based Offense Classification Choi, Jay Kilmer, David Mueller-Smith, Michael Taheri, Sema A. Sci Adv Social and Interdisciplinary Sciences Researchers working with administrative crime data often must classify offense narratives into a common scheme for analysis purposes. No comprehensive standard currently exists, nor is there a mapping tool to transform raw descriptions into offense types. This paper introduces a new schema, the Uniform Crime Classification Standard (UCCS), and the Text-based Offense Classification (TOC) tool to address these shortcomings. The UCCS schema draws from existing efforts, aiming to better reflect offense severity and improve type disambiguation. The TOC tool is a machine learning algorithm that uses a hierarchical, multilayer perceptron classification framework, built on 313,209 hand-coded offense descriptions from 24 states, to translate raw descriptions into UCCS codes. We test how variations in data processing and modeling approaches affect recall, precision, and F1 scores to assess their relative influence on model performance. The code scheme and classification tool are collaborations between Measures for Justice and the Criminal Justice Administrative Records System. American Association for the Advancement of Science 2023-03-03 /pmc/articles/PMC9984170/ /pubmed/36867702 http://dx.doi.org/10.1126/sciadv.abq8123 Text en Copyright © 2023 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution License 4.0 (CC BY). 
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Social and Interdisciplinary Sciences Choi, Jay Kilmer, David Mueller-Smith, Michael Taheri, Sema A. Hierarchical approaches to Text-based Offense Classification |
title | Hierarchical approaches to Text-based Offense Classification |
title_full | Hierarchical approaches to Text-based Offense Classification |
title_fullStr | Hierarchical approaches to Text-based Offense Classification |
title_full_unstemmed | Hierarchical approaches to Text-based Offense Classification |
title_short | Hierarchical approaches to Text-based Offense Classification |
title_sort | hierarchical approaches to text-based offense classification |
topic | Social and Interdisciplinary Sciences |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9984170/ https://www.ncbi.nlm.nih.gov/pubmed/36867702 http://dx.doi.org/10.1126/sciadv.abq8123 |
work_keys_str_mv | AT choijay hierarchicalapproachestotextbasedoffenseclassification AT kilmerdavid hierarchicalapproachestotextbasedoffenseclassification AT muellersmithmichael hierarchicalapproachestotextbasedoffenseclassification AT taherisemaa hierarchicalapproachestotextbasedoffenseclassification |