Examining the impact of cross-domain learning on crime prediction

Bibliographic Details
Main Authors: Bappee, Fateha Khanam; Soares, Amilcar; Petry, Lucas May; Matwin, Stan
Format: Online Article Text
Language: English
Published: Springer International Publishing 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8570338/
https://www.ncbi.nlm.nih.gov/pubmed/34760434
http://dx.doi.org/10.1186/s40537-021-00489-9
Description
Summary: Nowadays, urban data such as demographics, infrastructure, and criminal records are becoming more accessible to researchers. This has led to improvements in quantitative crime research for predicting future crime occurrence by identifying factors and knowledge from instances that contribute to criminal activities. While crime is distributed asymmetrically across geographic space, analogous, implicit criminogenic factors are often hidden in the data. Moreover, because data are less available and less comprehensive, especially for smaller cities, it is challenging to build a uniform framework for all geographic regions. This paper addresses the crime prediction task from a cross-domain perspective to tackle the data insufficiency problem in a small city. We create a uniform outline for Halifax, Nova Scotia, one of Canada’s geographic regions, by adapting and learning knowledge from two different domains, Toronto and Vancouver, whose data follow distributions that are different from but related to that of Halifax. To transfer knowledge between the source and target domains, we propose applying instance-based transfer learning settings. Each setting is directed at learning knowledge from a seasonal perspective with cross-domain data fusion. We choose ensemble learning methods for model building because they generalize well to new data. We evaluate the classification performance for both single and multi-domain representations and compare the results with baseline models. Our findings show that the proposed data-driven approach, which integrates multiple sources of data, performs satisfactorily.