Hybrid feature selection-based machine learning Classification system for the prediction of injury severity in single and multiple-vehicle accidents

To undertake a reliable analysis of injury severity in road traffic accidents, a complete understanding of important attributes is essential. As a result of the shift from traditional statistical parametric procedures to computer-aided methods, machine learning approaches have become an important as...

Bibliographic Details
Main authors: Zhang, Shuguang, Khattak, Afaq, Matara, Caroline Mongina, Hussain, Arshad, Farooq, Asim
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8809572/
https://www.ncbi.nlm.nih.gov/pubmed/35108288
http://dx.doi.org/10.1371/journal.pone.0262941
_version_ 1784644046044528640
author Zhang, Shuguang
Khattak, Afaq
Matara, Caroline Mongina
Hussain, Arshad
Farooq, Asim
collection PubMed
description A reliable analysis of injury severity in road traffic accidents requires a complete understanding of the important attributes. With the shift from traditional parametric statistical procedures to computer-aided methods, machine learning approaches have become an important tool for predicting the severity of road traffic injuries. The paper presents a hybrid feature selection-based machine learning classification approach for detecting significant attributes and predicting injury severity in single and multiple-vehicle accidents. To begin, we employed a Random Forests (RF) classifier in conjunction with an intrinsic wrapper-based feature selection approach, the Boruta Algorithm (BA), to find the attributes that determine injury severity. The influential attributes were then fed into four classifiers to predict injury severity: Naive Bayes (NB), K-Nearest Neighbor (K-NN), Binary Logistic Regression (BLR), and Extreme Gradient Boosting (XGBoost). According to the BA results, vehicle type was the most influential factor, followed by the month of the year, the driver’s age, and the alignment of the road segment. The driver’s gender, the presence of a median, and the presence of a shoulder were all found to be unimportant. On the classifier performance measures, XGBoost outperformed the other classifiers. Using the selected attributes, the accuracy, Cohen’s Kappa, F1-Measure, and AUC-ROC values of XGBoost were 82.10%, 0.607, 0.776, and 0.880 for single-vehicle accidents and 79.52%, 0.569, 0.752, and 0.86 for multiple-vehicle accidents, respectively.
format Online
Article
Text
id pubmed-8809572
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8809572 2022-02-03 PLoS One Research Article
Public Library of Science 2022-02-02 /pmc/articles/PMC8809572/ /pubmed/35108288 http://dx.doi.org/10.1371/journal.pone.0262941 Text en © 2022 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Hybrid feature selection-based machine learning Classification system for the prediction of injury severity in single and multiple-vehicle accidents
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8809572/
https://www.ncbi.nlm.nih.gov/pubmed/35108288
http://dx.doi.org/10.1371/journal.pone.0262941
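
The description reports accuracy, Cohen's Kappa, and F1-Measure for each classifier but the record does not give their formulas. The following is a minimal sketch of how these three measures can be computed from predicted and true labels, assuming a binary injury-severity outcome. The function name `binary_metrics`, the label coding (1 for the positive class), and the toy data are illustrative assumptions and are not taken from the article. AUC-ROC is left out because it needs predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, Cohen's kappa and F1 for binary labels.

    Hypothetical helper for illustration; not the authors' code.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    pairs = list(zip(y_true, y_pred))

    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = n - tp - fp - fn

    # Observed agreement between predictions and truth.
    accuracy = (tp + tn) / n

    # Agreement expected by chance, from the row and column marginals.
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)

    # Kappa rescales accuracy so that chance-level agreement scores 0.
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    # F1 is the harmonic mean of precision and recall.
    denom = 2 * tp + fp + fn
    f1 = (2 * tp / denom) if denom else 0.0

    return {"accuracy": accuracy, "kappa": kappa, "f1": f1}


# Toy labels (assumed coding: 1 = severe injury, 0 = otherwise).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
```

Kappa is always at most the accuracy, because it discounts the agreement a classifier would reach by chance given the class frequencies. This is one way to read why the reported Kappa values (0.607, 0.569) sit well below the reported accuracies.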