Dynamic Feature Dataset for Ransomware Detection Using Machine Learning Algorithms

Ransomware-related cyber-attacks have been on the rise over the last decade, disturbing organizations considerably. Developing new and better ways to detect this type of malware is necessary. This research applies dynamic analysis and machine learning to identify the ever-evolving ransomware signatu...

Bibliographic Details
Main authors: Herrera-Silva, Juan A., Hernández-Álvarez, Myriam
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9920393/
https://www.ncbi.nlm.nih.gov/pubmed/36772092
http://dx.doi.org/10.3390/s23031053
_version_ 1784887059136118784
author Herrera-Silva, Juan A.
Hernández-Álvarez, Myriam
author_facet Herrera-Silva, Juan A.
Hernández-Álvarez, Myriam
author_sort Herrera-Silva, Juan A.
collection PubMed
description Ransomware-related cyber-attacks have been on the rise over the last decade, disturbing organizations considerably. Developing new and better ways to detect this type of malware is necessary. This research applies dynamic analysis and machine learning to identify the ever-evolving ransomware signatures using selected dynamic features. Since most of the attributes are shared by diverse ransomware-affected samples, our study can be used for detecting current and even new variants of the threat. This research has the following objectives: (1) Execute experiments with encryptor and locker ransomware combined with goodware to generate JSON files with dynamic parameters using a sandbox. (2) Analyze and select the most relevant and non-redundant dynamic features for identifying encryptor and locker ransomware from goodware. (3) Generate and make public a dynamic features dataset that includes these selected parameters for samples of different artifacts. (4) Apply the dynamic feature dataset to obtain models with machine learning algorithms. Five platforms, 20 ransomware, and 20 goodware artifacts were evaluated. The final feature dataset is composed of 2000 registers of 50 characteristics each. This dataset allows for a machine learning detection with a 10-fold cross-evaluation with an average accuracy superior to 0.99 for gradient boosted regression trees, random forest, and neural networks.
format Online
Article
Text
id pubmed-9920393
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9920393 2023-02-12 Dynamic Feature Dataset for Ransomware Detection Using Machine Learning Algorithms Herrera-Silva, Juan A. Hernández-Álvarez, Myriam Sensors (Basel) Article Ransomware-related cyber-attacks have been on the rise over the last decade, disturbing organizations considerably. Developing new and better ways to detect this type of malware is necessary. This research applies dynamic analysis and machine learning to identify the ever-evolving ransomware signatures using selected dynamic features. Since most of the attributes are shared by diverse ransomware-affected samples, our study can be used for detecting current and even new variants of the threat. This research has the following objectives: (1) Execute experiments with encryptor and locker ransomware combined with goodware to generate JSON files with dynamic parameters using a sandbox. (2) Analyze and select the most relevant and non-redundant dynamic features for identifying encryptor and locker ransomware from goodware. (3) Generate and make public a dynamic features dataset that includes these selected parameters for samples of different artifacts. (4) Apply the dynamic feature dataset to obtain models with machine learning algorithms. Five platforms, 20 ransomware, and 20 goodware artifacts were evaluated. The final feature dataset is composed of 2000 registers of 50 characteristics each. This dataset allows for a machine learning detection with a 10-fold cross-evaluation with an average accuracy superior to 0.99 for gradient boosted regression trees, random forest, and neural networks. MDPI 2023-01-17 /pmc/articles/PMC9920393/ /pubmed/36772092 http://dx.doi.org/10.3390/s23031053 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Herrera-Silva, Juan A.
Hernández-Álvarez, Myriam
Dynamic Feature Dataset for Ransomware Detection Using Machine Learning Algorithms
title Dynamic Feature Dataset for Ransomware Detection Using Machine Learning Algorithms
title_full Dynamic Feature Dataset for Ransomware Detection Using Machine Learning Algorithms
title_fullStr Dynamic Feature Dataset for Ransomware Detection Using Machine Learning Algorithms
title_full_unstemmed Dynamic Feature Dataset for Ransomware Detection Using Machine Learning Algorithms
title_short Dynamic Feature Dataset for Ransomware Detection Using Machine Learning Algorithms
title_sort dynamic feature dataset for ransomware detection using machine learning algorithms
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9920393/
https://www.ncbi.nlm.nih.gov/pubmed/36772092
http://dx.doi.org/10.3390/s23031053
work_keys_str_mv AT herrerasilvajuana dynamicfeaturedatasetforransomwaredetectionusingmachinelearningalgorithms
AT hernandezalvarezmyriam dynamicfeaturedatasetforransomwaredetectionusingmachinelearningalgorithms