On the Impact of Network Data Balancing in Cybersecurity Applications
Machine learning methods are now widely used to detect a wide range of cyberattacks. Nevertheless, the commonly used algorithms come with challenges of their own - one of them lies in network dataset characteristics. The dataset should be well-balanced in terms of the number of malicious data sample...
Main authors: | Pawlicki, Marek; Choraś, Michał; Kozik, Rafał; Hołubowicz, Witold |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7303680/ http://dx.doi.org/10.1007/978-3-030-50423-6_15 |
_version_ | 1783548111662088192 |
---|---|
author | Pawlicki, Marek; Choraś, Michał; Kozik, Rafał; Hołubowicz, Witold |
author_facet | Pawlicki, Marek; Choraś, Michał; Kozik, Rafał; Hołubowicz, Witold |
author_sort | Pawlicki, Marek |
collection | PubMed |
description | Machine learning methods are now widely used to detect a wide range of cyberattacks. Nevertheless, the commonly used algorithms come with challenges of their own - one of them lies in network dataset characteristics. The dataset should be well-balanced in terms of the number of malicious data samples vs. benign traffic samples to achieve adequate results. When the data is not balanced, numerous machine learning approaches show a tendency to classify minority class samples as majority class samples. Since usually in network traffic data there are significantly fewer malicious samples than benign samples, in this work the problem of learning from imbalanced network traffic data in the cybersecurity domain is addressed. A number of balancing approaches is evaluated along with their impact on different machine learning algorithms. |
format | Online Article Text |
id | pubmed-7303680 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-7303680 2020-06-19 On the Impact of Network Data Balancing in Cybersecurity Applications Pawlicki, Marek; Choraś, Michał; Kozik, Rafał; Hołubowicz, Witold Computational Science – ICCS 2020 Article Machine learning methods are now widely used to detect a wide range of cyberattacks. Nevertheless, the commonly used algorithms come with challenges of their own - one of them lies in network dataset characteristics. The dataset should be well-balanced in terms of the number of malicious data samples vs. benign traffic samples to achieve adequate results. When the data is not balanced, numerous machine learning approaches show a tendency to classify minority class samples as majority class samples. Since usually in network traffic data there are significantly fewer malicious samples than benign samples, in this work the problem of learning from imbalanced network traffic data in the cybersecurity domain is addressed. A number of balancing approaches is evaluated along with their impact on different machine learning algorithms. 2020-05-23 /pmc/articles/PMC7303680/ http://dx.doi.org/10.1007/978-3-030-50423-6_15 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Pawlicki, Marek Choraś, Michał Kozik, Rafał Hołubowicz, Witold On the Impact of Network Data Balancing in Cybersecurity Applications |
title | On the Impact of Network Data Balancing in Cybersecurity Applications |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7303680/ http://dx.doi.org/10.1007/978-3-030-50423-6_15 |