Data set and machine learning models for the classification of network traffic originators

The widespread adoption of encryption in computer network traffic is increasing the difficulty of analyzing such traffic for security purposes. The data set presented in this data article is composed of network statistics computed on captures of TCP flows, originated by executing various network str...

Bibliographic Details
Main authors: Canavese, Daniele; Regano, Leonardo; Basile, Cataldo; Ciravegna, Gabriele; Lioy, Antonio
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8920864/
https://www.ncbi.nlm.nih.gov/pubmed/35300388
http://dx.doi.org/10.1016/j.dib.2022.107968
_version_ 1784669216319733760
author Canavese, Daniele
Regano, Leonardo
Basile, Cataldo
Ciravegna, Gabriele
Lioy, Antonio
author_facet Canavese, Daniele
Regano, Leonardo
Basile, Cataldo
Ciravegna, Gabriele
Lioy, Antonio
author_sort Canavese, Daniele
collection PubMed
description The widespread adoption of encryption in computer network traffic is increasing the difficulty of analyzing such traffic for security purposes. The data set presented in this data article is composed of network statistics computed on captures of TCP flows, originated by executing various network stress and web crawling tools, along with statistics of benign web browsing traffic. Furthermore, this data article describes a set of Machine Learning models, trained using the described data set, which can classify network traffic by the tool category (network stress tool, web crawler, web browser), the specific tool (e.g., Firefox), and also the tool version (e.g., Firefox 68) used to generate it. These models are compatible with the analysis of traffic with encrypted payload since statistics are evaluated only on the TCP headers of the packets. The data presented in this article can be useful to train and assess the performance of new Machine Learning models for tool classification.
format Online
Article
Text
id pubmed-8920864
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8920864 2022-03-16 Data set and machine learning models for the classification of network traffic originators Canavese, Daniele Regano, Leonardo Basile, Cataldo Ciravegna, Gabriele Lioy, Antonio Data Brief Data Article The widespread adoption of encryption in computer network traffic is increasing the difficulty of analyzing such traffic for security purposes. The data set presented in this data article is composed of network statistics computed on captures of TCP flows, originated by executing various network stress and web crawling tools, along with statistics of benign web browsing traffic. Furthermore, this data article describes a set of Machine Learning models, trained using the described data set, which can classify network traffic by the tool category (network stress tool, web crawler, web browser), the specific tool (e.g., Firefox), and also the tool version (e.g., Firefox 68) used to generate it. These models are compatible with the analysis of traffic with encrypted payload since statistics are evaluated only on the TCP headers of the packets. The data presented in this article can be useful to train and assess the performance of new Machine Learning models for tool classification. Elsevier 2022-03-03 /pmc/articles/PMC8920864/ /pubmed/35300388 http://dx.doi.org/10.1016/j.dib.2022.107968 Text en © 2022 Published by Elsevier Inc. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Data Article
Canavese, Daniele
Regano, Leonardo
Basile, Cataldo
Ciravegna, Gabriele
Lioy, Antonio
Data set and machine learning models for the classification of network traffic originators
title Data set and machine learning models for the classification of network traffic originators
title_full Data set and machine learning models for the classification of network traffic originators
title_fullStr Data set and machine learning models for the classification of network traffic originators
title_full_unstemmed Data set and machine learning models for the classification of network traffic originators
title_short Data set and machine learning models for the classification of network traffic originators
title_sort data set and machine learning models for the classification of network traffic originators
topic Data Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8920864/
https://www.ncbi.nlm.nih.gov/pubmed/35300388
http://dx.doi.org/10.1016/j.dib.2022.107968
work_keys_str_mv AT canavesedaniele datasetandmachinelearningmodelsfortheclassificationofnetworktrafficoriginators
AT reganoleonardo datasetandmachinelearningmodelsfortheclassificationofnetworktrafficoriginators
AT basilecataldo datasetandmachinelearningmodelsfortheclassificationofnetworktrafficoriginators
AT ciravegnagabriele datasetandmachinelearningmodelsfortheclassificationofnetworktrafficoriginators
AT lioyantonio datasetandmachinelearningmodelsfortheclassificationofnetworktrafficoriginators