Towards Developing a Robust Intrusion Detection Model Using Hadoop–Spark and Data Augmentation for IoT Networks †
In recent years, anomaly detection and machine learning for intrusion detection systems have been used to detect anomalies on Internet of Things networks. These systems rely on machine and deep learning to improve the detection accuracy. However, the robustness of the model depends on the number of...
Main authors: | Manzano Sanchez, Ricardo Alejandro; Zaman, Marzia; Goel, Nishith; Naik, Kshirasagar; Joshi, Rohit |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9608938/ https://www.ncbi.nlm.nih.gov/pubmed/36298077 http://dx.doi.org/10.3390/s22207726 |
_version_ | 1784818890691313664 |
---|---|
author | Manzano Sanchez, Ricardo Alejandro; Zaman, Marzia; Goel, Nishith; Naik, Kshirasagar; Joshi, Rohit |
author_facet | Manzano Sanchez, Ricardo Alejandro; Zaman, Marzia; Goel, Nishith; Naik, Kshirasagar; Joshi, Rohit |
author_sort | Manzano Sanchez, Ricardo Alejandro |
collection | PubMed |
description | In recent years, anomaly detection and machine learning for intrusion detection systems have been used to detect anomalies on Internet of Things (IoT) networks. These systems rely on machine and deep learning to improve detection accuracy. However, the robustness of the model depends on the number of data samples available, the quality of the data, and the distribution of the data classes. In the present paper, we focus specifically on the amount of data and class imbalance, since both are key in IoT because network traffic is increasing exponentially. For this reason, we propose a framework that uses a big data methodology with Hadoop–Spark to train and test multi-class and binary classification with a one-vs-rest strategy for intrusion detection using the entire BoT IoT dataset. We evaluate all the algorithms available in Hadoop–Spark in terms of accuracy and processing time. In addition, since the BoT IoT dataset is highly imbalanced, we also improve the accuracy for detecting minority classes by generating more data samples using a Conditional Tabular Generative Adversarial Network (CTGAN). In general, our proposed model outperforms other published models, including our previous model. Using our proposed methodology, the F1-score of one of the minority classes, the Theft attack, was improved from 42% to 99%. |
format | Online Article Text |
id | pubmed-9608938 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9608938 2022-10-28 Towards Developing a Robust Intrusion Detection Model Using Hadoop–Spark and Data Augmentation for IoT Networks † Manzano Sanchez, Ricardo Alejandro; Zaman, Marzia; Goel, Nishith; Naik, Kshirasagar; Joshi, Rohit. Sensors (Basel). Article. MDPI 2022-10-12 /pmc/articles/PMC9608938/ /pubmed/36298077 http://dx.doi.org/10.3390/s22207726 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Towards Developing a Robust Intrusion Detection Model Using Hadoop–Spark and Data Augmentation for IoT Networks † |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9608938/ https://www.ncbi.nlm.nih.gov/pubmed/36298077 http://dx.doi.org/10.3390/s22207726 |