
Discovering anomalies in big data: a review focused on the application of metaheuristics and machine learning techniques

With the increase in available data from computer systems and their security threats, interest in anomaly detection has increased as well in recent years. The need to diagnose faults and cyberattacks has also focused scientific research on the automated classification of outliers in big data, as man...


Bibliographic Details
Main authors: Cavallaro, Claudia; Cutello, Vincenzo; Pavone, Mario; Zito, Francesco
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10470118/
https://www.ncbi.nlm.nih.gov/pubmed/37663272
http://dx.doi.org/10.3389/fdata.2023.1179625
author Cavallaro, Claudia
Cutello, Vincenzo
Pavone, Mario
Zito, Francesco
collection PubMed
description With the increase in available data from computer systems and their security threats, interest in anomaly detection has increased as well in recent years. The need to diagnose faults and cyberattacks has also focused scientific research on the automated classification of outliers in big data, as manual labeling is difficult in practice due to their huge volumes. The results obtained from data analysis can be used to generate alarms that anticipate anomalies and thus prevent system failures and attacks. Therefore, anomaly detection has the purpose of reducing maintenance costs as well as making decisions based on reports. During the last decade, the approaches proposed in the literature to classify unknown anomalies in log analysis, process analysis, and time series have been mainly based on machine learning and deep learning techniques. In this study, we provide an overview of current state-of-the-art methodologies, highlighting their advantages and disadvantages and the new challenges. In particular, we will see that there is no absolute best method, i.e., for any given dataset a different method may achieve the best result. Finally, we describe how the use of metaheuristics within machine learning algorithms makes it possible to have more robust and efficient tools.
format Online
Article
Text
id pubmed-10470118
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10470118 2023-09-01 Discovering anomalies in big data: a review focused on the application of metaheuristics and machine learning techniques. Cavallaro, Claudia; Cutello, Vincenzo; Pavone, Mario; Zito, Francesco. Front Big Data. Big Data. Frontiers Media S.A. 2023-08-17. /pmc/articles/PMC10470118/ /pubmed/37663272 http://dx.doi.org/10.3389/fdata.2023.1179625 Text en. Copyright © 2023 Cavallaro, Cutello, Pavone and Zito. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Discovering anomalies in big data: a review focused on the application of metaheuristics and machine learning techniques
topic Big Data
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10470118/
https://www.ncbi.nlm.nih.gov/pubmed/37663272
http://dx.doi.org/10.3389/fdata.2023.1179625