A Comparative Study of Time Series Anomaly Detection Models for Industrial Control Systems

Bibliographic Details
Main Authors: Kim, Bedeuro, Alawami, Mohsen Ali, Kim, Eunsoo, Oh, Sanghak, Park, Jeongyong, Kim, Hyoungshick
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9921147/
https://www.ncbi.nlm.nih.gov/pubmed/36772349
http://dx.doi.org/10.3390/s23031310
Description
Summary: Anomaly detection is known as an effective technique to detect faults or cyber-attacks in industrial control systems (ICS), and many anomaly detection models have therefore been proposed for ICS. However, most models have been implemented and evaluated under specific circumstances, which leads to confusion about choosing the best model in a real-world situation. In other words, a comprehensive comparison of state-of-the-art anomaly detection models under common experimental configurations is still missing. To address this problem, we conduct a comparative study of five representative time series anomaly detection models: InterFusion, RANSynCoder, GDN, LSTM-ED, and USAD. We specifically compare the models' detection accuracy and training and testing times on two publicly available datasets: SWaT and HAI. The experimental results show that the best-performing model differs across datasets: InterFusion achieves the highest F1-score of 90.7% on SWaT, while RANSynCoder achieves the highest F1-score of 82.9% on HAI. We also investigate the effect of the training set size on the performance of anomaly detection models, and we find that about 40% of the entire training set is sufficient to build a model whose performance is similar to that of a model trained on the entire training set.
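The F1-score used to rank the models above is the harmonic mean of precision and recall. As a minimal sketch (the precision/recall values below are hypothetical, chosen only to illustrate how an F1-score near the reported 90.7% could arise; they are not taken from the paper):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative (hypothetical) detector with precision 0.93 and recall 0.885:
# the harmonic mean lands close to the 90.7% F1-score reported for
# InterFusion on SWaT.
print(round(f1_score(0.93, 0.885), 3))  # → 0.907
```

Because the harmonic mean is dominated by the smaller of the two values, a detector cannot compensate for poor recall with high precision (or vice versa), which is why F1 is the standard single-number summary for anomaly detection benchmarks.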