CLDTLog: System Log Anomaly Detection Method Based on Contrastive Learning and Dual Objective Tasks

System logs are a crucial component of system maintainability, as they record the status of the system and essential events for troubleshooting and maintenance when necessary. Therefore, anomaly detection of system logs is crucial. Recent research has focused on extracting semantic information from...


Bibliographic Details
Main authors: Tian, Gaoqi; Luktarhan, Nurbol; Wu, Haojie; Shi, Zhaolei
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255444/
https://www.ncbi.nlm.nih.gov/pubmed/37299767
http://dx.doi.org/10.3390/s23115042
author Tian, Gaoqi
Luktarhan, Nurbol
Wu, Haojie
Shi, Zhaolei
collection PubMed
description System logs are a crucial component of system maintainability, as they record the status of the system and essential events for troubleshooting and maintenance when necessary. Therefore, anomaly detection of system logs is crucial. Recent research has focused on extracting semantic information from unstructured log messages for log anomaly detection tasks. Since BERT models work well in natural language processing, this paper proposes an approach called CLDTLog, which introduces contrastive learning and dual-objective tasks in a BERT pre-trained model and performs anomaly detection on system logs through a fully connected layer. This approach does not require log parsing and thus can avoid the uncertainty caused by log parsing. We trained the CLDTLog model on two log datasets (HDFS and BGL) and achieved F1 scores of 0.9971 and 0.9999 on the HDFS and BGL datasets, respectively, which performed better than all known methods. In addition, when using only 1% of the BGL dataset as training data, CLDTLog still achieves an F1 score of 0.9993, showing excellent generalization performance with a significant reduction of the training cost.
format Online
Article
Text
id pubmed-10255444
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10255444 2023-06-10 Sensors (Basel) Article MDPI 2023-05-24 /pmc/articles/PMC10255444/ /pubmed/37299767 http://dx.doi.org/10.3390/s23115042 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title CLDTLog: System Log Anomaly Detection Method Based on Contrastive Learning and Dual Objective Tasks
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255444/
https://www.ncbi.nlm.nih.gov/pubmed/37299767
http://dx.doi.org/10.3390/s23115042