TADILOF: Time Aware Density-Based Incremental Local Outlier Detection in Data Streams

Outlier detection in data streams is crucial to successful data mining. However, this task is made increasingly difficult by the enormous growth in the quantity of data generated by the expansion of Internet of Things (IoT). Recent advances in outlier detection based on the density-based local outli...

Bibliographic Details
Main authors: Huang, Jen-Wei; Zhong, Meng-Xun; Jaysawal, Bijay Prasad
Format: Online Article Text
Language: English
Published: MDPI 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7602581/
https://www.ncbi.nlm.nih.gov/pubmed/33076325
http://dx.doi.org/10.3390/s20205829
author Huang, Jen-Wei
Zhong, Meng-Xun
Jaysawal, Bijay Prasad
collection PubMed
description Outlier detection in data streams is crucial to successful data mining. However, this task is made increasingly difficult by the enormous growth in the quantity of data generated by the expansion of Internet of Things (IoT). Recent advances in outlier detection based on the density-based local outlier factor (LOF) algorithms do not consider variations in data that change over time. For example, there may appear a new cluster of data points over time in the data stream. Therefore, we present a novel algorithm for streaming data, referred to as time-aware density-based incremental local outlier detection (TADILOF) to overcome this issue. In addition, we have developed a means for estimating the LOF score, termed "approximate LOF," based on historical information following the removal of outdated data. The results of experiments demonstrate that TADILOF outperforms current state-of-the-art methods in terms of AUC while achieving similar performance in terms of execution time. Moreover, we present an application of the proposed scheme to the development of an air-quality monitoring system.
format Online
Article
Text
id pubmed-7602581
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7602581 2020-11-01 TADILOF: Time Aware Density-Based Incremental Local Outlier Detection in Data Streams. Huang, Jen-Wei; Zhong, Meng-Xun; Jaysawal, Bijay Prasad. Sensors (Basel). Article. MDPI 2020-10-15 /pmc/articles/PMC7602581/ /pubmed/33076325 http://dx.doi.org/10.3390/s20205829 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title TADILOF: Time Aware Density-Based Incremental Local Outlier Detection in Data Streams
topic Article