Data Discovery and Anomaly Detection Using Atypicality for Real-Valued Data

The aim of using atypicality is to extract small, rare, unusual and interesting pieces out of big data. This complements statistics about typical data to give insight into data. In order to find such “interesting” parts of data, universal approaches are required, since it is not known in advance wha...

Bibliographic Details
Main Authors: Sabeti, Elyas; Høst-Madsen, Anders
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7514700/
https://www.ncbi.nlm.nih.gov/pubmed/33266935
http://dx.doi.org/10.3390/e21030219
author Sabeti, Elyas
Høst-Madsen, Anders
collection PubMed
description The aim of using atypicality is to extract small, rare, unusual and interesting pieces out of big data. This complements statistics about typical data to give insight into data. In order to find such “interesting” parts of data, universal approaches are required, since it is not known in advance what we are looking for. We therefore base the atypicality criterion on codelength. In a prior paper, we developed the methodology for discrete-valued data, and the current paper extends this to real-valued data. This is done by using minimum description length (MDL). We develop the information-theoretic methodology for a number of “universal” signal processing models, and finally apply them to recorded hydrophone data and heart rate variability (HRV) signals.
format Online
Article
Text
id pubmed-7514700
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7514700
2020-11-09
Data Discovery and Anomaly Detection Using Atypicality for Real-Valued Data
Sabeti, Elyas
Høst-Madsen, Anders
Entropy (Basel)
Article
MDPI
2019-02-26
/pmc/articles/PMC7514700/
/pubmed/33266935
http://dx.doi.org/10.3390/e21030219
Text
en
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Data Discovery and Anomaly Detection Using Atypicality for Real-Valued Data
topic Article