Outlier Detection with Reinforcement Learning for Costly to Verify Data

Outliers are often present in data and many algorithms exist to find these outliers. Often we can verify these outliers to determine whether they are data errors or not. Unfortunately, checking such points is time-consuming and the underlying issues leading to the data error can change over time. An...


Bibliographic Details
Main authors: Nijhuis, Michiel; van Lelyveld, Iman
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10296860/
https://www.ncbi.nlm.nih.gov/pubmed/37372186
http://dx.doi.org/10.3390/e25060842
_version_ 1785063747493036032
author Nijhuis, Michiel
van Lelyveld, Iman
collection PubMed
description Outliers are often present in data and many algorithms exist to find these outliers. Often we can verify these outliers to determine whether they are data errors or not. Unfortunately, checking such points is time-consuming and the underlying issues leading to the data error can change over time. An outlier detection approach should therefore be able to optimally use the knowledge gained from the verification of the ground truth and adjust accordingly. With advances in machine learning, this can be achieved by applying reinforcement learning on a statistical outlier detection approach. The approach uses an ensemble of proven outlier detection methods in combination with a reinforcement learning approach to tune the coefficients of the ensemble with every additional bit of data. The performance and the applicability of the reinforcement learning outlier detection approach are illustrated using granular data reported by Dutch insurers and pension funds under the Solvency II and FTK frameworks. The application shows that outliers can be identified by the ensemble learner. Moreover, applying the reinforcement learner on top of the ensemble model can further improve the results by optimising the coefficients of the ensemble learner.
format Online
Article
Text
id pubmed-10296860
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10296860 2023-06-28 Outlier Detection with Reinforcement Learning for Costly to Verify Data Nijhuis, Michiel van Lelyveld, Iman Entropy (Basel) Article MDPI 2023-05-25 /pmc/articles/PMC10296860/ /pubmed/37372186 http://dx.doi.org/10.3390/e25060842 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Outlier Detection with Reinforcement Learning for Costly to Verify Data
topic Article