Outlier Detection with Reinforcement Learning for Costly to Verify Data

Bibliographic Details
Main authors: Nijhuis, Michiel; van Lelyveld, Iman
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10296860/
https://www.ncbi.nlm.nih.gov/pubmed/37372186
http://dx.doi.org/10.3390/e25060842
Description
Summary: Outliers are common in data, and many algorithms exist to find them. Flagged outliers can often be verified to determine whether they are data errors. However, checking such points is time-consuming, and the underlying issues that cause data errors can change over time. An outlier detection approach should therefore make optimal use of the knowledge gained from verifying the ground truth and adjust accordingly. With advances in machine learning, this can be achieved by applying reinforcement learning to a statistical outlier detection approach. The approach combines an ensemble of proven outlier detection methods with a reinforcement learner that tunes the coefficients of the ensemble with every additional piece of data. The performance and applicability of the reinforcement learning outlier detection approach are illustrated using granular data reported by Dutch insurers and pension funds under the Solvency II and FTK frameworks. The application shows that the ensemble learner can identify outliers. Moreover, applying the reinforcement learner on top of the ensemble model further improves the results by optimising the coefficients of the ensemble learner.
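
The idea of an ensemble whose coefficients are re-tuned after each verified point can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the summary does not name the detectors, the reward, or the learning rule, so the sketch uses two simple detectors (z-score and median absolute deviation) and a plain multiplicative-weights update as a stand-in for the reinforcement learner. All function names and the learning rate `lr` are hypothetical.

```python
import math
import statistics


def zscore_detector(values):
    """Score each point by its absolute z-score (assumed detector)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values) or 1.0  # avoid division by zero
    return [abs(v - mean) / sd for v in values]


def mad_detector(values):
    """Score each point by its distance from the median in MAD units (assumed detector)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values]) or 1.0
    return [abs(v - med) / mad for v in values]


def normalise(scores):
    """Rescale non-negative scores to [0, 1] so detectors are comparable."""
    top = max(scores) or 1.0
    return [s / top for s in scores]


def ensemble_scores(score_table, weights):
    """Weighted sum of the per-detector scores for every data point."""
    return [sum(w * s for w, s in zip(weights, row)) for row in zip(*score_table)]


def update_weights(weights, detector_scores, is_error, lr=0.5):
    """Multiplicative-weights update after one verified point.

    Assumed reward: a detector is rewarded with its score when the point
    turned out to be a data error, and with 1 - score when it did not.
    """
    rewards = [s if is_error else 1.0 - s for s in detector_scores]
    raw = [w * math.exp(lr * r) for w, r in zip(weights, rewards)]
    total = sum(raw)
    return [r / total for r in raw]


# Toy data: one obviously deviating value at the end.
values = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 25.0]
table = [normalise(zscore_detector(values)), normalise(mad_detector(values))]
weights = [0.5, 0.5]

# Pick the highest-scoring point for (costly) manual verification.
combined = ensemble_scores(table, weights)
candidate = max(range(len(values)), key=combined.__getitem__)

# Hypothetical verification outcome: the point is confirmed as a data error.
weights = update_weights(weights, [scores[candidate] for scores in table], is_error=True)
```

In this sketch only the single highest-scoring point is sent for verification in each round, which reflects the premise that checking is expensive; the verified label then shifts weight towards the detectors that scored that point in line with the outcome.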