
Combining statistical methods for detecting potential outliers in groundwater quality time series


Bibliographic Details
Main authors: Berendrecht, Wilbert, van Vliet, Mariëlle, Griffioen, Jasper
Format: Online Article Text
Language: English
Published: Springer International Publishing 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9640456/
https://www.ncbi.nlm.nih.gov/pubmed/36344854
http://dx.doi.org/10.1007/s10661-022-10661-0
_version_ 1784825855165333504
author Berendrecht, Wilbert
van Vliet, Mariëlle
Griffioen, Jasper
author_facet Berendrecht, Wilbert
van Vliet, Mariëlle
Griffioen, Jasper
author_sort Berendrecht, Wilbert
collection PubMed
description Quality control of large-scale monitoring networks requires the use of automatic procedures to detect potential outliers in an unambiguous and reproducible manner. This paper describes a methodology that combines existing statistical methods to accommodate the specific characteristics of measurement data obtained from groundwater quality monitoring networks: the measurement series show a large variety of dynamics and often comprise few (< 25) measurements, the measurement data are not normally distributed, measurement series may contain several outliers, there may be trends in the series, and/or some measurements may be below detection limits. Furthermore, the detection limits may vary in time. The methodology for outlier detection described in this paper uses robust regression on order statistics (ROS) to deal with measured values below the detection limit. In addition, a biweight location estimator is applied to filter out any temporal trends from the series. The subsequent outlier detection is done in z-score space. Tuning parameters are used to attune the robustness and accuracy to the given dataset and the user requirements. The method has been applied to data from the Dutch national groundwater quality monitoring network, which consists of approximately 350 monitoring wells. It proved to work well in general, detecting outliers at the top and bottom of the regular measurement range and around the detection limit. Given the diversity exhibited by measurement series, it is to be expected that the method does not give 100% satisfactory results. Measured values identified by the method as potential outliers will therefore always need to be further assessed on the basis of expert knowledge, consistency with other measurement data and/or additional research.
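The robust steps named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: it uses one global biweight location instead of a time-varying trend filter, it skips the ROS handling of values below the detection limit, and it takes "z-score" to mean a residual scaled by the median absolute deviation (MAD). The function names, the tuning constant c = 6.0 and the threshold 3.5 are all assumptions chosen for illustration.

```python
import numpy as np


def biweight_location(x, c=6.0, tol=1e-8, max_iter=50):
    """Iterative Tukey biweight estimate of the centre of x.

    c is a hypothetical tuning constant: points further than c * MAD
    from the current centre receive zero weight.
    """
    x = np.asarray(x, dtype=float)
    loc = np.median(x)
    mad = np.median(np.abs(x - loc))
    if mad == 0.0:
        # More than half the values are identical; the median is the centre.
        return loc
    for _ in range(max_iter):
        u = (x - loc) / (c * mad)
        # Biweight weights: (1 - u^2)^2 inside the window, 0 outside.
        w = np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 2, 0.0)
        if w.sum() == 0.0:
            break
        new_loc = np.sum(w * x) / np.sum(w)
        if abs(new_loc - loc) < tol:
            return new_loc
        loc = new_loc
    return loc


def robust_zscores(x, c=6.0):
    """Residuals from the biweight location, scaled by 1.4826 * MAD."""
    x = np.asarray(x, dtype=float)
    loc = biweight_location(x, c=c)
    mad = np.median(np.abs(x - np.median(x)))
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
    if scale == 0.0:
        return np.zeros_like(x)
    return (x - loc) / scale


def flag_outliers(x, threshold=3.5, c=6.0):
    """Boolean mask of potential outliers (threshold is an assumption)."""
    return np.abs(robust_zscores(x, c=c)) > threshold


# Made-up concentrations with one suspicious value at index 5.
series = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2, 10.1]
flags = flag_outliers(series)
```

Because both the centre and the scale are robust, the single extreme value barely influences the statistics used to judge it. As the description stresses, flagged values are candidates for expert review, not confirmed errors.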
format Online
Article
Text
id pubmed-9640456
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-9640456 2022-11-15 Combining statistical methods for detecting potential outliers in groundwater quality time series Berendrecht, Wilbert van Vliet, Mariëlle Griffioen, Jasper Environ Monit Assess Article Quality control of large-scale monitoring networks requires the use of automatic procedures to detect potential outliers in an unambiguous and reproducible manner. This paper describes a methodology that combines existing statistical methods to accommodate the specific characteristics of measurement data obtained from groundwater quality monitoring networks: the measurement series show a large variety of dynamics and often comprise few (< 25) measurements, the measurement data are not normally distributed, measurement series may contain several outliers, there may be trends in the series, and/or some measurements may be below detection limits. Furthermore, the detection limits may vary in time. The methodology for outlier detection described in this paper uses robust regression on order statistics (ROS) to deal with measured values below the detection limit. In addition, a biweight location estimator is applied to filter out any temporal trends from the series. The subsequent outlier detection is done in z-score space. Tuning parameters are used to attune the robustness and accuracy to the given dataset and the user requirements. The method has been applied to data from the Dutch national groundwater quality monitoring network, which consists of approximately 350 monitoring wells. It proved to work well in general, detecting outliers at the top and bottom of the regular measurement range and around the detection limit. Given the diversity exhibited by measurement series, it is to be expected that the method does not give 100% satisfactory results. Measured values identified by the method as potential outliers will therefore always need to be further assessed on the basis of expert knowledge, consistency with other measurement data and/or additional research. Springer International Publishing 2022-11-08 2023 /pmc/articles/PMC9640456/ /pubmed/36344854 http://dx.doi.org/10.1007/s10661-022-10661-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Berendrecht, Wilbert
van Vliet, Mariëlle
Griffioen, Jasper
Combining statistical methods for detecting potential outliers in groundwater quality time series
title Combining statistical methods for detecting potential outliers in groundwater quality time series
title_full Combining statistical methods for detecting potential outliers in groundwater quality time series
title_fullStr Combining statistical methods for detecting potential outliers in groundwater quality time series
title_full_unstemmed Combining statistical methods for detecting potential outliers in groundwater quality time series
title_short Combining statistical methods for detecting potential outliers in groundwater quality time series
title_sort combining statistical methods for detecting potential outliers in groundwater quality time series
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9640456/
https://www.ncbi.nlm.nih.gov/pubmed/36344854
http://dx.doi.org/10.1007/s10661-022-10661-0
work_keys_str_mv AT berendrechtwilbert combiningstatisticalmethodsfordetectingpotentialoutliersingroundwaterqualitytimeseries
AT vanvlietmarielle combiningstatisticalmethodsfordetectingpotentialoutliersingroundwaterqualitytimeseries
AT griffioenjasper combiningstatisticalmethodsfordetectingpotentialoutliersingroundwaterqualitytimeseries