Combining statistical methods for detecting potential outliers in groundwater quality time series
Quality control of large-scale monitoring networks requires automatic procedures that detect potential outliers in an unambiguous and reproducible manner. This paper describes a methodology that combines existing statistical methods to accommodate the specific characteristics of measure...
Main authors: | Berendrecht, Wilbert; van Vliet, Mariëlle; Griffioen, Jasper |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9640456/ https://www.ncbi.nlm.nih.gov/pubmed/36344854 http://dx.doi.org/10.1007/s10661-022-10661-0 |
_version_ | 1784825855165333504 |
---|---|
author | Berendrecht, Wilbert; van Vliet, Mariëlle; Griffioen, Jasper |
author_sort | Berendrecht, Wilbert |
collection | PubMed |
description | Quality control of large-scale monitoring networks requires automatic procedures that detect potential outliers in an unambiguous and reproducible manner. This paper describes a methodology that combines existing statistical methods to accommodate the specific characteristics of measurement data obtained from groundwater quality monitoring networks: the measurement series show a large variety of dynamics and often comprise few (< 25) measurements, the measurement data are not normally distributed, measurement series may contain several outliers, there may be trends in the series, and/or some measurements may be below detection limits. Furthermore, the detection limits may vary in time. The methodology uses robust regression on order statistics (ROS) to deal with measured values below the detection limit. In addition, a biweight location estimator is applied to filter out any temporal trends from the series. The subsequent outlier detection is done in z-score space. Tuning parameters are used to attune the robustness and accuracy to the given dataset and the user requirements. The method has been applied to data from the Dutch national groundwater quality monitoring network, which consists of approximately 350 monitoring wells. It proved to work well in general, detecting outliers at the top and bottom of the regular measurement range and around the detection limit. Given the diversity exhibited by measurement series, the method cannot be expected to give fully satisfactory results. Measured values identified by the method as potential outliers will therefore always need to be further assessed on the basis of expert knowledge, consistency with other measurement data and/or additional research. |
format | Online Article Text |
id | pubmed-9640456 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-9640456 2022-11-15 Combining statistical methods for detecting potential outliers in groundwater quality time series. Berendrecht, Wilbert; van Vliet, Mariëlle; Griffioen, Jasper. Environ Monit Assess. Article. Springer International Publishing 2022-11-08 2023 /pmc/articles/PMC9640456/ /pubmed/36344854 http://dx.doi.org/10.1007/s10661-022-10661-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
title | Combining statistical methods for detecting potential outliers in groundwater quality time series |
topic | Article |
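The description in this record mentions a biweight location estimator followed by outlier detection in z-score space. The sketch below illustrates that general idea only and is not the authors' implementation. It assumes a short series without a trend and without values below the detection limit, so the ROS step and the trend filtering are left out. The function names, the tuning constant `c=6.0`, the MAD-based scale, and the threshold `3.5` are illustrative assumptions, not values taken from the paper.

```python
from statistics import median


def biweight_location(values, c=6.0):
    """One-step Tukey biweight location estimate (c is an assumed tuning constant)."""
    m = median(values)
    mad = median([abs(v - m) for v in values])
    if mad == 0:
        return m  # at least half of the values coincide with the median
    num = 0.0
    den = 0.0
    for v in values:
        u = (v - m) / (c * mad)
        if abs(u) < 1:  # values farther than c * MAD from the median get zero weight
            w = (1 - u * u) ** 2
            num += (v - m) * w
            den += w
    return m + num / den


def flag_outliers(values, threshold=3.5, c=6.0):
    """Return indices whose robust z-score exceeds the (assumed) threshold."""
    m = median(values)
    mad = median([abs(v - m) for v in values])
    scale = 1.4826 * mad  # MAD rescaled to match the standard deviation of normal data
    if scale == 0:
        return []  # no spread to compare against, so nothing is flagged
    loc = biweight_location(values, c)
    return [i for i, v in enumerate(values) if abs(v - loc) / scale > threshold]


# Hypothetical concentrations, with one suspicious value at index 5.
series = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2, 10.1]
print(flag_outliers(series))  # flags index 5
```

Because both the location and the scale are robust, the single extreme value barely shifts the centre of the series, which is why it stands out clearly in z-score space. Flagged values remain candidates only and still need expert review, as the description states.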