How real are observed trends in small correlated datasets?

The eye may perceive a significant trend in plotted time-series data, but if the model errors of nearby data points are correlated, the trend may be an illusion. We examine generalized least-squares (GLS) estimation, finding that error correlation may be underestimated in highly correlated small dat...

Bibliographic Details
Main authors: Salamon, S. J., Hansen, H. J., Abbott, D.
Format: Online Article Text
Language: English
Published: The Royal Society 2019
Subjects: Earth Science
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6458379/
https://www.ncbi.nlm.nih.gov/pubmed/31031993
http://dx.doi.org/10.1098/rsos.181089
author Salamon, S. J.
Hansen, H. J.
Abbott, D.
collection PubMed
description The eye may perceive a significant trend in plotted time-series data, but if the model errors of nearby data points are correlated, the trend may be an illusion. We examine generalized least-squares (GLS) estimation, finding that error correlation may be underestimated in highly correlated small datasets by conventional techniques. This risks indicating a significant trend when there is none. A new correlation estimate based on the Durbin–Watson statistic is developed, leading to an improved estimate of autoregression with highly correlated data, thus reducing this risk. These techniques are generalized to randomly located data points in space, through the new concept of the nearest new neighbour path. We describe tests on the validity of the GLS schemes, allowing verification of the models employed. Examples illustrating our method include a 40-year record of atmospheric carbon dioxide, and Antarctic ice core data. While more conservative than existing techniques, our new GLS estimate finds a statistically significant increase in background carbon dioxide concentration, with an accelerating trend. We conclude with an example of a worldwide empirical climate model for radio propagation studies, to illustrate dealing with spatial correlation in unevenly distributed data points over the surface of the Earth. The method is generally applicable, not only to climate-related data, but to many other kinds of problems (e.g. biological, medical and geological data), where there are unequally (or randomly) spaced observations in temporally or spatially distributed datasets.
format Online
Article
Text
id pubmed-6458379
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher The Royal Society
record_format MEDLINE/PubMed
spelling pubmed-6458379 2019-04-26 How real are observed trends in small correlated datasets? Salamon, S. J. Hansen, H. J. Abbott, D. R Soc Open Sci Earth Science The Royal Society 2019-03-20 /pmc/articles/PMC6458379/ /pubmed/31031993 http://dx.doi.org/10.1098/rsos.181089 Text en © 2019 The Authors.
Published by the Royal Society under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, provided the original author and source are credited.
title How real are observed trends in small correlated datasets?
topic Earth Science