Statistical tests for non-independent partitions of large autocorrelated datasets

Bibliographic Details
Main Authors: Ives, Anthony R., Zhu, Likai, Wang, Fangfang, Zhu, Jun, Morrow, Clay J., Radeloff, Volker C.
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8957054/
https://www.ncbi.nlm.nih.gov/pubmed/35345788
http://dx.doi.org/10.1016/j.mex.2022.101660
_version_ 1784676689410785280
author Ives, Anthony R.
Zhu, Likai
Wang, Fangfang
Zhu, Jun
Morrow, Clay J.
Radeloff, Volker C.
collection PubMed
description Large sets of autocorrelated data are common in fields such as remote sensing and genomics. For example, remote sensing can produce maps of information for millions of pixels, and the information from nearby pixels will likely be spatially autocorrelated. Although there are well-established statistical methods for testing hypotheses using autocorrelated data, these methods become computationally impractical for large datasets.
• The method developed here makes it feasible to perform F-tests, likelihood ratio tests, and t-tests for large autocorrelated datasets. The method involves subsetting the dataset into partitions, analyzing each partition separately, and then combining the separate tests to give an overall test.
• The separate statistical tests on partitions are non-independent, because the points in different partitions are not independent. Therefore, combining separate analyses of partitions requires accounting for the non-independence of the test statistics among partitions.
• The methods can be applied to a wide range of data, including not only purely spatial data but also spatiotemporal data. For spatiotemporal data, it is possible to estimate coefficients from time-series models at different spatial locations and then analyze the spatial distribution of the estimates. The spatial analysis can be simplified by estimating spatial autocorrelation directly from the spatial autocorrelation among time series.
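The partition-and-combine idea in the description can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes a known exponential spatial covariance with unit variance, fits generalized least squares (GLS) on each partition, averages the coefficient estimates, and computes the variance of that average both with and without the cross-partition covariance terms. The names `exp_cov`, `gls_weights`, `combine_partitions`, and the parameter `range_` are hypothetical, and the data are simulated.

```python
import numpy as np

def exp_cov(a, b, range_):
    """Exponential covariance exp(-d / range_) between two coordinate sets (assumed model)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-d / range_)

def gls_weights(X, V):
    """Return A such that the GLS estimate is beta_hat = A @ y, for covariance V."""
    Vinv_X = np.linalg.solve(V, X)
    return np.linalg.solve(X.T @ Vinv_X, Vinv_X.T)

def combine_partitions(X, y, coords, parts, range_):
    """Average per-partition GLS estimates; return the mean and two variance matrices."""
    K = len(parts)
    A = [gls_weights(X[idx], exp_cov(coords[idx], coords[idx], range_)) for idx in parts]
    betas = np.array([A_i @ y[idx] for A_i, idx in zip(A, parts)])
    beta_bar = betas.mean(axis=0)
    p = X.shape[1]
    var_full = np.zeros((p, p))   # includes covariance between partitions
    var_naive = np.zeros((p, p))  # treats partitions as independent
    for i in range(K):
        for j in range(K):
            V_ij = exp_cov(coords[parts[i]], coords[parts[j]], range_)
            C_ij = A[i] @ V_ij @ A[j].T  # Cov(beta_i, beta_j) under the assumed model
            var_full += C_ij
            if i == j:
                var_naive += C_ij
    return beta_bar, var_full / K**2, var_naive / K**2

# Simulated example: 300 points, 3 random partitions, true slope of zero.
rng = np.random.default_rng(0)
n, K, range_ = 300, 3, 0.2
coords = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
V = exp_cov(coords, coords, range_)
y = X @ np.array([1.0, 0.0]) + np.linalg.cholesky(V) @ rng.normal(size=n)
parts = np.array_split(rng.permutation(n), K)

beta_bar, var_full, var_naive = combine_partitions(X, y, coords, parts, range_)
z_slope = beta_bar[1] / np.sqrt(var_full[1, 1])  # z-statistic, since the covariance is taken as known
print(beta_bar, z_slope)
```

Because the covariance is treated as known here, the combined statistic is a z-statistic. With estimated covariance parameters the reference distribution would differ, which this sketch does not attempt to address.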
format Online
Article
Text
id pubmed-8957054
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8957054 2022-03-27
Statistical tests for non-independent partitions of large autocorrelated datasets
Ives, Anthony R.; Zhu, Likai; Wang, Fangfang; Zhu, Jun; Morrow, Clay J.; Radeloff, Volker C.
MethodsX
Method Article
Elsevier 2022-03-12
/pmc/articles/PMC8957054/ /pubmed/35345788 http://dx.doi.org/10.1016/j.mex.2022.101660
Text en © 2022 Published by Elsevier B.V.
https://creativecommons.org/licenses/by/4.0/
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Statistical tests for non-independent partitions of large autocorrelated datasets
topic Method Article