
On Extrapolating Past the Range of Observed Data When Making Statistical Predictions in Ecology

Ecologists are increasingly using statistical models to predict animal abundance and occurrence in unsampled locations. The reliability of such predictions depends on a number of factors, including sample size, how far prediction locations are from the observed data, and similarity of predictive cov...


Bibliographic Details
Main Authors: Conn, Paul B., Johnson, Devin S., Boveng, Peter L.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4619888/
https://www.ncbi.nlm.nih.gov/pubmed/26496358
http://dx.doi.org/10.1371/journal.pone.0141416
author Conn, Paul B.
Johnson, Devin S.
Boveng, Peter L.
collection PubMed
description Ecologists are increasingly using statistical models to predict animal abundance and occurrence in unsampled locations. The reliability of such predictions depends on a number of factors, including sample size, how far prediction locations are from the observed data, and similarity of predictive covariates in locations where data are gathered to locations where predictions are desired. In this paper, we propose extending Cook’s notion of an independent variable hull (IVH), developed originally for application with linear regression models, to generalized regression models as a way to help assess the potential reliability of predictions in unsampled areas. Predictions occurring inside the generalized independent variable hull (gIVH) can be regarded as interpolations, while predictions occurring outside the gIVH can be regarded as extrapolations worthy of additional investigation or skepticism. We conduct a simulation study to demonstrate the usefulness of this metric for limiting the scope of spatial inference when conducting model-based abundance estimation from survey counts. In this case, limiting inference to the gIVH substantially reduces bias, especially when survey designs are spatially imbalanced. We also demonstrate the utility of the gIVH in diagnosing problematic extrapolations when estimating the relative abundance of ribbon seals in the Bering Sea as a function of predictive covariates. We suggest that ecologists routinely use diagnostics such as the gIVH to help gauge the reliability of predictions from statistical models (such as generalized linear, generalized additive, and spatio-temporal regression models).
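The description above does not state a formula, so the following is only a minimal sketch of the linear-regression starting point it mentions (Cook's IVH), under the commonly stated criterion that a prediction point lies inside the hull when its leverage does not exceed the largest leverage among the observed design points. The function names, the toy covariate values, and the use of NumPy are assumptions made for illustration; the generalized version (gIVH) proposed in the paper is not reproduced here.

```python
import numpy as np


def leverage(X_obs, X_new):
    """Return x0' (X'X)^{-1} x0 for each row x0 of X_new (hypothetical helper)."""
    xtx_inv = np.linalg.inv(X_obs.T @ X_obs)
    # Row-wise quadratic form: sum over j, k of X_new[i, j] * xtx_inv[j, k] * X_new[i, k]
    return np.einsum("ij,jk,ik->i", X_new, xtx_inv, X_new)


def inside_ivh(X_obs, X_new):
    """Flag prediction rows whose leverage is at most the maximum observed leverage."""
    threshold = leverage(X_obs, X_obs).max()
    return leverage(X_obs, X_new) <= threshold


# Toy data (assumed): one covariate sampled evenly on [0, 1], plus an intercept column.
covariate_obs = np.linspace(0.0, 1.0, 50)
X_obs = np.column_stack([np.ones(50), covariate_obs])

# Two prediction points: one within the sampled covariate range, one far beyond it.
covariate_new = np.array([0.5, 3.0])
X_new = np.column_stack([np.ones(2), covariate_new])

flags = inside_ivh(X_obs, X_new)
print(flags)  # first point treated as interpolation, second as extrapolation
```

In this sketch a point flagged False would be the kind of prediction the description calls an extrapolation worthy of additional investigation or skepticism.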
format Online
Article
Text
id pubmed-4619888
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4619888 2015-10-29 On Extrapolating Past the Range of Observed Data When Making Statistical Predictions in Ecology Conn, Paul B. Johnson, Devin S. Boveng, Peter L. PLoS One Research Article
Public Library of Science 2015-10-23 /pmc/articles/PMC4619888/ /pubmed/26496358 http://dx.doi.org/10.1371/journal.pone.0141416 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
title On Extrapolating Past the Range of Observed Data When Making Statistical Predictions in Ecology
topic Research Article