Why sampling ratio matters: Logistic regression and studies of habitat use
Logistic regression (LR) models are among the most frequently used statistical tools in ecology. With LR one can infer if a species’ habitat use is related to environmental factors and estimate the probability of species occurrence based on the values of these factors. However, studies often use ina...
Main authors: | Nad’o, Ladislav; Kaňuch, Peter |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6056037/ https://www.ncbi.nlm.nih.gov/pubmed/30036369 http://dx.doi.org/10.1371/journal.pone.0200742 |
_version_ | 1783341282403287040 |
---|---|
author | Nad’o, Ladislav Kaňuch, Peter |
author_facet | Nad’o, Ladislav Kaňuch, Peter |
author_sort | Nad’o, Ladislav |
collection | PubMed |
description | Logistic regression (LR) models are among the most frequently used statistical tools in ecology. With LR one can infer if a species’ habitat use is related to environmental factors and estimate the probability of species occurrence based on the values of these factors. However, studies often use inadequate sampling with regards to the arbitrarily chosen ratio between occupied and unoccupied (or available) locations, and this has a profound effect on the inference and predictive power of LR models. To demonstrate the effect of various sampling strategies/efforts on the quality of LR models, we used a unique census dataset containing all the used roosting cavities of the tree-dwelling bat Nyctalus leisleri and all cavities where the species was absent. We compared models constructed from randomly selected data subsets with varying ratios of occupied and unoccupied cavities (1:1, 1:5, 1:10) with a full dataset model (ratio 1:31). These comparisons revealed that the power of LR models was low when the sampling did not reflect the population ratio of occupied and unoccupied cavities. The use of weights improved the subsampled models. Thus, this study warns against inadequate data sampling and highly encourages a randomized sampling procedure to estimate the true ratio of occupied:unoccupied locations, which can then be used to optimize a manageable sampling effort and apply weights to improve the LR model. Such an approach may provide robust and reliable models suitable for both inference and prediction. |
format | Online Article Text |
id | pubmed-6056037 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-60560372018-08-06 Why sampling ratio matters: Logistic regression and studies of habitat use Nad’o, Ladislav Kaňuch, Peter PLoS One Research Article Logistic regression (LR) models are among the most frequently used statistical tools in ecology. With LR one can infer if a species’ habitat use is related to environmental factors and estimate the probability of species occurrence based on the values of these factors. However, studies often use inadequate sampling with regards to the arbitrarily chosen ratio between occupied and unoccupied (or available) locations, and this has a profound effect on the inference and predictive power of LR models. To demonstrate the effect of various sampling strategies/efforts on the quality of LR models, we used a unique census dataset containing all the used roosting cavities of the tree-dwelling bat Nyctalus leisleri and all cavities where the species was absent. We compared models constructed from randomly selected data subsets with varying ratios of occupied and unoccupied cavities (1:1, 1:5, 1:10) with a full dataset model (ratio 1:31). These comparisons revealed that the power of LR models was low when the sampling did not reflect the population ratio of occupied and unoccupied cavities. The use of weights improved the subsampled models. Thus, this study warns against inadequate data sampling and highly encourages a randomized sampling procedure to estimate the true ratio of occupied:unoccupied locations, which can then be used to optimize a manageable sampling effort and apply weights to improve the LR model. Such an approach may provide robust and reliable models suitable for both inference and prediction. 
Public Library of Science 2018-07-23 /pmc/articles/PMC6056037/ /pubmed/30036369 http://dx.doi.org/10.1371/journal.pone.0200742 Text en © 2018 Nad’o, Kaňuch http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Nad’o, Ladislav Kaňuch, Peter Why sampling ratio matters: Logistic regression and studies of habitat use |
title | Why sampling ratio matters: Logistic regression and studies of habitat use |
title_full | Why sampling ratio matters: Logistic regression and studies of habitat use |
title_fullStr | Why sampling ratio matters: Logistic regression and studies of habitat use |
title_full_unstemmed | Why sampling ratio matters: Logistic regression and studies of habitat use |
title_short | Why sampling ratio matters: Logistic regression and studies of habitat use |
title_sort | why sampling ratio matters: logistic regression and studies of habitat use |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6056037/ https://www.ncbi.nlm.nih.gov/pubmed/30036369 http://dx.doi.org/10.1371/journal.pone.0200742 |
work_keys_str_mv | AT nadoladislav whysamplingratiomatterslogisticregressionandstudiesofhabitatuse AT kanuchpeter whysamplingratiomatterslogisticregressionandstudiesofhabitatuse |
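
The description field in this record says that models built from 1:1, 1:5 and 1:10 subsamples performed poorly compared with the full 1:31 census, and that weights improved them. The record does not say how the authors computed those weights, so the following is only a minimal sketch under a stated assumption: each unoccupied cavity is weighted by the population ratio divided by the sampled ratio, so that the weighted sample again reflects 1:31. The helper names `case_weights` and `intercept_only_fit` and the count of 20 occupied cavities are hypothetical and chosen for illustration. To stay dependency-free, the sketch uses an intercept-only logistic model, whose weighted maximum-likelihood estimate has a closed form (the logit of the weighted proportion of occupied cavities).

```python
import math

# Unoccupied cavities per occupied cavity in the full census (1:31 in the abstract).
POPULATION_UNOCCUPIED_PER_OCCUPIED = 31.0


def case_weights(sample_unoccupied_per_occupied,
                 population_unoccupied_per_occupied=POPULATION_UNOCCUPIED_PER_OCCUPIED):
    """Return (occupied_weight, unoccupied_weight) that restore the population ratio.

    Assumption, not taken from the paper: occupied cavities keep weight 1 and
    each unoccupied cavity is scaled by population ratio / sampled ratio.
    """
    if sample_unoccupied_per_occupied <= 0:
        raise ValueError("sample ratio must be positive")
    return 1.0, population_unoccupied_per_occupied / sample_unoccupied_per_occupied


def intercept_only_fit(n_occupied, n_unoccupied, w_occupied=1.0, w_unoccupied=1.0):
    """Closed-form weighted MLE of an intercept-only logistic model.

    Returns (intercept, predicted_probability_of_occupancy).
    """
    weighted_occupied = n_occupied * w_occupied
    weighted_total = weighted_occupied + n_unoccupied * w_unoccupied
    p = weighted_occupied / weighted_total
    return math.log(p / (1.0 - p)), p


if __name__ == "__main__":
    n_occupied = 20  # illustrative count, not from the paper
    for ratio in (1, 5, 10):
        w_occ, w_unocc = case_weights(ratio)
        n_unoccupied = n_occupied * ratio
        _, p_raw = intercept_only_fit(n_occupied, n_unoccupied)
        _, p_weighted = intercept_only_fit(n_occupied, n_unoccupied, w_occ, w_unocc)
        print(f"1:{ratio}  unweighted p={p_raw:.3f}  weighted p={p_weighted:.3f}")
```

Without weights, the baseline occupancy probability simply mirrors the arbitrary sampling ratio (0.5 for a 1:1 subsample), whereas with these assumed weights every subsample recovers the census prevalence of 1/32. In a model with environmental covariates, the same per-observation weights would be passed to the fitting routine rather than applied in closed form.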