
Do pseudo-absence selection strategies influence species distribution models and their predictions? An information-theoretic approach based on simulated data

BACKGROUND: Multiple logistic regression is precluded from many practical applications in ecology that aim to predict the geographic distributions of species because it requires absence data, which are rarely available or are unreliable. In order to use multiple logistic regression, many studies hav...


Bibliographic Details
Main authors: Wisz, Mary S, Guisan, Antoine
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2680809/
https://www.ncbi.nlm.nih.gov/pubmed/19393082
http://dx.doi.org/10.1186/1472-6785-9-8
_version_ 1782166971627864064
author Wisz, Mary S
Guisan, Antoine
author_facet Wisz, Mary S
Guisan, Antoine
author_sort Wisz, Mary S
collection PubMed
description BACKGROUND: Multiple logistic regression is precluded from many practical applications in ecology that aim to predict the geographic distributions of species because it requires absence data, which are rarely available or are unreliable. In order to use multiple logistic regression, many studies have simulated "pseudo-absences" through a number of strategies, but it is unknown how the choice of strategy influences models and their geographic predictions of species. In this paper we evaluate the effect of several prevailing pseudo-absence strategies on the predictions of the geographic distribution of a virtual species whose "true" distribution and relationship to three environmental predictors were predefined. We evaluated the effect of using a) real absences, b) pseudo-absences selected randomly from the background, and c) two-step approaches: pseudo-absences selected from low suitability areas predicted by either Ecological Niche Factor Analysis (ENFA) or BIOCLIM. We compared how the choice of pseudo-absence strategy affected model fit, predictive power, and information-theoretic model selection results. RESULTS: Models built with true absences had the best predictive power and the best discriminatory power, and the "true" model (the one that contained the correct predictors) was supported by the data according to AIC, as expected. Models based on random pseudo-absences had among the lowest fit, but yielded the second highest AUC value (0.97), and the "true" model was also supported by the data. Models based on two-step approaches had intermediate fit and the lowest predictive power, and the "true" model was not supported by the data. CONCLUSION: If ecologists wish to build parsimonious GLM models that will allow them to make robust predictions, a reasonable approach is to use a large number of randomly selected pseudo-absences, and perform model selection based on an information-theoretic approach. However, the resulting models can be expected to have limited fit.
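The workflow summarized in the abstract (random pseudo-absence selection, AUC as a measure of discriminatory power, and AIC-based model comparison) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the cell-index representation of the study area, and the seed parameter are all hypothetical and chosen only for illustration. Fitting the GLM itself is left out; the sketch assumes the log-likelihoods and predicted suitability scores come from a model fitted elsewhere.

```python
import math
import random


def sample_random_pseudo_absences(background_cells, presence_cells, n, seed=0):
    """Draw n pseudo-absences at random from background cells that hold
    no presence record (hypothetical helper, cells are plain identifiers)."""
    presence = set(presence_cells)
    candidates = [c for c in background_cells if c not in presence]
    rng = random.Random(seed)
    return rng.sample(candidates, n)


def auc(scores_presence, scores_absence):
    """AUC as the probability that a presence site receives a higher
    predicted score than an absence site; ties count as one half."""
    wins = 0.0
    for p in scores_presence:
        for a in scores_absence:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_presence) * len(scores_absence))


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each candidate model, normalized to sum to 1."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]
```

In such a setup, each candidate set of predictors would be fitted to the same presences and pseudo-absences, and the model with the lowest AIC (highest Akaike weight) would be the one "supported by the data" in the sense used above.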
format Text
id pubmed-2680809
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2680809 2009-05-13 Do pseudo-absence selection strategies influence species distribution models and their predictions? An information-theoretic approach based on simulated data Wisz, Mary S Guisan, Antoine BMC Ecol Research Article BACKGROUND: Multiple logistic regression is precluded from many practical applications in ecology that aim to predict the geographic distributions of species because it requires absence data, which are rarely available or are unreliable. In order to use multiple logistic regression, many studies have simulated "pseudo-absences" through a number of strategies, but it is unknown how the choice of strategy influences models and their geographic predictions of species. In this paper we evaluate the effect of several prevailing pseudo-absence strategies on the predictions of the geographic distribution of a virtual species whose "true" distribution and relationship to three environmental predictors were predefined. We evaluated the effect of using a) real absences, b) pseudo-absences selected randomly from the background, and c) two-step approaches: pseudo-absences selected from low suitability areas predicted by either Ecological Niche Factor Analysis (ENFA) or BIOCLIM. We compared how the choice of pseudo-absence strategy affected model fit, predictive power, and information-theoretic model selection results. RESULTS: Models built with true absences had the best predictive power and the best discriminatory power, and the "true" model (the one that contained the correct predictors) was supported by the data according to AIC, as expected. Models based on random pseudo-absences had among the lowest fit, but yielded the second highest AUC value (0.97), and the "true" model was also supported by the data. Models based on two-step approaches had intermediate fit and the lowest predictive power, and the "true" model was not supported by the data. CONCLUSION: If ecologists wish to build parsimonious GLM models that will allow them to make robust predictions, a reasonable approach is to use a large number of randomly selected pseudo-absences, and perform model selection based on an information-theoretic approach. However, the resulting models can be expected to have limited fit. BioMed Central 2009-04-24 /pmc/articles/PMC2680809/ /pubmed/19393082 http://dx.doi.org/10.1186/1472-6785-9-8 Text en Copyright © 2009 Wisz and Guisan; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Wisz, Mary S
Guisan, Antoine
Do pseudo-absence selection strategies influence species distribution models and their predictions? An information-theoretic approach based on simulated data
title Do pseudo-absence selection strategies influence species distribution models and their predictions? An information-theoretic approach based on simulated data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2680809/
https://www.ncbi.nlm.nih.gov/pubmed/19393082
http://dx.doi.org/10.1186/1472-6785-9-8