
The Effects of Sampling Bias and Model Complexity on the Predictive Performance of MaxEnt Species Distribution Models

Species distribution models (SDMs) trained on presence-only data are frequently used in ecological research and conservation planning. However, users of SDM software are faced with a variety of options, and it is not always obvious how selecting one option over another will affect model performance....


Bibliographic Details
Main Authors: Syfert, Mindy M., Smith, Matthew J., Coomes, David A.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3573023/
https://www.ncbi.nlm.nih.gov/pubmed/23457462
http://dx.doi.org/10.1371/journal.pone.0055158
author Syfert, Mindy M.
Smith, Matthew J.
Coomes, David A.
collection PubMed
description Species distribution models (SDMs) trained on presence-only data are frequently used in ecological research and conservation planning. However, users of SDM software are faced with a variety of options, and it is not always obvious how selecting one option over another will affect model performance. Working with MaxEnt software and with tree fern presence data from New Zealand, we assessed whether (a) choosing to correct for geographical sampling bias and (b) using complex environmental response curves have strong effects on goodness of fit. SDMs were trained on tree fern data, obtained from an online biodiversity data portal, with two sources that differed in size and geographical sampling bias: a small, widely-distributed set of herbarium specimens and a large, spatially clustered set of ecological survey records. We attempted to correct for geographical sampling bias by incorporating sampling bias grids in the SDMs, created from all georeferenced vascular plants in the datasets, and explored model complexity issues by fitting a wide variety of environmental response curves (known as “feature types” in MaxEnt). In each case, goodness of fit was assessed by comparing predicted range maps with tree fern presences and absences using an independent national dataset to validate the SDMs. We found that correcting for geographical sampling bias led to major improvements in goodness of fit, but did not entirely resolve the problem: predictions made with clustered ecological data were inferior to those made with the herbarium dataset, even after sampling bias correction. We also found that the choice of feature type had negligible effects on predictive performance, indicating that simple feature types may be sufficient once sampling bias is accounted for. Our study emphasizes the importance of reducing geographical sampling bias, where possible, in datasets used to train SDMs, and the effectiveness and essentialness of sampling bias correction within MaxEnt.
format Online
Article
Text
id pubmed-3573023
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3573023 2013-03-01 The Effects of Sampling Bias and Model Complexity on the Predictive Performance of MaxEnt Species Distribution Models Syfert, Mindy M. Smith, Matthew J. Coomes, David A. PLoS One Research Article Public Library of Science 2013-02-14 /pmc/articles/PMC3573023/ /pubmed/23457462 http://dx.doi.org/10.1371/journal.pone.0055158 Text en © 2013 Syfert et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title The Effects of Sampling Bias and Model Complexity on the Predictive Performance of MaxEnt Species Distribution Models
topic Research Article