Data quantity is more important than its spatial bias for predictive species distribution modelling

Biological records are often the data of choice for training predictive species distribution models (SDMs), but spatial sampling bias is pervasive in biological records data at multiple spatial scales and is thought to impair the performance of SDMs. We simulated presences and absences of virtual sp...

Bibliographic Details
Main authors: Gaul, Willson; Sadykova, Dinara; White, Hannah J.; Leon-Sanchez, Lupe; Caplat, Paul; Emmerson, Mark C.; Yearsley, Jon M.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7703440/
https://www.ncbi.nlm.nih.gov/pubmed/33312769
http://dx.doi.org/10.7717/peerj.10411
author Gaul, Willson
Sadykova, Dinara
White, Hannah J.
Leon-Sanchez, Lupe
Caplat, Paul
Emmerson, Mark C.
Yearsley, Jon M.
author_sort Gaul, Willson
collection PubMed
description Biological records are often the data of choice for training predictive species distribution models (SDMs), but spatial sampling bias is pervasive in biological records data at multiple spatial scales and is thought to impair the performance of SDMs. We simulated presences and absences of virtual species as well as the process of recording these species to evaluate the effect on species distribution model prediction performance of (1) spatial bias in training data, (2) sample size (the average number of observations per species), and (3) the choice of species distribution modelling method. Our approach is novel in quantifying and applying real-world spatial sampling biases to simulated data. Spatial bias in training data decreased species distribution model prediction performance, but sample size and the choice of modelling method were more important than spatial bias in determining the prediction performance of species distribution models.
format Online
Article
Text
id pubmed-7703440
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-77034402020-12-10 Data quantity is more important than its spatial bias for predictive species distribution modelling Gaul, Willson Sadykova, Dinara White, Hannah J. Leon-Sanchez, Lupe Caplat, Paul Emmerson, Mark C. Yearsley, Jon M. PeerJ Biogeography Biological records are often the data of choice for training predictive species distribution models (SDMs), but spatial sampling bias is pervasive in biological records data at multiple spatial scales and is thought to impair the performance of SDMs. We simulated presences and absences of virtual species as well as the process of recording these species to evaluate the effect on species distribution model prediction performance of (1) spatial bias in training data, (2) sample size (the average number of observations per species), and (3) the choice of species distribution modelling method. Our approach is novel in quantifying and applying real-world spatial sampling biases to simulated data. Spatial bias in training data decreased species distribution model prediction performance, but sample size and the choice of modelling method were more important than spatial bias in determining the prediction performance of species distribution models. PeerJ Inc. 2020-11-27 /pmc/articles/PMC7703440/ /pubmed/33312769 http://dx.doi.org/10.7717/peerj.10411 Text en ©2020 Gaul et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title Data quantity is more important than its spatial bias for predictive species distribution modelling
topic Biogeography
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7703440/
https://www.ncbi.nlm.nih.gov/pubmed/33312769
http://dx.doi.org/10.7717/peerj.10411