Quantifying the degree of bias from using county‐scale data in species distribution modeling: Can increasing sample size or using county‐averaged environmental data reduce distributional overprediction?
Citizen‐science databases have been used to develop species distribution models (SDMs), although many taxa may be only georeferenced to county. It is tacitly assumed that SDMs built from county‐scale data should be less precise than those built with more accurate localities, but the extent of the bi...
Main authors: | Collins, Steven D.; Abbott, John C.; McIntyre, Nancy E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5551104/ https://www.ncbi.nlm.nih.gov/pubmed/28808561 http://dx.doi.org/10.1002/ece3.3115 |
_version_ | 1783256242625445888 |
---|---|
author | Collins, Steven D. Abbott, John C. McIntyre, Nancy E. |
author_facet | Collins, Steven D. Abbott, John C. McIntyre, Nancy E. |
author_sort | Collins, Steven D. |
collection | PubMed |
description | Citizen‐science databases have been used to develop species distribution models (SDMs), although many taxa may be only georeferenced to county. It is tacitly assumed that SDMs built from county‐scale data should be less precise than those built with more accurate localities, but the extent of the bias is currently unknown. Our aims in this study were to illustrate the effects of using county‐scale data on the spatial extent and accuracy of SDMs relative to true locality data and to compare potential compensatory methods (including increased sample size and using overall county environmental averages rather than point locality environmental data). To do so, we developed SDMs in maxent with PRISM‐derived BIOCLIM parameters for 283 and 230 species of odonates (dragonflies and damselflies) and butterflies, respectively, for five subsets from the OdonataCentral and Butterflies and Moths of North America citizen‐science databases: (1) a true locality dataset, (2) a corresponding sister dataset of county‐centroid coordinates, (3) a dataset where the average environmental conditions within each county were assigned to each record, (4) a 50/50% mix of true localities and county‐centroid coordinates, and (5) a 50/50% mix of true localities and records assigned the average environmental conditions within each county. These mixtures allowed us to quantify the degree of bias from county‐scale data. Models developed with county centroids overpredicted the extent of suitable habitat by 15% on average compared to true locality models, although larger sample sizes (>100 locality records) reduced this disparity. Assigning county‐averaged environmental conditions did not offer consistent improvement, however. Because county‐level data are of limited value for developing SDMs except for species that are widespread and well collected or that inhabit regions where small, climatically uniform counties predominate, three means of encouraging more accurate georeferencing in citizen‐science databases are provided. |
format | Online Article Text |
id | pubmed-5551104 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-5551104 2017-08-14 Quantifying the degree of bias from using county‐scale data in species distribution modeling: Can increasing sample size or using county‐averaged environmental data reduce distributional overprediction? Collins, Steven D. Abbott, John C. McIntyre, Nancy E. Ecol Evol Original Research Citizen‐science databases have been used to develop species distribution models (SDMs), although many taxa may be only georeferenced to county. It is tacitly assumed that SDMs built from county‐scale data should be less precise than those built with more accurate localities, but the extent of the bias is currently unknown. Our aims in this study were to illustrate the effects of using county‐scale data on the spatial extent and accuracy of SDMs relative to true locality data and to compare potential compensatory methods (including increased sample size and using overall county environmental averages rather than point locality environmental data). To do so, we developed SDMs in maxent with PRISM‐derived BIOCLIM parameters for 283 and 230 species of odonates (dragonflies and damselflies) and butterflies, respectively, for five subsets from the OdonataCentral and Butterflies and Moths of North America citizen‐science databases: (1) a true locality dataset, (2) a corresponding sister dataset of county‐centroid coordinates, (3) a dataset where the average environmental conditions within each county were assigned to each record, (4) a 50/50% mix of true localities and county‐centroid coordinates, and (5) a 50/50% mix of true localities and records assigned the average environmental conditions within each county. These mixtures allowed us to quantify the degree of bias from county‐scale data. Models developed with county centroids overpredicted the extent of suitable habitat by 15% on average compared to true locality models, although larger sample sizes (>100 locality records) reduced this disparity. Assigning county‐averaged environmental conditions did not offer consistent improvement, however. Because county‐level data are of limited value for developing SDMs except for species that are widespread and well collected or that inhabit regions where small, climatically uniform counties predominate, three means of encouraging more accurate georeferencing in citizen‐science databases are provided. John Wiley and Sons Inc. 2017-06-28 /pmc/articles/PMC5551104/ /pubmed/28808561 http://dx.doi.org/10.1002/ece3.3115 Text en © 2017 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution (http://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Research Collins, Steven D. Abbott, John C. McIntyre, Nancy E. Quantifying the degree of bias from using county‐scale data in species distribution modeling: Can increasing sample size or using county‐averaged environmental data reduce distributional overprediction? |
title | Quantifying the degree of bias from using county‐scale data in species distribution modeling: Can increasing sample size or using county‐averaged environmental data reduce distributional overprediction? |
title_full | Quantifying the degree of bias from using county‐scale data in species distribution modeling: Can increasing sample size or using county‐averaged environmental data reduce distributional overprediction? |
title_fullStr | Quantifying the degree of bias from using county‐scale data in species distribution modeling: Can increasing sample size or using county‐averaged environmental data reduce distributional overprediction? |
title_full_unstemmed | Quantifying the degree of bias from using county‐scale data in species distribution modeling: Can increasing sample size or using county‐averaged environmental data reduce distributional overprediction? |
title_short | Quantifying the degree of bias from using county‐scale data in species distribution modeling: Can increasing sample size or using county‐averaged environmental data reduce distributional overprediction? |
title_sort | quantifying the degree of bias from using county‐scale data in species distribution modeling: can increasing sample size or using county‐averaged environmental data reduce distributional overprediction? |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5551104/ https://www.ncbi.nlm.nih.gov/pubmed/28808561 http://dx.doi.org/10.1002/ece3.3115 |
work_keys_str_mv | AT collinsstevend quantifyingthedegreeofbiasfromusingcountyscaledatainspeciesdistributionmodelingcanincreasingsamplesizeorusingcountyaveragedenvironmentaldatareducedistributionaloverprediction AT abbottjohnc quantifyingthedegreeofbiasfromusingcountyscaledatainspeciesdistributionmodelingcanincreasingsamplesizeorusingcountyaveragedenvironmentaldatareducedistributionaloverprediction AT mcintyrenancye quantifyingthedegreeofbiasfromusingcountyscaledatainspeciesdistributionmodelingcanincreasingsamplesizeorusingcountyaveragedenvironmentaldatareducedistributionaloverprediction |