The importance of data quality for generating reliable distribution models for rare, elusive, and cryptic species
The availability of spatially referenced environmental data and species occurrence records in online databases enables practitioners to easily generate species distribution models (SDMs) for a broad array of taxa. Such databases often include occurrence records of unknown reliability, yet little info...
Main authors: | Aubry, Keith B.; Raley, Catherine M.; McKelvey, Kevin S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2017 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5480872/ https://www.ncbi.nlm.nih.gov/pubmed/28640819 http://dx.doi.org/10.1371/journal.pone.0179152 |
_version_ | 1783245316756078592 |
---|---|
author | Aubry, Keith B. Raley, Catherine M. McKelvey, Kevin S. |
collection | PubMed |
description | The availability of spatially referenced environmental data and species occurrence records in online databases enables practitioners to easily generate species distribution models (SDMs) for a broad array of taxa. Such databases often include occurrence records of unknown reliability, yet little information is available on the influence of data quality on SDMs generated for rare, elusive, and cryptic species that are prone to misidentification in the field. We investigated this question for the fisher (Pekania pennanti), a forest carnivore of conservation concern in the Pacific States that is often confused with the more common Pacific marten (Martes caurina). Fisher occurrence records supported by physical evidence (verifiable records) were available from a limited area, whereas occurrence records of unknown quality (unscreened records) were available from throughout the fisher’s historical range. We reserved 20% of the verifiable records to use as a test sample for both models and generated SDMs with each dataset using Maxent. The verifiable model performed substantially better than the unscreened model based on multiple metrics including AUC(test) values (0.78 and 0.62, respectively), evaluation of training and test gains, and statistical tests of how well each model predicted test localities. In addition, the verifiable model was consistent with our knowledge of the fisher’s habitat relations and potential distribution, whereas the unscreened model indicated a much broader area of high-quality habitat (indices > 0.5) that included large expanses of high-elevation habitat that fishers do not occupy. Because Pacific martens remain relatively common in upper elevation habitats in the Cascade Range and Sierra Nevada, the SDM based on unscreened records likely reflects primarily a conflation of marten and fisher habitat. Consequently, accurate identifications are far more important than the spatial extent of occurrence records for generating reliable SDMs for the fisher in this region. We strongly recommend that practitioners avoid using anecdotal occurrence records to build SDMs but, if such data are used, the validity of resulting models should be tested with verifiable occurrence records. |
format | Online Article Text |
id | pubmed-5480872 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5480872 2017-07-05 PLoS One Research Article Public Library of Science 2017-06-22 /pmc/articles/PMC5480872/ /pubmed/28640819 http://dx.doi.org/10.1371/journal.pone.0179152 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication. |
title | The importance of data quality for generating reliable distribution models for rare, elusive, and cryptic species |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5480872/ https://www.ncbi.nlm.nih.gov/pubmed/28640819 http://dx.doi.org/10.1371/journal.pone.0179152 |