
On the selection of thresholds for predicting species occurrence with presence‐only data

Presence‐only data present challenges for selecting thresholds to transform species distribution modeling results into binary outputs. In this article, we compare two recently published threshold selection methods (maxSSS and maxF (pb)) and examine the effectiveness of the threshold‐based prevalence...


Bibliographic Details
Main Authors:	Liu, Canran; Newell, Graeme; White, Matt
Format:	Online Article Text
Language:	English
Published:	John Wiley and Sons Inc. 2015
Subjects:	Original Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4716501/ https://www.ncbi.nlm.nih.gov/pubmed/26811797 http://dx.doi.org/10.1002/ece3.1878

_version_	1782410547319275520
author	Liu, Canran Newell, Graeme White, Matt
author_facet	Liu, Canran Newell, Graeme White, Matt
author_sort	Liu, Canran
collection	PubMed
description	Presence‐only data present challenges for selecting thresholds to transform species distribution modeling results into binary outputs. In this article, we compare two recently published threshold selection methods (maxSSS and maxF (pb)) and examine the effectiveness of the threshold‐based prevalence estimation approach. Six virtual species with varying prevalence were simulated within a real landscape in southeastern Australia. Presence‐only models were built with DOMAIN, generalized linear model, Maxent, and Random Forest. Thresholds were selected with the two methods, maxSSS and maxF (pb), using four presence‐only datasets with different ratios of the number of known presences to the number of random points (KP–RP (ratio)). Sensitivity, specificity, true skill statistic, and F measure were used to evaluate the performance of the results. Species prevalence was estimated as the ratio of the number of predicted presences to the total number of points in the evaluation dataset. Thresholds selected with maxF (pb) varied as the KP–RP (ratio) of the threshold selection datasets changed. Datasets with a KP–RP (ratio) around 1 generally produced better results than datasets with ratios distant from 1. Results produced by maxF (pb) had specificity too low for very common species using Random Forest and Maxent models. In contrast, maxSSS produced consistent results whichever dataset was used. The estimation of prevalence was almost always biased, and the bias was very large for DOMAIN and Random Forest predictions. We conclude that maxF (pb) is affected by the KP–RP (ratio) of the threshold selection datasets, but maxSSS is almost unaffected by this ratio. Unbiased estimates of prevalence are difficult to obtain using the threshold‐based approach.
format	Online Article Text
id	pubmed-4716501
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	John Wiley and Sons Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-4716501 2016-01-25 On the selection of thresholds for predicting species occurrence with presence‐only data Liu, Canran Newell, Graeme White, Matt Ecol Evol Original Research John Wiley and Sons Inc. 2015-12-29 /pmc/articles/PMC4716501/ /pubmed/26811797 http://dx.doi.org/10.1002/ece3.1878 Text en © 2015 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution (http://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Original Research Liu, Canran Newell, Graeme White, Matt On the selection of thresholds for predicting species occurrence with presence‐only data
title	On the selection of thresholds for predicting species occurrence with presence‐only data
title_full	On the selection of thresholds for predicting species occurrence with presence‐only data
title_fullStr	On the selection of thresholds for predicting species occurrence with presence‐only data
title_full_unstemmed	On the selection of thresholds for predicting species occurrence with presence‐only data
title_short	On the selection of thresholds for predicting species occurrence with presence‐only data
title_sort	on the selection of thresholds for predicting species occurrence with presence‐only data
topic	Original Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4716501/ https://www.ncbi.nlm.nih.gov/pubmed/26811797 http://dx.doi.org/10.1002/ece3.1878
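
The description above defines the prevalence estimate as the number of predicted presences divided by the number of points in the evaluation dataset, and names maxSSS as one of the two threshold selection methods. The sketch below illustrates that arithmetic under stated assumptions: maxSSS is taken to mean the threshold that maximizes the sum of sensitivity and specificity, random points are treated as pseudo-absences when computing specificity, and a score at or above the threshold counts as a predicted presence. All function names and the candidate-threshold search are hypothetical and are not taken from the article.

```python
def sensitivity(presence_scores, threshold):
    """Share of known presences predicted present (score >= threshold)."""
    return sum(s >= threshold for s in presence_scores) / len(presence_scores)


def specificity(random_scores, threshold):
    """Share of random points predicted absent (score < threshold).

    Assumption: random points stand in for absences.
    """
    return sum(s < threshold for s in random_scores) / len(random_scores)


def true_skill_statistic(presence_scores, random_scores, threshold):
    """TSS = sensitivity + specificity - 1."""
    return (sensitivity(presence_scores, threshold)
            + specificity(random_scores, threshold) - 1.0)


def max_sss_threshold(presence_scores, random_scores):
    """Return the candidate threshold with the largest sensitivity + specificity.

    Candidates are the observed scores; ties go to the lowest threshold.
    """
    candidates = sorted(set(presence_scores) | set(random_scores))
    return max(
        candidates,
        key=lambda t: sensitivity(presence_scores, t)
        + specificity(random_scores, t),
    )


def estimated_prevalence(evaluation_scores, threshold):
    """Predicted presences divided by total points in the evaluation set."""
    predicted = sum(s >= threshold for s in evaluation_scores)
    return predicted / len(evaluation_scores)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    known_presences = [0.9, 0.8, 0.7, 0.4]
    random_points = [0.6, 0.3, 0.2, 0.1, 0.5, 0.85]
    evaluation = [0.9, 0.75, 0.2, 0.1, 0.3, 0.65, 0.05, 0.7, 0.4, 0.5]

    t = max_sss_threshold(known_presences, random_points)
    print("threshold:", t)
    print("TSS:", true_skill_statistic(known_presences, random_points, t))
    print("estimated prevalence:", estimated_prevalence(evaluation, t))
```

Changing the number of random points relative to known presences (the KP–RP ratio in the description) only rescales the specificity denominator in this sketch, which gives some intuition for why a threshold chosen this way can be insensitive to that ratio.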
