Generalized Linear Models outperform commonly used canonical analysis in estimating spatial structure of presence/absence data

BACKGROUND: Ecological communities tend to be spatially structured due to environmental gradients and/or spatially contagious processes such as growth, dispersion and species interactions. Data transformation followed by usage of algorithms such as Redundancy Analysis (RDA) is a fairly common approa...

Bibliographic Details
Main authors:	Carlos-Júnior, Lélis A.; Creed, Joel C.; Marrs, Rob; Lewis, Rob J.; Moulton, Timothy P.; Feijó-Lima, Rafael; Spencer, Matthew
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2020
Subjects:	Biogeography
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7474884/ https://www.ncbi.nlm.nih.gov/pubmed/32953266 http://dx.doi.org/10.7717/peerj.9777

_version_	1783579409419075584
author	Carlos-Júnior, Lélis A. Creed, Joel C. Marrs, Rob Lewis, Rob J. Moulton, Timothy P. Feijó-Lima, Rafael Spencer, Matthew
author_sort	Carlos-Júnior, Lélis A.
collection	PubMed
description	BACKGROUND: Ecological communities tend to be spatially structured due to environmental gradients and/or spatially contagious processes such as growth, dispersion and species interactions. Data transformation followed by usage of algorithms such as Redundancy Analysis (RDA) is a fairly common approach in studies searching for spatial structure in ecological communities, despite recent suggestions advocating the use of Generalized Linear Models (GLMs). Here, we compared the performance of GLMs and RDA in describing spatial structure in ecological community composition data. We simulated realistic presence/absence data typical of many β-diversity studies. For model selection we used standard methods commonly used in most studies involving RDA and GLMs. METHODS: We simulated communities with known spatial structure, based on three real spatial community presence/absence datasets (one terrestrial, one marine and one freshwater). We used spatial eigenvectors as explanatory variables. We varied the number of non-zero coefficients of the spatial variables, and the spatial scales with which these coefficients were associated and then compared the performance of GLMs and RDA frameworks to correctly retrieve the spatial patterns contained in the simulated communities. We used two different methods for model selection, Forward Selection (FW) for RDA and the Akaike Information Criterion (AIC) for GLMs. The performance of each method was assessed by scoring overall accuracy as the proportion of variables whose inclusion/exclusion status was correct, and by distinguishing which kind of error was observed for each method. We also assessed whether errors in variable selection could affect the interpretation of spatial structure. RESULTS: Overall GLM with AIC-based model selection (GLM/AIC) performed better than RDA/FW in selecting spatial explanatory variables, although under some simulations the methods performed similarly. 
In general, RDA/FW performed unpredictably, often retaining too many explanatory variables and selecting variables associated with incorrect spatial scales. The spatial scale of the pattern had a negligible effect on GLM/AIC performance but consistently affected RDA’s error rates under almost all scenarios. CONCLUSION: We encourage the use of GLM/AIC for studies searching for spatial drivers of species presence/absence patterns, since this framework outperformed RDA/FW in situations most likely to be found in natural communities. It is likely that such recommendations might extend to other types of explanatory variables.
format	Online Article Text
id	pubmed-7474884
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-7474884 2020-09-18 PeerJ Biogeography PeerJ Inc. 2020-09-03 /pmc/articles/PMC7474884/ /pubmed/32953266 http://dx.doi.org/10.7717/peerj.9777 Text en ©2020 Carlos-Júnior et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title	Generalized Linear Models outperform commonly used canonical analysis in estimating spatial structure of presence/absence data
topic	Biogeography
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7474884/ https://www.ncbi.nlm.nih.gov/pubmed/32953266 http://dx.doi.org/10.7717/peerj.9777
