Forest tree species distribution for Europe 2000–2020: mapping potential and realized distributions using spatiotemporal machine learning

This article describes a data-driven framework based on spatiotemporal machine learning to produce distribution maps for 16 tree species (Abies alba Mill., Castanea sativa Mill., Corylus avellana L., Fagus sylvatica L., Olea europaea L., Picea abies L. H. Karst., Pinus halepensis Mill., Pinus nigra...

Bibliographic Details
Main Authors: Bonannella, Carmelo, Hengl, Tomislav, Heisig, Johannes, Parente, Leandro, Wright, Marvin N., Herold, Martin, de Bruin, Sytze
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects: Biogeography
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9332400/
https://www.ncbi.nlm.nih.gov/pubmed/35910765
http://dx.doi.org/10.7717/peerj.13728
author Bonannella, Carmelo
Hengl, Tomislav
Heisig, Johannes
Parente, Leandro
Wright, Marvin N.
Herold, Martin
de Bruin, Sytze
author_sort Bonannella, Carmelo
collection PubMed
description This article describes a data-driven framework based on spatiotemporal machine learning to produce distribution maps for 16 tree species (Abies alba Mill., Castanea sativa Mill., Corylus avellana L., Fagus sylvatica L., Olea europaea L., Picea abies L. H. Karst., Pinus halepensis Mill., Pinus nigra J. F. Arnold, Pinus pinea L., Pinus sylvestris L., Prunus avium L., Quercus cerris L., Quercus ilex L., Quercus robur L., Quercus suber L. and Salix caprea L.) at high spatial resolution (30 m). Tree occurrence data totalling three million points were used to train different algorithms: random forest, gradient-boosted trees, generalized linear models, k-nearest neighbors, CART and an artificial neural network. A stack of 305 coarse- and high-resolution covariates representing spectral reflectance, different biophysical conditions and biotic competition was used as predictors for realized distributions, while potential distribution was modelled with environmental predictors only. Logloss and computing time were used to select the three best algorithms to tune and train an ensemble model based on stacking with a logistic regressor as a meta-learner. An ensemble model was trained for each species: probability and model uncertainty maps of realized distribution were produced for each species using a time window of 4 years, for a total of six distribution maps per species, while for potential distributions only one map per species was produced. Results of spatial cross-validation show that the ensemble model consistently outperformed or performed as well as the best individual model in both potential and realized distribution tasks, with potential distribution models achieving higher predictive performance (TSS = 0.898, R²(logloss) = 0.857) than realized distribution models on average (TSS = 0.874, R²(logloss) = 0.839). Ensemble models for Q. suber achieved the best performance in both potential (TSS = 0.968, R²(logloss) = 0.952) and realized (TSS = 0.959, R²(logloss) = 0.949) distribution, while P. sylvestris (TSS = 0.731, 0.785, R²(logloss) = 0.585, 0.670, respectively, for potential and realized distribution) and P. nigra (TSS = 0.658, 0.686, R²(logloss) = 0.623, 0.664) achieved the worst. Importance of predictor variables differed across species and models: the most frequent and important predictors were the green band for summer and the Normalized Difference Vegetation Index (NDVI) for fall for realized distribution, and the diffuse irradiation and precipitation of the driest quarter (BIO17) for potential distribution. On average, fine-resolution models outperformed coarse-resolution models (250 m) for realized distribution (TSS = +6.5%, R²(logloss) = +7.5%). The framework shows how continuous and consistent Earth Observation time series data can be combined with state-of-the-art machine learning to derive dynamic distribution maps. The produced predictions can be used to quantify temporal trends of potential forest degradation and species composition change.
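The two accuracy measures quoted in the description can be illustrated with a minimal Python sketch. It is not the authors' code. TSS is computed in its standard form (sensitivity + specificity - 1). The R²(logloss) formula used here, one minus the ratio of the model's logloss to the logloss of a constant prediction equal to the observed prevalence, is an assumption made for illustration; the record itself does not define it. All function names are hypothetical.

```python
import math


def tss(y_true, y_pred):
    """True Skill Statistic for binary labels: sensitivity + specificity - 1.

    Both classes must be present in y_true, otherwise a division by zero occurs.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def log_loss(y_true, probs, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


def r2_logloss(y_true, probs):
    """Assumed definition: 1 - logloss(model) / logloss(prevalence baseline)."""
    prevalence = sum(y_true) / len(y_true)
    baseline = log_loss(y_true, [prevalence] * len(y_true))
    return 1 - log_loss(y_true, probs) / baseline


# Toy presence/absence data, invented for this example only.
observed = [1, 1, 0, 0]
predicted_class = [1, 0, 0, 0]
predicted_prob = [0.9, 0.4, 0.2, 0.1]

tss_value = tss(observed, predicted_class)
r2_value = r2_logloss(observed, predicted_prob)
```

Under this assumed definition, a model that only predicts the prevalence scores 0 and a perfect probabilistic model approaches 1, which matches the way the values in the description are read (higher is better).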
format Online
Article
Text
id pubmed-9332400
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-9332400 2022-07-29 PeerJ Biogeography PeerJ Inc. 2022-07-25 /pmc/articles/PMC9332400/ /pubmed/35910765 http://dx.doi.org/10.7717/peerj.13728 Text en © 2022 Bonannella et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title Forest tree species distribution for Europe 2000–2020: mapping potential and realized distributions using spatiotemporal machine learning
topic Biogeography
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9332400/
https://www.ncbi.nlm.nih.gov/pubmed/35910765
http://dx.doi.org/10.7717/peerj.13728