
A Survey of Active Learning for Quantifying Vegetation Traits from Terrestrial Earth Observation Data

The current exponential increase of spatiotemporally explicit data streams from satellite-based Earth observation missions offers promising opportunities for global vegetation monitoring. Intelligent sampling through active learning (AL) heuristics provides a pathway for fast inference of essential v...


Bibliographic Details
Main authors: Berger, Katja; Caicedo, Juan Pablo Rivera; Martino, Luca; Wocher, Matthias; Hank, Tobias; Verrelst, Jochem
Format: Online Article Text
Language: English
Published: 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7613397/
https://www.ncbi.nlm.nih.gov/pubmed/36081683
http://dx.doi.org/10.3390/rs13020287
_version_ 1783605477399068672
author Berger, Katja
Caicedo, Juan Pablo Rivera
Martino, Luca
Wocher, Matthias
Hank, Tobias
Verrelst, Jochem
author_sort Berger, Katja
collection PubMed
description The current exponential increase of spatiotemporally explicit data streams from satellite-based Earth observation missions offers promising opportunities for global vegetation monitoring. Intelligent sampling through active learning (AL) heuristics provides a pathway for fast inference of essential vegetation variables by means of hybrid retrieval approaches, i.e., machine learning regression algorithms trained on radiative transfer model (RTM) simulations. In this study we summarize AL theory and perform a brief systematic literature survey of AL heuristics used in the context of Earth observation regression problems over terrestrial targets. Across all relevant studies it appeared that: (i) models trained on AL-optimized training data sets achieved higher retrieval accuracy than models trained on large randomly sampled data sets, and (ii) the Euclidean distance-based diversity (EBD) method tends to be the most efficient AL technique in terms of accuracy and computational demand. Additionally, a case study based on experimental data is presented, employing both uncertainty and diversity AL criteria. Here, a simulated training database generated by the PROSAIL-PRO canopy RTM is used to demonstrate the benefit of AL techniques for the estimation of total leaf carotenoid content (C_xc) and leaf water content (C_w). Gaussian process regression (GPR) was incorporated to minimize and optimize the training data set with AL. Training the GPR algorithm on optimally AL-sampled data sets led to improved variable retrievals compared to training on full data pools, which is further demonstrated on a mapping example. From these findings we can recommend the use of AL-based sub-sampling procedures to select the most informative samples out of large training data pools. This will not only optimize regression accuracy through the exclusion of redundant information, but also speed up processing and reduce the final model size of kernel-based machine learning regression algorithms, such as GPR. With this study we want to encourage further testing and implementation of AL sampling methods for hybrid retrieval workflows. AL can contribute to the solution of regression problems within the framework of operational vegetation monitoring using satellite imaging spectroscopy data, and may strongly facilitate data processing on cloud-computing platforms.
format Online
Article
Text
id pubmed-7613397
institution National Center for Biotechnology Information
language English
publishDate 2021
record_format MEDLINE/PubMed
spelling pubmed-7613397 2022-09-07 A Survey of Active Learning for Quantifying Vegetation Traits from Terrestrial Earth Observation Data Berger, Katja Caicedo, Juan Pablo Rivera Martino, Luca Wocher, Matthias Hank, Tobias Verrelst, Jochem Remote Sens (Basel) Article 2021-01-15 /pmc/articles/PMC7613397/ /pubmed/36081683 http://dx.doi.org/10.3390/rs13020287 Text en https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Survey of Active Learning for Quantifying Vegetation Traits from Terrestrial Earth Observation Data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7613397/
https://www.ncbi.nlm.nih.gov/pubmed/36081683
http://dx.doi.org/10.3390/rs13020287
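
Note on the abstract: the Euclidean distance-based diversity (EBD) criterion mentioned in the description can be illustrated with a small sketch. The function name `ebd_select`, its parameters, and the toy data are hypothetical and are not taken from the article; the code shows only a greedy distance-based selection step under the assumption that "diversity" means picking pool samples farthest from the samples already in the training set. It is not the authors' implementation, and retraining a regression model such as GPR on the selected samples is omitted.

```python
import numpy as np


def ebd_select(train_X, pool_X, n_select):
    """Greedily pick pool indices that lie farthest from the training set.

    Hypothetical helper: at each step the pool sample with the largest
    Euclidean distance to its nearest already-selected sample is chosen.
    """
    train_X = np.asarray(train_X, dtype=float)
    pool_X = np.asarray(pool_X, dtype=float)

    # Distance from every pool sample to its nearest training sample.
    diffs = pool_X[:, None, :] - train_X[None, :, :]
    min_dist = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)

    chosen = []
    for _ in range(min(n_select, len(pool_X))):
        idx = int(np.argmax(min_dist))
        chosen.append(idx)
        # The chosen sample joins the training set, so nearest-neighbour
        # distances of the remaining pool samples may shrink.
        new_dist = np.sqrt(((pool_X - pool_X[idx]) ** 2).sum(axis=1))
        min_dist = np.minimum(min_dist, new_dist)
        min_dist[idx] = -np.inf  # never pick the same sample twice
    return chosen


# Toy data (invented for illustration): one training point, four candidates.
train = np.array([[0.0, 0.0]])
pool = np.array([[0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [-4.0, 0.0]])
picked = ebd_select(train, pool, 2)
print(picked)
```

In this toy case the second pick skips the candidate at (5.0, 5.0), because it sits next to the sample chosen first and would add mostly redundant information, which is the effect the abstract attributes to AL-based sub-sampling.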