
Identifying and characterizing extrapolation in multivariate response data

Faced with limitations in data availability, funding, and time constraints, ecologists are often tasked with making predictions beyond the range of their data. In ecological studies, it is not always obvious when and where extrapolation occurs because of the multivariate nature of the data. Previous...


Bibliographic Details
Main Authors: Bartley, Meridith L., Hanks, Ephraim M., Schliep, Erin M., Soranno, Patricia A., Wagner, Tyler
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6894872/
https://www.ncbi.nlm.nih.gov/pubmed/31805095
http://dx.doi.org/10.1371/journal.pone.0225715
_version_ 1783476476807479296
author Bartley, Meridith L.
Hanks, Ephraim M.
Schliep, Erin M.
Soranno, Patricia A.
Wagner, Tyler
author_facet Bartley, Meridith L.
Hanks, Ephraim M.
Schliep, Erin M.
Soranno, Patricia A.
Wagner, Tyler
author_sort Bartley, Meridith L.
collection PubMed
description Faced with limitations in data availability, funding, and time constraints, ecologists are often tasked with making predictions beyond the range of their data. In ecological studies, it is not always obvious when and where extrapolation occurs because of the multivariate nature of the data. Previous work on identifying extrapolation has focused on univariate response data, but these methods are not directly applicable to multivariate response data, which are common in ecological investigations. In this paper, we extend previous work that identified extrapolation by applying the predictive variance from the univariate setting to the multivariate case. We propose using the trace or determinant of the predictive variance matrix to obtain a scalar value measure that, when paired with a selected cutoff value, allows for delineation between prediction and extrapolation. We illustrate our approach through an analysis of jointly modeled lake nutrients and indicators of algal biomass and water clarity in over 7000 inland lakes from across the Northeast and Mid-west US. In addition, we outline novel exploratory approaches for identifying regions of covariate space where extrapolation is more likely to occur using classification and regression trees. The use of our Multivariate Predictive Variance (MVPV) measures and multiple cutoff values when exploring the validity of predictions made from multivariate statistical models can help guide ecological inferences.
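As a rough illustration of the idea in the abstract, the sketch below reduces each predictive variance matrix to a scalar by taking its trace or determinant and compares that scalar to a cutoff. It is a minimal sketch, not the authors' implementation: the function names `mvpv` and `flag_extrapolation`, the toy covariance matrices, and the choice of the 0.95 quantile of in-sample values as the cutoff are all assumptions made for illustration. The abstract says only that a cutoff is selected and that multiple cutoffs may be explored.

```python
import numpy as np


def mvpv(pred_cov):
    """Return (trace, determinant) for a stack of predictive covariance
    matrices of shape (n_locations, q, q), where q is the number of responses."""
    pred_cov = np.asarray(pred_cov, dtype=float)
    trace = np.trace(pred_cov, axis1=1, axis2=2)
    det = np.linalg.det(pred_cov)
    return trace, det


def flag_extrapolation(values, reference, quantile=0.95):
    """Flag values above a cutoff taken as a quantile of reference values.
    The quantile-based cutoff is an assumption, not taken from the source."""
    cutoff = np.quantile(reference, quantile)
    return values > cutoff, cutoff


# Toy predictive covariance matrices for two jointly modeled responses.
# These numbers are made up purely for illustration.
in_sample_cov = np.array([np.eye(2) * s for s in (1.0, 1.1, 1.2, 0.9, 1.0)])
new_cov = np.array([np.eye(2) * 1.0, np.eye(2) * 5.0])

ref_trace, ref_det = mvpv(in_sample_cov)
new_trace, new_det = mvpv(new_cov)

flags_trace, cutoff_trace = flag_extrapolation(new_trace, ref_trace)
flags_det, cutoff_det = flag_extrapolation(new_det, ref_det)

print("trace flags:", flags_trace, "cutoff:", cutoff_trace)
print("det flags:  ", flags_det, "cutoff:", cutoff_det)
```

In this toy setup the second new location has a much larger predictive variance than any in-sample location, so both measures flag it. In practice the two measures can disagree, because the determinant also reflects the covariance between responses while the trace sums only the marginal variances.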
format Online
Article
Text
id pubmed-6894872
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-68948722019-12-14 Identifying and characterizing extrapolation in multivariate response data Bartley, Meridith L. Hanks, Ephraim M. Schliep, Erin M. Soranno, Patricia A. Wagner, Tyler PLoS One Research Article Faced with limitations in data availability, funding, and time constraints, ecologists are often tasked with making predictions beyond the range of their data. In ecological studies, it is not always obvious when and where extrapolation occurs because of the multivariate nature of the data. Previous work on identifying extrapolation has focused on univariate response data, but these methods are not directly applicable to multivariate response data, which are common in ecological investigations. In this paper, we extend previous work that identified extrapolation by applying the predictive variance from the univariate setting to the multivariate case. We propose using the trace or determinant of the predictive variance matrix to obtain a scalar value measure that, when paired with a selected cutoff value, allows for delineation between prediction and extrapolation. We illustrate our approach through an analysis of jointly modeled lake nutrients and indicators of algal biomass and water clarity in over 7000 inland lakes from across the Northeast and Mid-west US. In addition, we outline novel exploratory approaches for identifying regions of covariate space where extrapolation is more likely to occur using classification and regression trees. The use of our Multivariate Predictive Variance (MVPV) measures and multiple cutoff values when exploring the validity of predictions made from multivariate statistical models can help guide ecological inferences. 
Public Library of Science 2019-12-05 /pmc/articles/PMC6894872/ /pubmed/31805095 http://dx.doi.org/10.1371/journal.pone.0225715 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
spellingShingle Research Article
Bartley, Meridith L.
Hanks, Ephraim M.
Schliep, Erin M.
Soranno, Patricia A.
Wagner, Tyler
Identifying and characterizing extrapolation in multivariate response data
title Identifying and characterizing extrapolation in multivariate response data
title_full Identifying and characterizing extrapolation in multivariate response data
title_fullStr Identifying and characterizing extrapolation in multivariate response data
title_full_unstemmed Identifying and characterizing extrapolation in multivariate response data
title_short Identifying and characterizing extrapolation in multivariate response data
title_sort identifying and characterizing extrapolation in multivariate response data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6894872/
https://www.ncbi.nlm.nih.gov/pubmed/31805095
http://dx.doi.org/10.1371/journal.pone.0225715
work_keys_str_mv AT bartleymeridithl identifyingandcharacterizingextrapolationinmultivariateresponsedata
AT hanksephraimm identifyingandcharacterizingextrapolationinmultivariateresponsedata
AT schlieperinm identifyingandcharacterizingextrapolationinmultivariateresponsedata
AT sorannopatriciaa identifyingandcharacterizingextrapolationinmultivariateresponsedata
AT wagnertyler identifyingandcharacterizingextrapolationinmultivariateresponsedata