
It’s all relative: Regression analysis with compositional predictors

Compositional data reside in a simplex and measure fractions or proportions of parts to a whole. Most existing regression methods for such data rely on log-ratio transformations that are inadequate or inappropriate in modeling high-dimensional data with excessive zeros and hierarchical structures. M...


Bibliographic Details
Main authors: Li, Gen; Li, Yan; Chen, Kun
Format: Online Article Text
Language: English
Published: 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9767704/
https://www.ncbi.nlm.nih.gov/pubmed/35616500
http://dx.doi.org/10.1111/biom.13703
_version_ 1784854017675886592
author Li, Gen
Li, Yan
Chen, Kun
collection PubMed
description Compositional data reside in a simplex and measure fractions or proportions of parts to a whole. Most existing regression methods for such data rely on log-ratio transformations that are inadequate or inappropriate in modeling high-dimensional data with excessive zeros and hierarchical structures. Moreover, such models usually lack a straightforward interpretation due to the interrelation between parts of a composition. We develop a novel relative-shift regression framework that directly uses proportions as predictors. The new framework provides a paradigm shift for regression analysis with compositional predictors and offers a superior interpretation of how shifting concentration between parts affects the response. New equi-sparsity and tree-guided regularization methods and an efficient smoothing proximal gradient algorithm are developed to facilitate feature aggregation and dimension reduction in regression. A unified finite-sample prediction error bound is derived for the proposed regularized estimators. We demonstrate the efficacy of the proposed methods in extensive simulation studies and a real gut microbiome study. Guided by the taxonomy of the microbiome data, the framework identifies important taxa at different taxonomic levels associated with the neurodevelopment of preterm infants.
format Online
Article
Text
id pubmed-9767704
institution National Center for Biotechnology Information
language English
publishDate 2023
record_format MEDLINE/PubMed
spelling pubmed-9767704 2023-06-27 It’s all relative: Regression analysis with compositional predictors Li, Gen Li, Yan Chen, Kun Biometrics Article 2023-06 2022-07-11 /pmc/articles/PMC9767704/ /pubmed/35616500 http://dx.doi.org/10.1111/biom.13703 Text en https://creativecommons.org/licenses/by-nc/4.0/ This is an open access article under the terms of the Creative Commons Attribution-NonCommercial (https://creativecommons.org/licenses/by-nc/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title It’s all relative: Regression analysis with compositional predictors
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9767704/
https://www.ncbi.nlm.nih.gov/pubmed/35616500
http://dx.doi.org/10.1111/biom.13703