Variable selection in multivariate multiple regression

INTRODUCTION: In many practical situations, we are interested in the effect of covariates on correlated multiple responses. In this paper, we focus on estimation and variable selection in multi-response multiple regression models. Correlation among the response variables must be modeled for valid in...

Bibliographic Details
Main Authors: Variyath, Asokan Mulayath; Brobbey, Anita
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7367460/
https://www.ncbi.nlm.nih.gov/pubmed/32678828
http://dx.doi.org/10.1371/journal.pone.0236067
_version_ 1783560427882414080
author Variyath, Asokan Mulayath
Brobbey, Anita
author_facet Variyath, Asokan Mulayath
Brobbey, Anita
author_sort Variyath, Asokan Mulayath
collection PubMed
description INTRODUCTION: In many practical situations, we are interested in the effect of covariates on correlated multiple responses. In this paper, we focus on estimation and variable selection in multi-response multiple regression models. Correlation among the response variables must be modeled for valid inference. METHOD: We used an extension of the generalized estimating equation (GEE) methodology to simultaneously analyze binary, count, and continuous outcomes with nonlinear functions. Variable selection plays an important role in modeling correlated responses because of the large number of model parameters that must be estimated. We propose a penalized-likelihood approach based on the extended GEEs for simultaneous parameter estimation and variable selection. RESULTS AND CONCLUSIONS: We conducted a series of Monte Carlo simulations to investigate the performance of our method, considering different sample sizes and numbers of response variables. The results showed that our method works well compared to treating the responses as uncorrelated. We recommend using an unstructured correlation model with the Bayesian information criterion (BIC) to select the tuning parameters. We demonstrated our method using data from a concrete slump test.
format Online
Article
Text
id pubmed-7367460
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-73674602020-08-05 Variable selection in multivariate multiple regression Variyath, Asokan Mulayath Brobbey, Anita PLoS One Research Article Public Library of Science 2020-07-17 /pmc/articles/PMC7367460/ /pubmed/32678828 http://dx.doi.org/10.1371/journal.pone.0236067 Text en © 2020 Variyath, Brobbey http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Variyath, Asokan Mulayath
Brobbey, Anita
Variable selection in multivariate multiple regression
title Variable selection in multivariate multiple regression
topic Research Article