Fitting parametric random effects models in very large data sets with application to VHA national data

BACKGROUND: With the current focus on personalized medicine, patient/subject level inference is often of key interest in translational research. As a result, random effects models (REM) are becoming popular for patient level inference. However, for very large data sets that are characterized by larg...

Bibliographic Details
Main authors:	Gebregziabher, Mulugeta; Egede, Leonard; Gilbert, Gregory E; Hunt, Kelly; Nietert, Paul J; Mauldin, Patrick
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3542162/ https://www.ncbi.nlm.nih.gov/pubmed/23095325 http://dx.doi.org/10.1186/1471-2288-12-163

author	Gebregziabher, Mulugeta; Egede, Leonard; Gilbert, Gregory E; Hunt, Kelly; Nietert, Paul J; Mauldin, Patrick
collection	PubMed
description	BACKGROUND: With the current focus on personalized medicine, patient/subject level inference is often of key interest in translational research. As a result, random effects models (REM) are becoming popular for patient level inference. However, for very large data sets characterized by large sample size, it can be difficult to fit REM using commonly available statistical software such as SAS, because the fit requires inordinate amounts of computer time and memory beyond what is available, which prevents model convergence. For example, in a retrospective cohort study of over 800,000 Veterans with type 2 diabetes with longitudinal data over 5 years, fitting REM via generalized linear mixed modeling using currently available standard procedures in SAS (e.g. PROC GLIMMIX) was very difficult, and the same problems exist in Stata’s gllamm or R’s lme packages. Thus, this study proposes a meta regression approach, assesses its performance, and compares it with methods based on sampling of the full data. DATA: We use both simulated and real data from a national cohort of Veterans with type 2 diabetes (n=890,394), which was created by linking multiple patient and administrative files, resulting in a cohort with longitudinal data collected over 5 years. METHODS AND RESULTS: The outcome of interest was mean annual HbA1c measured over a 5-year period. Using this outcome, we compared parameter estimates from the proposed random effects meta regression (REMR) with estimates based on simple random sampling and on VISN (Veterans Integrated Service Networks) based stratified sampling of the full data. Our results indicate that REMR provides parameter estimates that are less likely to be biased and have tighter confidence intervals when the VISN level estimates are homogeneous. CONCLUSION: When the goal is to fit REM in repeated measures data with very large sample size, REMR can be used as a good alternative. It leads to reasonable inference for both Gaussian and non-Gaussian responses if parameter estimates are homogeneous across VISNs.
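The abstract does not spell out how REMR combines the VISN level fits, so the following is only a minimal sketch of the general idea: fit the model separately in each stratum, then pool the per-stratum coefficient estimates. It assumes DerSimonian-Laird random-effects pooling as a stand-in for the pooling step; the function name `pool_random_effects` and all input numbers are hypothetical and are not taken from the paper.

```python
import math


def pool_random_effects(estimates, std_errors):
    """Pool per-stratum estimates of one coefficient.

    Assumption: DerSimonian-Laird random-effects pooling, used here only as
    an illustration; the record does not state the estimator the paper uses.
    Returns (pooled_estimate, pooled_std_error, tau2, q).
    """
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two estimates with matching standard errors")

    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) pooled estimate.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * b for w, b in zip(weights, estimates)) / sum_w

    # Cochran's Q measures heterogeneity of the stratum estimates.
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, estimates))

    # Method-of-moments between-stratum variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-stratum variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * b for w, b in zip(re_weights, estimates)) / sum_re
    pooled_se = math.sqrt(1.0 / sum_re)
    return pooled, pooled_se, tau2, q


if __name__ == "__main__":
    # Made-up slope estimates and standard errors from four hypothetical strata.
    slopes = [0.21, 0.19, 0.23, 0.20]
    ses = [0.02, 0.03, 0.025, 0.02]
    pooled, pooled_se, tau2, q = pool_random_effects(slopes, ses)
    print(f"pooled={pooled:.4f} se={pooled_se:.4f} tau2={tau2:.6f} Q={q:.3f}")
```

When the stratum estimates agree closely, the estimated between-stratum variance shrinks to zero and the pooled value reduces to the inverse-variance weighted mean, which is in keeping with the abstract's caveat that the approach works best when VISN level estimates are homogeneous.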
format	Online Article Text
id	pubmed-3542162
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3542162 2013-01-11 BMC Med Res Methodol Research Article BioMed Central 2012-10-24 /pmc/articles/PMC3542162/ /pubmed/23095325 http://dx.doi.org/10.1186/1471-2288-12-163 Text en Copyright ©2012 Gebregziabher et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Fitting parametric random effects models in very large data sets with application to VHA national data
topic	Research Article
