
Combining distributed regression and propensity scores: a doubly privacy-protecting analytic method for multicenter research

PURPOSE: Sharing of detailed individual-level data continues to pose challenges in multi-center studies. This issue can be addressed in part by using analytic methods that require only summary-level information to perform the desired multivariable-adjusted analysis. We examined the feasibility and e...


Bibliographic Details
Main authors: Toh, Sengwee, Wellman, Robert, Coley, R Yates, Horgan, Casie, Sturtevant, Jessica, Moyneur, Erick, Janning, Cheri, Pardee, Roy, Coleman, Karen J, Arterburn, David, McTigue, Kathleen, Anau, Jane, Cook, Andrea J
Format: Online Article Text
Language: English
Published: Dove Medical Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6267363/
https://www.ncbi.nlm.nih.gov/pubmed/30568510
http://dx.doi.org/10.2147/CLEP.S178163
_version_ 1783376055363436544
author Toh, Sengwee
Wellman, Robert
Coley, R Yates
Horgan, Casie
Sturtevant, Jessica
Moyneur, Erick
Janning, Cheri
Pardee, Roy
Coleman, Karen J
Arterburn, David
McTigue, Kathleen
Anau, Jane
Cook, Andrea J
author_sort Toh, Sengwee
collection PubMed
description PURPOSE: Sharing of detailed individual-level data continues to pose challenges in multi-center studies. This issue can be addressed in part by using analytic methods that require only summary-level information to perform the desired multivariable-adjusted analysis. We examined the feasibility and empirical validity of 1) conducting multivariable-adjusted distributed linear regression and 2) combining distributed linear regression with propensity scores, in a large distributed data network. PATIENTS AND METHODS: We compared percent total weight loss 1-year postsurgery between Roux-en-Y gastric bypass and sleeve gastrectomy procedure among 43,110 patients from 36 health systems in the National Patient-Centered Clinical Research Network. We adjusted for baseline demographic and clinical variables as individual covariates, deciles of propensity scores, or both, in three separate outcome regression models. We used distributed linear regression, a method that requires only summary-level information (specifically, sums of squares and cross products matrix) from sites, to fit the three ordinary least squares linear regression models. A comparison set of analyses that used pooled deidentified individual-level data from sites served as the reference. RESULTS: Distributed linear regression produced results identical to those from the corresponding pooled individual-level data analysis for all variables in all three models. The maximum numerical difference in the parameter estimate or standard error for all the variables was 3×10^−11 across three models. CONCLUSION: Distributed linear regression analysis is a feasible and valid analytic method in multicenter studies for one-time continuous outcomes. Combining distributed regression with propensity scores via modeling offers more privacy protection and analytic flexibility.
format Online
Article
Text
id pubmed-6267363
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Dove Medical Press
record_format MEDLINE/PubMed
spelling pubmed-6267363 2018-12-19 Combining distributed regression and propensity scores: a doubly privacy-protecting analytic method for multicenter research Toh, Sengwee Wellman, Robert Coley, R Yates Horgan, Casie Sturtevant, Jessica Moyneur, Erick Janning, Cheri Pardee, Roy Coleman, Karen J Arterburn, David McTigue, Kathleen Anau, Jane Cook, Andrea J Clin Epidemiol Original Research Dove Medical Press 2018-11-27 /pmc/articles/PMC6267363/ /pubmed/30568510 http://dx.doi.org/10.2147/CLEP.S178163 Text en © 2018 Toh et al. This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (http://creativecommons.org/licenses/by-nc/3.0/). By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed.
title Combining distributed regression and propensity scores: a doubly privacy-protecting analytic method for multicenter research
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6267363/
https://www.ncbi.nlm.nih.gov/pubmed/30568510
http://dx.doi.org/10.2147/CLEP.S178163
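
The abstract describes distributed linear regression as needing only a sums of squares and cross products (SSCP) matrix from each site to fit an ordinary least squares model. The sketch below illustrates that idea under stated assumptions and is not the authors' implementation: the function names `site_summary` and `fit_from_summaries` are hypothetical, and the covariate matrix `X` stands in for whatever columns a model uses (individual covariates, indicator columns for propensity score deciles, or both).

```python
import numpy as np


def site_summary(X, y):
    """Run locally at one site; only aggregates leave the site.

    X: (n, k) covariate matrix, y: (n,) continuous outcome.
    Returns the SSCP matrix of [intercept, X, y] and the row count.
    """
    Z = np.column_stack([np.ones(len(y)), X, y])
    return Z.T @ Z, len(y)


def fit_from_summaries(summaries):
    """Run centrally; OLS estimates and standard errors from summed SSCP."""
    sscp = sum(s for s, _ in summaries)   # element-wise sum across sites
    n = sum(m for _, m in summaries)      # total number of patients

    xtx = sscp[:-1, :-1]                  # X'X (including the intercept)
    xty = sscp[:-1, -1]                   # X'y
    yty = sscp[-1, -1]                    # y'y

    beta = np.linalg.solve(xtx, xty)      # normal equations: X'X b = X'y
    p = xtx.shape[0]
    rss = yty - beta @ xty                # residual sum of squares
    sigma2 = rss / (n - p)                # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(xtx)))
    return beta, se
```

Because the per-site SSCP matrices add up to the SSCP matrix of the pooled data, the central fit solves the same normal equations as a pooled individual-level analysis, which is consistent with the agreement to within floating-point error reported in the abstract. Solving the normal equations directly can be numerically fragile when covariates are highly collinear, so a production analysis might prefer a more stable factorization.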