
Privacy-protecting multivariable-adjusted distributed regression analysis for multi-center pediatric study

BACKGROUND: Privacy-protecting analytic approaches without centralized pooling of individual-level data, such as distributed regression, are particularly important for vulnerable populations, such as children, but these methods have not yet been tested in multi-center pediatric studies. METHODS: Usi...


Bibliographic Details
Main authors: Toh, Sengwee, Rifas-Shiman, Sheryl L., Lin, Pi-I, Bailey, L. Charles, Forrest, Christopher B., Horgan, Casie E., Lunsford, Douglas, Moyneur, Erick, Sturtevant, Jessica L., Young, Jessica G., Block, Jason P.
Format: Online Article Text
Language: English
Published: 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7113085/
https://www.ncbi.nlm.nih.gov/pubmed/31578038
http://dx.doi.org/10.1038/s41390-019-0596-0
author Toh, Sengwee
Rifas-Shiman, Sheryl L.
Lin, Pi-I
Bailey, L. Charles
Forrest, Christopher B.
Horgan, Casie E.
Lunsford, Douglas
Moyneur, Erick
Sturtevant, Jessica L.
Young, Jessica G.
Block, Jason P.
author_sort Toh, Sengwee
collection PubMed
description BACKGROUND: Privacy-protecting analytic approaches without centralized pooling of individual-level data, such as distributed regression, are particularly important for vulnerable populations, such as children, but these methods have not yet been tested in multi-center pediatric studies. METHODS: Using the electronic health data from 34 healthcare institutions in the National Patient-Centered Clinical Research Network (PCORnet), we fit 12 multivariable-adjusted linear regression models to assess the associations of antibiotic use <24 months of age with body mass index z-score at 48 to <72 months of age. We ran these models using pooled individual-level data and conventional multivariable-adjusted regression (reference method), as well as using pooled summary-level intermediate statistics and the more privacy-protecting distributed regression technique. We compared the results from these two methods. RESULTS: Pooled individual-level and distributed linear regression analyses showed virtually identical parameter estimates and standard errors. Across all 12 models, the maximum difference in any of the parameter estimates or standard errors was 4.4833 × 10⁻¹⁰. CONCLUSIONS: We demonstrated empirically the feasibility and validity of distributed linear regression analysis using only summary-level information within a large multi-center study of children. This approach could enable expanded opportunities for multi-center pediatric research, especially when sharing of granular individual-level data is challenging.
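The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each site shares only the summary matrices X'X, X'y, y'y and its sample size, which is one common way to run distributed linear regression. The data are synthetic, and the function names (`site_summaries`, `invert`, `fit_from_summaries`) are hypothetical.

```python
import random

def site_summaries(X, y):
    """Computed locally at one site; only X'X, X'y, y'y and n leave the site."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    yty = sum(yi * yi for yi in y)
    return xtx, xty, yty, len(y)

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

def fit_from_summaries(summaries):
    """Add up per-site summaries, then return (coefficients, standard errors)."""
    p = len(summaries[0][1])
    xtx = [[sum(s[0][i][j] for s in summaries) for j in range(p)] for i in range(p)]
    xty = [sum(s[1][i] for s in summaries) for i in range(p)]
    yty = sum(s[2] for s in summaries)
    n = sum(s[3] for s in summaries)
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    # At the least-squares solution, RSS = y'y - beta' X'y.
    rss = yty - sum(b * v for b, v in zip(beta, xty))
    sigma2 = rss / (n - p)
    se = [(sigma2 * inv[i][i]) ** 0.5 for i in range(p)]
    return beta, se

# Synthetic data for three hypothetical sites: intercept, binary exposure, one covariate.
random.seed(0)
sites = []
for n in (40, 60, 50):
    X = [[1.0, float(random.random() < 0.5), random.gauss(0, 1)] for _ in range(n)]
    y = [0.1 + 0.05 * r[1] + 0.3 * r[2] + random.gauss(0, 1) for r in X]
    sites.append((X, y))

# Distributed fit: each site contributes only its summaries.
beta_d, se_d = fit_from_summaries([site_summaries(X, y) for X, y in sites])

# Reference fit: individual-level rows pooled into one dataset first.
all_X = [row for X, _ in sites for row in X]
all_y = [yi for _, y in sites for yi in y]
beta_p, se_p = fit_from_summaries([site_summaries(all_X, all_y)])

max_diff = max(abs(a - b) for a, b in zip(beta_d + se_d, beta_p + se_p))
print(max_diff)
```

Because the summary matrices are additive across sites, the two fits differ only by floating-point rounding. That is consistent with the very small maximum difference the abstract reports, though the study's actual software and protocol may differ from this sketch.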
format Online
Article
Text
id pubmed-7113085
institution National Center for Biotechnology Information
language English
publishDate 2019
record_format MEDLINE/PubMed
spelling pubmed-7113085 2020-05-04 Pediatr Res Article
2019-10-02 2020-05 /pmc/articles/PMC7113085/ /pubmed/31578038 http://dx.doi.org/10.1038/s41390-019-0596-0 Text en http://www.nature.com/authors/editorial_policies/license.html#terms Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
title Privacy-protecting multivariable-adjusted distributed regression analysis for multi-center pediatric study
topic Article