
Data harmonization and federated analysis of population-based studies: the BioSHaRE project

ABSTRACTS: BACKGROUND: Individual-level data pooling of large population-based studies across research centres in international research projects faces many hurdles. The BioSHaRE (Biobank Standardisation and Harmonisation for Research Excellence in the European Union) project aims to address these i...


Bibliographic Details
Main authors: Doiron, Dany; Burton, Paul; Marcon, Yannick; Gaye, Amadou; Wolffenbuttel, Bruce H R; Perola, Markus; Stolk, Ronald P; Foco, Luisa; Minelli, Cosetta; Waldenberger, Melanie; Holle, Rolf; Kvaløy, Kirsti; Hillege, Hans L; Tassé, Anne-Marie; Ferretti, Vincent; Fortier, Isabel
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4175511/
https://www.ncbi.nlm.nih.gov/pubmed/24257327
http://dx.doi.org/10.1186/1742-7622-10-12
author Doiron, Dany
Burton, Paul
Marcon, Yannick
Gaye, Amadou
Wolffenbuttel, Bruce H R
Perola, Markus
Stolk, Ronald P
Foco, Luisa
Minelli, Cosetta
Waldenberger, Melanie
Holle, Rolf
Kvaløy, Kirsti
Hillege, Hans L
Tassé, Anne-Marie
Ferretti, Vincent
Fortier, Isabel
collection PubMed
description ABSTRACTS: BACKGROUND: Individual-level data pooling of large population-based studies across research centres in international research projects faces many hurdles. The BioSHaRE (Biobank Standardisation and Harmonisation for Research Excellence in the European Union) project aims to address these issues by building a collaborative group of investigators and developing tools for data harmonization, database integration and federated data analyses. METHODS: Eight population-based studies in six European countries were recruited to participate in the BioSHaRE project. Through workshops, teleconferences and electronic communications, participating investigators identified a set of 96 variables targeted for harmonization to answer research questions of interest. Using each study’s questionnaires, standard operating procedures, and data dictionaries, harmonization potential was assessed. Whenever harmonization was deemed possible, processing algorithms were developed and implemented in an open-source software infrastructure to transform study-specific data into the target (i.e. harmonized) format. Harmonized datasets located on servers in each research centre across Europe were interconnected through a federated database system to perform statistical analysis. RESULTS: Retrospective harmonization led to the generation of common format variables for 73% of matches considered (96 targeted variables across 8 studies). Authenticated investigators can now perform complex statistical analyses of harmonized datasets stored on distributed servers without actually sharing individual-level data using the DataSHIELD method. CONCLUSION: New Internet-based networking technologies and database management systems are providing the means to support collaborative, multi-center research in an efficient and secure manner. The results from this pilot project show that, given a strong collaborative relationship between participating studies, it is possible to seamlessly co-analyse internationally harmonized research databases while allowing each study to retain full control over individual-level data. We encourage additional collaborative research networks in epidemiology, public health, and the social sciences to make use of the open-source tools presented herein.
format Online
Article
Text
id pubmed-4175511
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4175511 2014-09-27 Data harmonization and federated analysis of population-based studies: the BioSHaRE project Doiron, Dany; Burton, Paul; Marcon, Yannick; Gaye, Amadou; Wolffenbuttel, Bruce H R; Perola, Markus; Stolk, Ronald P; Foco, Luisa; Minelli, Cosetta; Waldenberger, Melanie; Holle, Rolf; Kvaløy, Kirsti; Hillege, Hans L; Tassé, Anne-Marie; Ferretti, Vincent; Fortier, Isabel Emerg Themes Epidemiol Analytic Perspective BioMed Central 2013-11-21 /pmc/articles/PMC4175511/ /pubmed/24257327 http://dx.doi.org/10.1186/1742-7622-10-12 Text en Copyright © 2013 Doiron et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Data harmonization and federated analysis of population-based studies: the BioSHaRE project
topic Analytic Perspective
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4175511/
https://www.ncbi.nlm.nih.gov/pubmed/24257327
http://dx.doi.org/10.1186/1742-7622-10-12