
The Journey of Data Within a Global Data Sharing Initiative: A Federated 3-Layer Data Analysis Pipeline to Scale Up Multiple Sclerosis Research

BACKGROUND: Investigating low-prevalence diseases such as multiple sclerosis is challenging because of the rather small number of individuals affected by this disease and the scattering of real-world data across numerous data sources. These obstacles impair data integration, standardization, and ana...


Bibliographic Details
Main authors: Pirmani, Ashkan, De Brouwer, Edward, Geys, Lotte, Parciak, Tina, Moreau, Yves, Peeters, Liesbet M
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10667980/
https://www.ncbi.nlm.nih.gov/pubmed/37943585
http://dx.doi.org/10.2196/48030
author Pirmani, Ashkan
De Brouwer, Edward
Geys, Lotte
Parciak, Tina
Moreau, Yves
Peeters, Liesbet M
collection PubMed
description BACKGROUND: Investigating low-prevalence diseases such as multiple sclerosis is challenging because of the rather small number of individuals affected by this disease and the scattering of real-world data across numerous data sources. These obstacles impair data integration, standardization, and analysis, which negatively impact the generation of significant meaningful clinical evidence. OBJECTIVE: This study aims to present a comprehensive, research question–agnostic, multistakeholder-driven end-to-end data analysis pipeline that accommodates 3 prevalent data-sharing streams: individual data sharing, core data set sharing, and federated model sharing. METHODS: A demand-driven methodology is employed for standardization, followed by 3 streams of data acquisition, a data quality enhancement process, a data integration procedure, and a concluding analysis stage to fulfill real-world data-sharing requirements. This pipeline’s effectiveness was demonstrated through its successful implementation in the COVID-19 and multiple sclerosis global data sharing initiative. RESULTS: The global data sharing initiative yielded multiple scientific publications and provided extensive worldwide guidance for the community with multiple sclerosis. The pipeline facilitated gathering pertinent data from various sources, accommodating distinct sharing streams and assimilating them into a unified data set for subsequent statistical analysis or secure data examination. This pipeline contributed to the assembly of the largest data set of people with multiple sclerosis infected with COVID-19. CONCLUSIONS: The proposed data analysis pipeline exemplifies the potential of global stakeholder collaboration and underlines the significance of evidence-based decision-making. It serves as a paradigm for how data sharing initiatives can propel advancements in health care, emphasizing its adaptability and capacity to address diverse research inquiries.
format Online
Article
Text
id pubmed-10667980
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-10667980 2023-11-09 The Journey of Data Within a Global Data Sharing Initiative: A Federated 3-Layer Data Analysis Pipeline to Scale Up Multiple Sclerosis Research Pirmani, Ashkan De Brouwer, Edward Geys, Lotte Parciak, Tina Moreau, Yves Peeters, Liesbet M JMIR Med Inform Original Paper JMIR Publications 2023-11-09 /pmc/articles/PMC10667980/ /pubmed/37943585 http://dx.doi.org/10.2196/48030 Text en © Ashkan Pirmani, Edward De Brouwer, Lotte Geys, Tina Parciak, Yves Moreau, Liesbet M Peeters. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 09.11.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
title The Journey of Data Within a Global Data Sharing Initiative: A Federated 3-Layer Data Analysis Pipeline to Scale Up Multiple Sclerosis Research
topic Original Paper