
An enhanced version of the PHIRI infrastructure: improving the technological solutions

The proof of concept tested by PHIRI consisted of the development of several research questions in multiple data hubs using a federated approach. It was possible to embed the use cases’ analytical pipelines in a portable standalone (i.e. docker image) and distribute it in different health data hubs...


Bibliographic Details
Main author: Derycke, P
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9594372/
http://dx.doi.org/10.1093/eurpub/ckac129.469
_version_ 1784815399796211712
author Derycke, P
author_facet Derycke, P
author_sort Derycke, P
collection PubMed
description The proof of concept tested by PHIRI consisted of developing several research questions across multiple data hubs using a federated approach. The use cases’ analytical pipelines were embedded in a portable standalone package (i.e. a Docker image) and distributed to different health data hubs and technological environments for execution. The tested solution has the advantage of not moving sensitive data out of the silos, thus protecting privacy: the code meets the data and not the opposite. Some valuable lessons provide guidance on how to further develop the PHIRI infrastructure. 1) Deep knowledge of what data is available in the different data hubs of a federation is key, since the basis for developing a research query is the construction of a data model common to all the nodes in the federation. In an eventual enhanced PHIRI infrastructure, a solution would be to implement a semantic information system that allows the exchange of metadata through federated and interoperable metadata catalogues based on semantic RDF graph databases, compliant with the W3C DCAT metadata standard and exposing SPARQL query endpoints, SPARQL being the query language of the Web of linked data. 2) Making training samples that mimic real-world data available within the Docker image has been of high added value for developing the use cases’ analytical pipelines. In an eventual enhanced PHIRI infrastructure, a generalisation could consist of setting up a “knowledge hub” where synthetic data, twinning the population data, would allow expert users to search and find data through federated queries and to prepare and train their analytical pipelines; the “knowledge hub” would provide a computational environment (e.g. Jupyter as a service playground), the necessary tools (i.e. cookbooks and capacity building services) and training samples to answer research questions, with the advantage of using data that is anonymous by nature and open access.
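The "code meets data" idea in the description can be illustrated with a minimal Python sketch. It is not the PHIRI implementation: the hub names, the sample values and the functions `run_local_pipeline` and `pool_summaries` are hypothetical, and a pooled mean stands in for whatever analysis a real use case would run. The point it shows is that each hub runs the same pipeline on its own records and returns only aggregates, so no individual-level data leaves the silo.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LocalSummary:
    """Aggregate returned by one data hub; it holds no individual records."""
    hub: str
    count: int
    total: float


def run_local_pipeline(hub: str, values: Iterable[float]) -> LocalSummary:
    """Hypothetical pipeline step executed inside a hub on its local data."""
    local_values = list(values)
    return LocalSummary(hub=hub, count=len(local_values), total=sum(local_values))


def pool_summaries(summaries: List[LocalSummary]) -> float:
    """Coordinator step: compute a pooled mean from the aggregates only."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no observations reported by any hub")
    return sum(s.total for s in summaries) / n


if __name__ == "__main__":
    # Made-up values standing in for sensitive records held by each hub.
    hub_a = run_local_pipeline("hub_a", [2.0, 4.0, 6.0])
    hub_b = run_local_pipeline("hub_b", [10.0])
    print(pool_summaries([hub_a, hub_b]))  # prints 5.5
```

In a deployment along the lines the description outlines, `run_local_pipeline` would be the part shipped in the Docker image, while `pool_summaries` would run wherever the hubs' outputs are collected.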
format Online
Article
Text
id pubmed-9594372
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9594372 2022-11-22 An enhanced version of the PHIRI infrastructure: improving the technological solutions Derycke, P Eur J Public Health Parallel Programme Oxford University Press 2022-10-25 /pmc/articles/PMC9594372/ http://dx.doi.org/10.1093/eurpub/ckac129.469 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the European Public Health Association. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Parallel Programme
Derycke, P
An enhanced version of the PHIRI infrastructure: improving the technological solutions
title An enhanced version of the PHIRI infrastructure: improving the technological solutions
topic Parallel Programme
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9594372/
http://dx.doi.org/10.1093/eurpub/ckac129.469