FAIRness through automation: development of an automated medical data integration infrastructure for FAIR health data in a maximum care university hospital
BACKGROUND: Secondary use of routine medical data is key to large-scale clinical and health services research. In a maximum care hospital, the volume of data generated exceeds the limits of big data on a daily basis. This so-called “real world data” are essential to complement knowledge and results...
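Main Authors: | Parciak, Marcel; Suhr, Markus; Schmidt, Christian; Bönisch, Caroline; Löhnhardt, Benjamin; Kesztyüs, Dorothea; Kesztyüs, Tibor |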
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | Database |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10186636/ https://www.ncbi.nlm.nih.gov/pubmed/37189148 http://dx.doi.org/10.1186/s12911-023-02195-3 |
_version_ | 1785042600782200832 |
---|---|
author | Parciak, Marcel Suhr, Markus Schmidt, Christian Bönisch, Caroline Löhnhardt, Benjamin Kesztyüs, Dorothea Kesztyüs, Tibor |
author_sort | Parciak, Marcel |
collection | PubMed |
description | BACKGROUND: Secondary use of routine medical data is key to large-scale clinical and health services research. In a maximum care hospital, the volume of data generated exceeds the limits of big data on a daily basis. This so-called “real world data” are essential to complement knowledge and results from clinical trials. Furthermore, big data may help in establishing precision medicine. However, manual data extraction and annotation workflows to transfer routine data into research data would be complex and inefficient. Generally, best practices for managing research data focus on data output rather than the entire data journey from primary sources to analysis. To eventually make routinely collected data usable and available for research, many hurdles have to be overcome. In this work, we present the implementation of an automated framework for timely processing of clinical care data including free texts and genetic data (non-structured data) and centralized storage as Findable, Accessible, Interoperable, Reusable (FAIR) research data in a maximum care university hospital. METHODS: We identify data processing workflows necessary to operate a medical research data service unit in a maximum care hospital. We decompose structurally equal tasks into elementary sub-processes and propose a framework for general data processing. We base our processes on open-source software-components and, where necessary, custom-built generic tools. RESULTS: We demonstrate the application of our proposed framework in practice by describing its use in our Medical Data Integration Center (MeDIC). Our microservices-based and fully open-source data processing automation framework incorporates a complete recording of data management and manipulation activities. The prototype implementation also includes a metadata schema for data provenance and a process validation concept. All requirements of a MeDIC are orchestrated within the proposed framework: Data input from many heterogeneous sources, pseudonymization and harmonization, integration in a data warehouse and finally possibilities for extraction or aggregation of data for research purposes according to data protection requirements. CONCLUSION: Though the framework is not a panacea for bringing routine-based research data into compliance with FAIR principles, it provides a much-needed possibility to process data in a fully automated, traceable, and reproducible manner. |
format | Online Article Text |
id | pubmed-10186636 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10186636 2023-05-17 FAIRness through automation: development of an automated medical data integration infrastructure for FAIR health data in a maximum care university hospital Parciak, Marcel Suhr, Markus Schmidt, Christian Bönisch, Caroline Löhnhardt, Benjamin Kesztyüs, Dorothea Kesztyüs, Tibor BMC Med Inform Decis Mak Database BioMed Central 2023-05-15 /pmc/articles/PMC10186636/ /pubmed/37189148 http://dx.doi.org/10.1186/s12911-023-02195-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | FAIRness through automation: development of an automated medical data integration infrastructure for FAIR health data in a maximum care university hospital |
topic | Database |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10186636/ https://www.ncbi.nlm.nih.gov/pubmed/37189148 http://dx.doi.org/10.1186/s12911-023-02195-3 |