Linking a Consortium-Wide Data Quality Assessment Tool with the MIRACUM Metadata Repository

Background  Many research initiatives aim at using data from electronic health records (EHRs) in observational studies. Participating sites of the German Medical Informatics Initiative (MII) established data integration centers to integrate EHR data within research data repositories to support local...

Bibliographic Details
Main authors: Kapsner, Lorenz A., Mang, Jonathan M., Mate, Sebastian, Seuchter, Susanne A., Vengadeswaran, Abishaa, Bathelt, Franziska, Deppenwiese, Noemi, Kadioglu, Dennis, Kraska, Detlef, Prokosch, Hans-Ulrich
Format: Online Article Text
Language: English
Published: Georg Thieme Verlag KG 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8387126/
https://www.ncbi.nlm.nih.gov/pubmed/34433217
http://dx.doi.org/10.1055/s-0041-1733847
_version_ 1783742391835951104
author Kapsner, Lorenz A.
Mang, Jonathan M.
Mate, Sebastian
Seuchter, Susanne A.
Vengadeswaran, Abishaa
Bathelt, Franziska
Deppenwiese, Noemi
Kadioglu, Dennis
Kraska, Detlef
Prokosch, Hans-Ulrich
collection PubMed
description Background: Many research initiatives aim to use data from electronic health records (EHRs) in observational studies. Participating sites of the German Medical Informatics Initiative (MII) established data integration centers to integrate EHR data within research data repositories to support local and federated analyses. To address concerns regarding possible data quality (DQ) issues of hospital routine data compared with data specifically collected for scientific purposes, we have previously presented a data quality assessment (DQA) tool providing a standardized approach to assess DQ of the research data repositories at the MIRACUM consortium's partner sites.
Objectives: Major limitations of the former approach included manual interpretation of the results and hard coding of analyses, making their expansion to new data elements and databases time-consuming and error-prone. Here we present an enhanced version of the DQA tool by linking it to common data element definitions stored in a metadata repository (MDR), adopting the harmonized DQA framework from Kahn et al. and its application within the MIRACUM consortium.
Methods: Data quality checks were consequently aligned to a harmonized DQA terminology. Database-specific information was systematically identified and represented in an MDR. Furthermore, a structured representation of logical relations between data elements was developed to model plausibility statements in the MDR.
Results: The MIRACUM DQA tool was linked to data element definitions stored in a consortium-wide MDR. Additional databases used within MIRACUM were linked to the DQ checks by extending the respective data elements in the MDR with the required information. The evaluation of DQ checks was automated. An adaptable software implementation is provided with the R package DQAstats.
Conclusion: The enhancements of the DQA tool facilitate the future integration of new data elements and make the tool scalable to other databases and data models. It has been provided to all ten MIRACUM partners and was successfully deployed and integrated into their respective data integration center infrastructure.
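The idea described in the abstract, driving data quality checks from metadata instead of hard-coding them, can be illustrated with a small sketch. This is not the DQAstats implementation (which is an R package) and does not reflect the actual MIRACUM MDR schema. The class names, field names, value sets, and the example plausibility rule below are all hypothetical and chosen only to show the principle.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ElementDefinition:
    """Hypothetical metadata entry describing one data element."""
    name: str
    allowed_values: Optional[frozenset] = None   # categorical constraint
    value_range: Optional[tuple] = None          # (low, high) numeric constraint


@dataclass
class PlausibilityRule:
    """Hypothetical rule: whenever `condition` holds, `expectation` must hold."""
    description: str
    condition: Callable[[dict], bool]
    expectation: Callable[[dict], bool]


def check_conformance(records: list, definition: ElementDefinition) -> list:
    """Return indices of records whose value violates the element definition."""
    violations = []
    for i, rec in enumerate(records):
        value = rec.get(definition.name)
        if value is None:
            continue  # missing values (completeness) are out of scope here
        if definition.allowed_values is not None:
            if value not in definition.allowed_values:
                violations.append(i)
        elif definition.value_range is not None:
            low, high = definition.value_range
            if not (low <= value <= high):
                violations.append(i)
    return violations


def check_plausibility(records: list, rule: PlausibilityRule) -> list:
    """Return indices of records where the condition holds but the expectation fails."""
    return [
        i for i, rec in enumerate(records)
        if rule.condition(rec) and not rule.expectation(rec)
    ]


# Invented example metadata and data, for illustration only.
SEX = ElementDefinition(name="sex", allowed_values=frozenset({"f", "m", "d"}))
AGE = ElementDefinition(name="age", value_range=(0, 130))
RULE = PlausibilityRule(
    description="Obstetric diagnosis codes (ICD-10 chapter O) imply sex 'f'",
    condition=lambda r: str(r.get("diagnosis", "")).startswith("O"),
    expectation=lambda r: r.get("sex") == "f",
)

records = [
    {"sex": "f", "age": 34, "diagnosis": "O80"},
    {"sex": "m", "age": 51, "diagnosis": "O80"},
    {"sex": "x", "age": 230, "diagnosis": "I10"},
]

sex_violations = check_conformance(records, SEX)
age_violations = check_conformance(records, AGE)
rule_violations = check_plausibility(records, RULE)
```

In this sketch the checking functions never change when a new data element or rule is added. Only the metadata objects are extended, which is the property the abstract attributes to linking the tool with an MDR.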
format Online
Article
Text
id pubmed-8387126
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Georg Thieme Verlag KG
record_format MEDLINE/PubMed
spelling pubmed-8387126 2021-09-15 Appl Clin Inform Georg Thieme Verlag KG 2021-08 2021-08-25 /pmc/articles/PMC8387126/ /pubmed/34433217 http://dx.doi.org/10.1055/s-0041-1733847 Text en The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. ( https://creativecommons.org/licenses/by-nc-nd/4.0/ ) This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License, which permits unrestricted reproduction and distribution, for non-commercial purposes only; and use and reproduction, but not distribution, of adapted material for non-commercial purposes only, provided the original work is properly cited.
title Linking a Consortium-Wide Data Quality Assessment Tool with the MIRACUM Metadata Repository
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8387126/
https://www.ncbi.nlm.nih.gov/pubmed/34433217
http://dx.doi.org/10.1055/s-0041-1733847