
Challenges and best practices for digital unstructured data enrichment in health research: A systematic narrative review

Digital data play an increasingly important role in advancing health research and care. However, most digital data in healthcare are in an unstructured and often not readily accessible format for research. Unstructured data are often found in a format that lacks standardization and needs significant...


Bibliographic Details
Main authors: Sedlakova, Jana, Daniore, Paola, Horn Wintsch, Andrea, Wolf, Markus, Stanikic, Mina, Haag, Christina, Sieber, Chloé, Schneider, Gerold, Staub, Kaspar, Alois Ettlin, Dominik, Grübner, Oliver, Rinaldi, Fabio, von Wyl, Viktor
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10566734/
https://www.ncbi.nlm.nih.gov/pubmed/37819910
http://dx.doi.org/10.1371/journal.pdig.0000347
author Sedlakova, Jana
Daniore, Paola
Horn Wintsch, Andrea
Wolf, Markus
Stanikic, Mina
Haag, Christina
Sieber, Chloé
Schneider, Gerold
Staub, Kaspar
Alois Ettlin, Dominik
Grübner, Oliver
Rinaldi, Fabio
von Wyl, Viktor
author_sort Sedlakova, Jana
collection PubMed
description Digital data play an increasingly important role in advancing health research and care. However, most digital data in healthcare are in an unstructured format that is often not readily accessible for research. Unstructured data often lack standardization and need significant preprocessing and feature extraction efforts. This poses challenges when combining such data with other data sources to enhance the existing knowledge base, which we refer to as digital unstructured data enrichment. These methodological challenges require significant resources to overcome and may limit the ability to fully leverage the potential of such data for advancing health research and, ultimately, prevention and patient care delivery. While prevalent challenges associated with unstructured data use in health research are widely reported across the literature, a comprehensive interdisciplinary summary of such challenges and possible solutions to facilitate their use in combination with structured data sources is missing. In this study, we report findings from a systematic narrative review on the seven most prevalent challenge areas connected with digital unstructured data enrichment in the fields of cardiology, neurology and mental health, along with possible solutions to address these challenges. Based on these findings, we developed a checklist that follows the standard data flow in health research studies. This checklist aims to provide initial systematic guidance to inform early planning and feasibility assessments for health research studies aiming to combine unstructured data with existing data sources. Overall, the generality of reported unstructured data enrichment methods in the studies included in this review calls for more systematic reporting of such methods to achieve greater reproducibility in future studies.
format Online
Article
Text
id pubmed-10566734
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10566734 2023-10-12 PLOS Digit Health Research Article Public Library of Science 2023-10-11 /pmc/articles/PMC10566734/ /pubmed/37819910 http://dx.doi.org/10.1371/journal.pdig.0000347 Text en © 2023 Sedlakova et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Challenges and best practices for digital unstructured data enrichment in health research: A systematic narrative review
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10566734/
https://www.ncbi.nlm.nih.gov/pubmed/37819910
http://dx.doi.org/10.1371/journal.pdig.0000347