Development of a data utility framework to support effective health data curation

OBJECTIVES: The value of healthcare data is being increasingly recognised, including the need to improve health dataset utility. There is no established mechanism for evaluating healthcare dataset utility making it difficult to evaluate the effectiveness of activities improving the data. To describe...

Bibliographic Details
Main authors: Gordon, Ben; Barrett, Jake; Fennessy, Clara; Cake, Caroline; Milward, Adam; Irwin, Courtney; Jones, Monica; Sebire, Neil
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2021
Subjects: Original Research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8117992/
https://www.ncbi.nlm.nih.gov/pubmed/33980500
http://dx.doi.org/10.1136/bmjhci-2020-100303
author Gordon, Ben
Barrett, Jake
Fennessy, Clara
Cake, Caroline
Milward, Adam
Irwin, Courtney
Jones, Monica
Sebire, Neil
author_sort Gordon, Ben
collection PubMed
description OBJECTIVES: The value of healthcare data is increasingly recognised, as is the need to improve health dataset utility. There is no established mechanism for evaluating healthcare dataset utility, making it difficult to evaluate the effectiveness of activities that improve the data. This study describes the method for generating, and involving the user community in developing, a proposed framework for evaluating and communicating healthcare dataset utility for given research areas. METHODS: An initial version of a matrix to review datasets across a range of dimensions was developed based on previously published findings regarding healthcare data. This was used to initiate a design process through interviews and surveys with data users representing a broad range of user types and use cases, to help develop a focused framework for characterising datasets. RESULTS: Following 21 interviews, 31 survey responses and testing on 43 datasets, five major categories and 13 subcategories were identified as useful for a dataset, including Data Model, Completeness and Linkage. Each subcategory was graded to facilitate rapid and reproducible evaluation of dataset utility for specific use cases. Testing of applicability to >40 existing datasets demonstrated potential usefulness for subsequent evaluation in real-world practice. DISCUSSION: The research has developed an evidence-based initial approach for a framework to understand the utility of a healthcare dataset. It is likely to require further refinement following wider application, and additional categories may be required. CONCLUSION: The process has resulted in a framework, developed through user-centred design, for objectively evaluating the likely utility of specific healthcare datasets. It should therefore be of value both to potential users of health data and to data custodians seeking to identify the areas where data curation investment provides the most value.
format Online
Article
Text
id pubmed-8117992
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-8117992 2021-05-26 Development of a data utility framework to support effective health data curation Gordon, Ben Barrett, Jake Fennessy, Clara Cake, Caroline Milward, Adam Irwin, Courtney Jones, Monica Sebire, Neil BMJ Health Care Inform Original Research BMJ Publishing Group 2021-05-12 /pmc/articles/PMC8117992/ /pubmed/33980500 http://dx.doi.org/10.1136/bmjhci-2020-100303 Text en © Author(s) (or their employer(s)) 2021. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: https://creativecommons.org/licenses/by-nc/4.0/
title Development of a data utility framework to support effective health data curation
topic Original Research