Identifying Data Quality Dimensions for Person-Generated Wearable Device Data: Multi-Method Study
BACKGROUND: There is a growing interest in using person-generated wearable device data for biomedical research, but there are also concerns regarding the quality of data such as missing or incorrect data. This emphasizes the importance of assessing data quality before conducting research. In order t...
Main authors: | Cho, Sylvia, Weng, Chunhua, Kahn, Michael G, Natarajan, Karthik |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
JMIR Publications
2021
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8738984/ https://www.ncbi.nlm.nih.gov/pubmed/34941540 http://dx.doi.org/10.2196/31618 |
_version_ | 1784629019669430272 |
---|---|
author | Cho, Sylvia Weng, Chunhua Kahn, Michael G Natarajan, Karthik |
author_facet | Cho, Sylvia Weng, Chunhua Kahn, Michael G Natarajan, Karthik |
author_sort | Cho, Sylvia |
collection | PubMed |
description | BACKGROUND: There is a growing interest in using person-generated wearable device data for biomedical research, but there are also concerns regarding the quality of data such as missing or incorrect data. This emphasizes the importance of assessing data quality before conducting research. In order to perform data quality assessments, it is essential to define what data quality means for person-generated wearable device data by identifying the data quality dimensions. OBJECTIVE: This study aims to identify data quality dimensions for person-generated wearable device data for research purposes. METHODS: This study was conducted in 3 phases: literature review, survey, and focus group discussion. The literature review was conducted following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guideline to identify factors affecting data quality and its associated data quality challenges. In addition, we conducted a survey to confirm and complement results from the literature review and to understand researchers’ perceptions on data quality dimensions that were previously identified as dimensions for the secondary use of electronic health record (EHR) data. We sent the survey to researchers with experience in analyzing wearable device data. Focus group discussion sessions were conducted with domain experts to derive data quality dimensions for person-generated wearable device data. On the basis of the results from the literature review and survey, a facilitator proposed potential data quality dimensions relevant to person-generated wearable device data, and the domain experts accepted or rejected the suggested dimensions. RESULTS: In total, 19 studies were included in the literature review, and 3 major themes emerged: device- and technical-related, user-related, and data governance–related factors. The associated data quality problems were incomplete data, incorrect data, and heterogeneous data. 
A total of 20 respondents answered the survey. The major data quality challenges faced by researchers were completeness, accuracy, and plausibility. The importance ratings on data quality dimensions in an existing framework showed that the dimensions for secondary use of EHR data are applicable to person-generated wearable device data. There were 3 focus group sessions with domain experts in data quality and wearable device research. The experts concluded that intrinsic data quality features, such as conformance, completeness, and plausibility, and contextual and fitness-for-use data quality features, such as completeness (breadth and density) and temporal data granularity, are important data quality dimensions for assessing person-generated wearable device data for research purposes. CONCLUSIONS: In this study, intrinsic and contextual and fitness-for-use data quality dimensions for person-generated wearable device data were identified. The dimensions were adapted from data quality terminologies and frameworks for the secondary use of EHR data with a few modifications. Further research on how data quality can be assessed with respect to each dimension is needed. |
format | Online Article Text |
id | pubmed-8738984 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-87389842022-01-21 Identifying Data Quality Dimensions for Person-Generated Wearable Device Data: Multi-Method Study Cho, Sylvia Weng, Chunhua Kahn, Michael G Natarajan, Karthik JMIR Mhealth Uhealth Original Paper JMIR Publications 2021-12-23 /pmc/articles/PMC8738984/ /pubmed/34941540 http://dx.doi.org/10.2196/31618 Text en ©Sylvia Cho, Chunhua Weng, Michael G Kahn, Karthik Natarajan. Originally published in JMIR mHealth and uHealth (https://mhealth.jmir.org), 23.12.2021.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR mHealth and uHealth, is properly cited. The complete bibliographic information, a link to the original publication on https://mhealth.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Cho, Sylvia Weng, Chunhua Kahn, Michael G Natarajan, Karthik Identifying Data Quality Dimensions for Person-Generated Wearable Device Data: Multi-Method Study |
title | Identifying Data Quality Dimensions for Person-Generated Wearable Device Data: Multi-Method Study |
title_full | Identifying Data Quality Dimensions for Person-Generated Wearable Device Data: Multi-Method Study |
title_fullStr | Identifying Data Quality Dimensions for Person-Generated Wearable Device Data: Multi-Method Study |
title_full_unstemmed | Identifying Data Quality Dimensions for Person-Generated Wearable Device Data: Multi-Method Study |
title_short | Identifying Data Quality Dimensions for Person-Generated Wearable Device Data: Multi-Method Study |
title_sort | identifying data quality dimensions for person-generated wearable device data: multi-method study |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8738984/ https://www.ncbi.nlm.nih.gov/pubmed/34941540 http://dx.doi.org/10.2196/31618 |
work_keys_str_mv | AT chosylvia identifyingdataqualitydimensionsforpersongeneratedwearabledevicedatamultimethodstudy AT wengchunhua identifyingdataqualitydimensionsforpersongeneratedwearabledevicedatamultimethodstudy AT kahnmichaelg identifyingdataqualitydimensionsforpersongeneratedwearabledevicedatamultimethodstudy AT natarajankarthik identifyingdataqualitydimensionsforpersongeneratedwearabledevicedatamultimethodstudy |