Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies

OBJECTIVES: We reviewed digital epidemiological studies to characterize how researchers are using digital data by topic domain, study purpose, data source, and analytic method. METHODS: We reviewed research articles published within the last decade that used digital data to answer epidemiological re...

Bibliographic Details
Main authors: Park, Hyeoun-Ae, Jung, Hyesil, On, Jeongah, Park, Seul Ki, Kang, Hannah
Format: Online Article Text
Language: English
Published: Korean Society of Medical Informatics 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6230537/
https://www.ncbi.nlm.nih.gov/pubmed/30443413
http://dx.doi.org/10.4258/hir.2018.24.4.253
_version_ 1783370093608042496
author Park, Hyeoun-Ae
Jung, Hyesil
On, Jeongah
Park, Seul Ki
Kang, Hannah
author_facet Park, Hyeoun-Ae
Jung, Hyesil
On, Jeongah
Park, Seul Ki
Kang, Hannah
author_sort Park, Hyeoun-Ae
collection PubMed
description OBJECTIVES: We reviewed digital epidemiological studies to characterize how researchers are using digital data by topic domain, study purpose, data source, and analytic method. METHODS: We reviewed research articles published within the last decade that used digital data to answer epidemiological research questions. Data were abstracted from these articles using a data collection tool that we developed. Finally, we summarized the characteristics of the digital epidemiological studies. RESULTS: We identified six main topic domains: infectious diseases (58.7%), non-communicable diseases (29.4%), mental health and substance use (8.3%), general population behavior (4.6%), environmental, dietary, and lifestyle (4.6%), and vital status (0.9%). We identified four categories for the study purpose: description (22.9%), exploration (34.9%), explanation (27.5%), and prediction and control (14.7%). We identified eight categories for the data sources: web search query (52.3%), social media posts (31.2%), web portal posts (11.9%), webpage access logs (7.3%), images (7.3%), mobile phone network data (1.8%), global positioning system data (1.8%), and others (2.8%). Of these, 50.5% used correlation analyses, 41.3% regression analyses, 25.6% machine learning, and 19.3% descriptive analyses. CONCLUSIONS: Digital data collected for non-epidemiological purposes are being used to study health phenomena in a variety of topic domains. Digital epidemiology requires access to large datasets and advanced analytics. Ensuring open access is clearly at odds with the desire to have as little personal data as possible in these large datasets to protect privacy. Establishment of data cooperatives with restricted access may be a solution to this dilemma.
format Online
Article
Text
id pubmed-6230537
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Korean Society of Medical Informatics
record_format MEDLINE/PubMed
spelling pubmed-62305372018-11-15 Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies Park, Hyeoun-Ae Jung, Hyesil On, Jeongah Park, Seul Ki Kang, Hannah Healthc Inform Res Review Article OBJECTIVES: We reviewed digital epidemiological studies to characterize how researchers are using digital data by topic domain, study purpose, data source, and analytic method. METHODS: We reviewed research articles published within the last decade that used digital data to answer epidemiological research questions. Data were abstracted from these articles using a data collection tool that we developed. Finally, we summarized the characteristics of the digital epidemiological studies. RESULTS: We identified six main topic domains: infectious diseases (58.7%), non-communicable diseases (29.4%), mental health and substance use (8.3%), general population behavior (4.6%), environmental, dietary, and lifestyle (4.6%), and vital status (0.9%). We identified four categories for the study purpose: description (22.9%), exploration (34.9%), explanation (27.5%), and prediction and control (14.7%). We identified eight categories for the data sources: web search query (52.3%), social media posts (31.2%), web portal posts (11.9%), webpage access logs (7.3%), images (7.3%), mobile phone network data (1.8%), global positioning system data (1.8%), and others (2.8%). Of these, 50.5% used correlation analyses, 41.3% regression analyses, 25.6% machine learning, and 19.3% descriptive analyses. CONCLUSIONS: Digital data collected for non-epidemiological purposes are being used to study health phenomena in a variety of topic domains. Digital epidemiology requires access to large datasets and advanced analytics. Ensuring open access is clearly at odds with the desire to have as little personal data as possible in these large datasets to protect privacy. Establishment of data cooperatives with restricted access may be a solution to this dilemma. 
Korean Society of Medical Informatics 2018-10 2018-10-31 /pmc/articles/PMC6230537/ /pubmed/30443413 http://dx.doi.org/10.4258/hir.2018.24.4.253 Text en © 2018 The Korean Society of Medical Informatics http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Review Article
Park, Hyeoun-Ae
Jung, Hyesil
On, Jeongah
Park, Seul Ki
Kang, Hannah
Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies
title Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies
title_full Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies
title_fullStr Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies
title_full_unstemmed Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies
title_short Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies
title_sort digital epidemiology: use of digital data collected for non-epidemiological purposes in epidemiological studies
topic Review Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6230537/
https://www.ncbi.nlm.nih.gov/pubmed/30443413
http://dx.doi.org/10.4258/hir.2018.24.4.253
work_keys_str_mv AT parkhyeounae digitalepidemiologyuseofdigitaldatacollectedfornonepidemiologicalpurposesinepidemiologicalstudies
AT junghyesil digitalepidemiologyuseofdigitaldatacollectedfornonepidemiologicalpurposesinepidemiologicalstudies
AT onjeongah digitalepidemiologyuseofdigitaldatacollectedfornonepidemiologicalpurposesinepidemiologicalstudies
AT parkseulki digitalepidemiologyuseofdigitaldatacollectedfornonepidemiologicalpurposesinepidemiologicalstudies
AT kanghannah digitalepidemiologyuseofdigitaldatacollectedfornonepidemiologicalpurposesinepidemiologicalstudies