
GUDM: Automatic Generation of Unified Datasets for Learning and Reasoning in Healthcare

A wide array of biomedical data are generated and made available to healthcare experts. However, due to the diverse nature of data, it is difficult to predict outcomes from it. It is therefore necessary to combine these diverse data sources into a single unified dataset. This paper proposes a global...


Bibliographic Details
Main Authors: Ali, Rahman, Siddiqi, Muhammad Hameed, Idris, Muhammad, Ali, Taqdir, Hussain, Shujaat, Huh, Eui-Nam, Kang, Byeong Ho, Lee, Sungyoung
Format: Online Article Text
Language: English
Published: MDPI 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4541854/
https://www.ncbi.nlm.nih.gov/pubmed/26147731
http://dx.doi.org/10.3390/s150715772
_version_ 1782386448246243328
author Ali, Rahman
Siddiqi, Muhammad Hameed
Idris, Muhammad
Ali, Taqdir
Hussain, Shujaat
Huh, Eui-Nam
Kang, Byeong Ho
Lee, Sungyoung
collection PubMed
description A wide array of biomedical data is generated and made available to healthcare experts. However, because the data are so diverse, it is difficult to predict outcomes from them. It is therefore necessary to combine these diverse data sources into a single unified dataset. This paper proposes a global unified data model (GUDM) that provides a global unified data structure for all data sources and generates a unified dataset with a “data modeler” tool. The proposed tool implements a user-centric, priority-based approach that can easily resolve the problems of unified data modeling and overlapping attributes across multiple datasets. The tool is illustrated using sample diabetes mellitus data. The diverse data sources used to generate the unified dataset for diabetes mellitus include clinical trial information, a social media interaction dataset, and physical activity data collected using different sensors. To demonstrate the value of the unified dataset, we adopted a well-known rough set theory based rule-creation process to create rules from it. An evaluation of the tool on six different sets of locally created diverse datasets shows that the tool, on average, reduces the time and effort required of the experts and the knowledge engineer by 94.1% when creating unified datasets.
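The priority-based handling of overlapping attributes mentioned in the description can be illustrated with a minimal sketch. This is not the GUDM tool or the authors' algorithm, which this record does not detail. It only assumes that the user ranks the sources and that, when two sources supply the same attribute for the same patient, the value from the higher-ranked source wins. The function name `merge_by_priority`, the source names, and all sample values are hypothetical.

```python
from typing import Any, Dict, List

# One record maps attribute names to values; None marks a missing value.
Record = Dict[str, Any]


def merge_by_priority(
    sources: Dict[str, Dict[str, Record]],
    priority: List[str],
) -> Dict[str, Record]:
    """Merge per-patient records from several sources into one dataset.

    sources:  source name -> {patient id -> {attribute -> value}}
    priority: source names ordered from highest to lowest priority
              (an assumed, user-supplied ranking).
    """
    unified: Dict[str, Record] = {}
    # Visit sources from highest to lowest priority, so a lower-priority
    # source can only fill attributes that are still missing.
    for name in priority:
        for patient_id, record in sources.get(name, {}).items():
            target = unified.setdefault(patient_id, {})
            for attribute, value in record.items():
                if value is None:
                    # Remember the attribute exists, but leave it open.
                    target.setdefault(attribute, None)
                elif target.get(attribute) is None:
                    target[attribute] = value
    return unified


# Hypothetical sample data: "weight_kg" overlaps between the two sources.
sources = {
    "clinical": {
        "p1": {"weight_kg": 82.0, "hba1c": 7.1},
        "p2": {"weight_kg": None, "hba1c": 6.4},
    },
    "sensor": {
        "p1": {"weight_kg": 80.5, "steps": 4200},
        "p2": {"weight_kg": 75.0, "steps": 9100},
    },
}

unified = merge_by_priority(sources, priority=["clinical", "sensor"])
print(unified["p1"])  # clinical weight is kept for p1
print(unified["p2"])  # sensor weight fills the gap for p2
```

In this sketch the ranking is a single global list. A per-attribute ranking would be a natural variation, but whether the tool described here works that way is not stated in this record.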
format Online
Article
Text
id pubmed-4541854
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-4541854 2015-08-26 GUDM: Automatic Generation of Unified Datasets for Learning and Reasoning in Healthcare Ali, Rahman Siddiqi, Muhammad Hameed Idris, Muhammad Ali, Taqdir Hussain, Shujaat Huh, Eui-Nam Kang, Byeong Ho Lee, Sungyoung Sensors (Basel) Article MDPI 2015-07-02 /pmc/articles/PMC4541854/ /pubmed/26147731 http://dx.doi.org/10.3390/s150715772 Text en © 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
title GUDM: Automatic Generation of Unified Datasets for Learning and Reasoning in Healthcare
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4541854/
https://www.ncbi.nlm.nih.gov/pubmed/26147731
http://dx.doi.org/10.3390/s150715772