Data mining in clinical big data: the frequently used databases, steps, and methodological models

Many high quality studies have emerged from public databases, such as Surveillance, Epidemiology, and End Results (SEER), National Health and Nutrition Examination Survey (NHANES), The Cancer Genome Atlas (TCGA), and Medical Information Mart for Intensive Care (MIMIC); however, these data are often...

Bibliographic Details
Main Authors: Wu, Wen-Tao, Li, Yuan-Jie, Feng, Ao-Zi, Li, Li, Huang, Tao, Xu, An-Ding, Lyu, Jun
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8356424/
https://www.ncbi.nlm.nih.gov/pubmed/34380547
http://dx.doi.org/10.1186/s40779-021-00338-z
_version_ 1783736942190395392
author Wu, Wen-Tao
Li, Yuan-Jie
Feng, Ao-Zi
Li, Li
Huang, Tao
Xu, An-Ding
Lyu, Jun
collection PubMed
description Many high quality studies have emerged from public databases, such as Surveillance, Epidemiology, and End Results (SEER), National Health and Nutrition Examination Survey (NHANES), The Cancer Genome Atlas (TCGA), and Medical Information Mart for Intensive Care (MIMIC); however, these data are often characterized by a high degree of dimensional heterogeneity, timeliness, scarcity, irregularity, and other characteristics, resulting in the value of these data not being fully utilized. Data-mining technology has been a frontier field in medical research, as it demonstrates excellent performance in evaluating patient risks and assisting clinical decision-making in building disease-prediction models. Therefore, data mining has unique advantages in clinical big-data research, especially in large-scale medical public databases. This article introduced the main medical public database and described the steps, tasks, and models of data mining in simple language. Additionally, we described data-mining methods along with their practical applications. The goal of this work was to aid clinical researchers in gaining a clear and intuitive understanding of the application of data-mining technology on clinical big-data in order to promote the production of research results that are beneficial to doctors and patients.
format Online
Article
Text
id pubmed-8356424
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-83564242021-08-11 Data mining in clinical big data: the frequently used databases, steps, and methodological models Wu, Wen-Tao Li, Yuan-Jie Feng, Ao-Zi Li, Li Huang, Tao Xu, An-Ding Lyu, Jun Mil Med Res Review Many high quality studies have emerged from public databases, such as Surveillance, Epidemiology, and End Results (SEER), National Health and Nutrition Examination Survey (NHANES), The Cancer Genome Atlas (TCGA), and Medical Information Mart for Intensive Care (MIMIC); however, these data are often characterized by a high degree of dimensional heterogeneity, timeliness, scarcity, irregularity, and other characteristics, resulting in the value of these data not being fully utilized. Data-mining technology has been a frontier field in medical research, as it demonstrates excellent performance in evaluating patient risks and assisting clinical decision-making in building disease-prediction models. Therefore, data mining has unique advantages in clinical big-data research, especially in large-scale medical public databases. This article introduced the main medical public database and described the steps, tasks, and models of data mining in simple language. Additionally, we described data-mining methods along with their practical applications. The goal of this work was to aid clinical researchers in gaining a clear and intuitive understanding of the application of data-mining technology on clinical big-data in order to promote the production of research results that are beneficial to doctors and patients. 
BioMed Central 2021-08-11 /pmc/articles/PMC8356424/ /pubmed/34380547 http://dx.doi.org/10.1186/s40779-021-00338-z Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Data mining in clinical big data: the frequently used databases, steps, and methodological models
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8356424/
https://www.ncbi.nlm.nih.gov/pubmed/34380547
http://dx.doi.org/10.1186/s40779-021-00338-z