Data mining in clinical big data: the frequently used databases, steps, and methodological models
Many high quality studies have emerged from public databases, such as Surveillance, Epidemiology, and End Results (SEER), National Health and Nutrition Examination Survey (NHANES), The Cancer Genome Atlas (TCGA), and Medical Information Mart for Intensive Care (MIMIC); however, these data are often...
Main authors: | Wu, Wen-Tao; Li, Yuan-Jie; Feng, Ao-Zi; Li, Li; Huang, Tao; Xu, An-Ding; Lyu, Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8356424/ https://www.ncbi.nlm.nih.gov/pubmed/34380547 http://dx.doi.org/10.1186/s40779-021-00338-z |
_version_ | 1783736942190395392 |
---|---|
author | Wu, Wen-Tao Li, Yuan-Jie Feng, Ao-Zi Li, Li Huang, Tao Xu, An-Ding Lyu, Jun |
author_sort | Wu, Wen-Tao |
collection | PubMed |
description | Many high quality studies have emerged from public databases, such as Surveillance, Epidemiology, and End Results (SEER), National Health and Nutrition Examination Survey (NHANES), The Cancer Genome Atlas (TCGA), and Medical Information Mart for Intensive Care (MIMIC); however, these data are often characterized by a high degree of dimensional heterogeneity, timeliness, scarcity, irregularity, and other characteristics, resulting in the value of these data not being fully utilized. Data-mining technology has been a frontier field in medical research, as it demonstrates excellent performance in evaluating patient risks and assisting clinical decision-making in building disease-prediction models. Therefore, data mining has unique advantages in clinical big-data research, especially in large-scale medical public databases. This article introduced the main medical public database and described the steps, tasks, and models of data mining in simple language. Additionally, we described data-mining methods along with their practical applications. The goal of this work was to aid clinical researchers in gaining a clear and intuitive understanding of the application of data-mining technology on clinical big-data in order to promote the production of research results that are beneficial to doctors and patients. |
format | Online Article Text |
id | pubmed-8356424 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8356424 2021-08-11 Data mining in clinical big data: the frequently used databases, steps, and methodological models Wu, Wen-Tao Li, Yuan-Jie Feng, Ao-Zi Li, Li Huang, Tao Xu, An-Ding Lyu, Jun Mil Med Res Review Many high quality studies have emerged from public databases, such as Surveillance, Epidemiology, and End Results (SEER), National Health and Nutrition Examination Survey (NHANES), The Cancer Genome Atlas (TCGA), and Medical Information Mart for Intensive Care (MIMIC); however, these data are often characterized by a high degree of dimensional heterogeneity, timeliness, scarcity, irregularity, and other characteristics, resulting in the value of these data not being fully utilized. Data-mining technology has been a frontier field in medical research, as it demonstrates excellent performance in evaluating patient risks and assisting clinical decision-making in building disease-prediction models. Therefore, data mining has unique advantages in clinical big-data research, especially in large-scale medical public databases. This article introduced the main medical public database and described the steps, tasks, and models of data mining in simple language. Additionally, we described data-mining methods along with their practical applications. The goal of this work was to aid clinical researchers in gaining a clear and intuitive understanding of the application of data-mining technology on clinical big-data in order to promote the production of research results that are beneficial to doctors and patients. 
BioMed Central 2021-08-11 /pmc/articles/PMC8356424/ /pubmed/34380547 http://dx.doi.org/10.1186/s40779-021-00338-z Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Data mining in clinical big data: the frequently used databases, steps, and methodological models |
topic | Review |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8356424/ https://www.ncbi.nlm.nih.gov/pubmed/34380547 http://dx.doi.org/10.1186/s40779-021-00338-z |