Analysis of disease comorbidity patterns in a large-scale China population

BACKGROUND: Disease comorbidity is common and has significant implications for disease progression and management. We aim to detect the general disease comorbidity patterns in Chinese populations using a large-scale clinical data set. METHODS: We extracted the diseases from a large-scale anonymized dat...

Bibliographic Details
Main authors: Guo, Mengfei, Yu, Yanan, Wen, Tiancai, Zhang, Xiaoping, Liu, Baoyan, Zhang, Jin, Zhang, Runshun, Zhang, Yanning, Zhou, Xuezhong
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6907122/
https://www.ncbi.nlm.nih.gov/pubmed/31829182
http://dx.doi.org/10.1186/s12920-019-0629-x
_version_ 1783478485967175680
author Guo, Mengfei
Yu, Yanan
Wen, Tiancai
Zhang, Xiaoping
Liu, Baoyan
Zhang, Jin
Zhang, Runshun
Zhang, Yanning
Zhou, Xuezhong
author_facet Guo, Mengfei
Yu, Yanan
Wen, Tiancai
Zhang, Xiaoping
Liu, Baoyan
Zhang, Jin
Zhang, Runshun
Zhang, Yanning
Zhou, Xuezhong
author_sort Guo, Mengfei
collection PubMed
description BACKGROUND: Disease comorbidity is common and has significant implications for disease progression and management. We aim to detect the general disease comorbidity patterns in Chinese populations using a large-scale clinical data set. METHODS: We extracted the diseases from a large-scale anonymized data set derived from 8,572,137 inpatients in 453 hospitals across China. We built a Disease Comorbidity Network (DCN) using correlation analysis and detected the topological patterns of disease comorbidity using both complex network and data mining methods. The comorbidity patterns were further validated by shared molecular mechanisms using disease-gene associations and pathways. To predict disease occurrence over the whole course of disease progression, we applied four machine learning methods to model the disease trajectories of patients. RESULTS: We obtained a DCN with 5702 nodes and 258,535 edges, which shows a power-law distribution of degree and weight. It further indicated high heterogeneity of comorbidities across different diseases, and we found that the DCN is a hierarchical modular network with community structures that contain both homogeneous and heterogeneous disease categories. Furthermore, consistent with previous work on US and European populations, we found that disease comorbidities share underlying molecular mechanisms. Taking hypertension and psychiatric disease as examples, we used four classification methods to predict disease occurrence from the comorbid disease trajectories and obtained acceptable performance; in particular, random forest achieved the best overall performance (F1-score 0.6689 for hypertension and 0.6802 for psychiatric disease).
CONCLUSIONS: Our study indicates that disease comorbidity is significant and valuable for understanding disease incidence and disease interactions in real-world populations, which will provide important insights for detecting patterns of disease classification, diagnosis and prognosis.
format Online
Article
Text
id pubmed-6907122
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6907122 2019-12-20 Analysis of disease comorbidity patterns in a large-scale China population Guo, Mengfei Yu, Yanan Wen, Tiancai Zhang, Xiaoping Liu, Baoyan Zhang, Jin Zhang, Runshun Zhang, Yanning Zhou, Xuezhong BMC Med Genomics Research BACKGROUND: Disease comorbidity is common and has significant implications for disease progression and management. We aim to detect the general disease comorbidity patterns in Chinese populations using a large-scale clinical data set. METHODS: We extracted the diseases from a large-scale anonymized data set derived from 8,572,137 inpatients in 453 hospitals across China. We built a Disease Comorbidity Network (DCN) using correlation analysis and detected the topological patterns of disease comorbidity using both complex network and data mining methods. The comorbidity patterns were further validated by shared molecular mechanisms using disease-gene associations and pathways. To predict disease occurrence over the whole course of disease progression, we applied four machine learning methods to model the disease trajectories of patients. RESULTS: We obtained a DCN with 5702 nodes and 258,535 edges, which shows a power-law distribution of degree and weight. It further indicated high heterogeneity of comorbidities across different diseases, and we found that the DCN is a hierarchical modular network with community structures that contain both homogeneous and heterogeneous disease categories. Furthermore, consistent with previous work on US and European populations, we found that disease comorbidities share underlying molecular mechanisms.
Taking hypertension and psychiatric disease as examples, we used four classification methods to predict disease occurrence from the comorbid disease trajectories and obtained acceptable performance; in particular, random forest achieved the best overall performance (F1-score 0.6689 for hypertension and 0.6802 for psychiatric disease). CONCLUSIONS: Our study indicates that disease comorbidity is significant and valuable for understanding disease incidence and disease interactions in real-world populations, which will provide important insights for detecting patterns of disease classification, diagnosis and prognosis. BioMed Central 2019-12-12 /pmc/articles/PMC6907122/ /pubmed/31829182 http://dx.doi.org/10.1186/s12920-019-0629-x Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Guo, Mengfei
Yu, Yanan
Wen, Tiancai
Zhang, Xiaoping
Liu, Baoyan
Zhang, Jin
Zhang, Runshun
Zhang, Yanning
Zhou, Xuezhong
Analysis of disease comorbidity patterns in a large-scale China population
title Analysis of disease comorbidity patterns in a large-scale China population
title_full Analysis of disease comorbidity patterns in a large-scale China population
title_fullStr Analysis of disease comorbidity patterns in a large-scale China population
title_full_unstemmed Analysis of disease comorbidity patterns in a large-scale China population
title_short Analysis of disease comorbidity patterns in a large-scale China population
title_sort analysis of disease comorbidity patterns in a large-scale china population
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6907122/
https://www.ncbi.nlm.nih.gov/pubmed/31829182
http://dx.doi.org/10.1186/s12920-019-0629-x
work_keys_str_mv AT guomengfei analysisofdiseasecomorbiditypatternsinalargescalechinapopulation
AT yuyanan analysisofdiseasecomorbiditypatternsinalargescalechinapopulation
AT wentiancai analysisofdiseasecomorbiditypatternsinalargescalechinapopulation
AT zhangxiaoping analysisofdiseasecomorbiditypatternsinalargescalechinapopulation
AT liubaoyan analysisofdiseasecomorbiditypatternsinalargescalechinapopulation
AT zhangjin analysisofdiseasecomorbiditypatternsinalargescalechinapopulation
AT zhangrunshun analysisofdiseasecomorbiditypatternsinalargescalechinapopulation
AT zhangyanning analysisofdiseasecomorbiditypatternsinalargescalechinapopulation
AT zhouxuezhong analysisofdiseasecomorbiditypatternsinalargescalechinapopulation