Identifying disease genes by integrating multiple data sources
BACKGROUND: Now multiple types of data are available for identifying disease genes. Those data include gene-disease associations, disease phenotype similarities, protein-protein interactions, pathways, gene expression profiles, etc. It is believed that integrating different kinds of biological data...
Main Authors: | Chen, Bolin, Wang, Jianxin, Li, Min, Wu, Fang-Xiang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4243092/ https://www.ncbi.nlm.nih.gov/pubmed/25350511 http://dx.doi.org/10.1186/1755-8794-7-S2-S2 |
_version_ | 1782346057362964480 |
---|---|
author | Chen, Bolin Wang, Jianxin Li, Min Wu, Fang-Xiang |
author_facet | Chen, Bolin Wang, Jianxin Li, Min Wu, Fang-Xiang |
author_sort | Chen, Bolin |
collection | PubMed |
description | BACKGROUND: Now multiple types of data are available for identifying disease genes. Those data include gene-disease associations, disease phenotype similarities, protein-protein interactions, pathways, gene expression profiles, etc. It is believed that integrating different kinds of biological data is an effective method to identify disease genes. RESULTS: In this paper, we propose a multiple data integration method based on the theory of Markov random field (MRF) and the method of Bayesian analysis for identifying human disease genes. The proposed method is not only flexible in easily incorporating different kinds of data, but also reliable in predicting candidate disease genes. CONCLUSIONS: Numerical experiments are carried out by integrating known gene-disease associations, protein complexes, protein-protein interactions, pathways and gene expression profiles. Predictions are evaluated by the leave-one-out method. The proposed method achieves an AUC score of 0.743 when integrating all those biological data in our experiments. |
format | Online Article Text |
id | pubmed-4243092 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-42430922014-11-26 Identifying disease genes by integrating multiple data sources Chen, Bolin Wang, Jianxin Li, Min Wu, Fang-Xiang BMC Med Genomics Research BACKGROUND: Now multiple types of data are available for identifying disease genes. Those data include gene-disease associations, disease phenotype similarities, protein-protein interactions, pathways, gene expression profiles, etc. It is believed that integrating different kinds of biological data is an effective method to identify disease genes. RESULTS: In this paper, we propose a multiple data integration method based on the theory of Markov random field (MRF) and the method of Bayesian analysis for identifying human disease genes. The proposed method is not only flexible in easily incorporating different kinds of data, but also reliable in predicting candidate disease genes. CONCLUSIONS: Numerical experiments are carried out by integrating known gene-disease associations, protein complexes, protein-protein interactions, pathways and gene expression profiles. Predictions are evaluated by the leave-one-out method. The proposed method achieves an AUC score of 0.743 when integrating all those biological data in our experiments. BioMed Central 2014-10-22 /pmc/articles/PMC4243092/ /pubmed/25350511 http://dx.doi.org/10.1186/1755-8794-7-S2-S2 Text en Copyright © 2014 Chen et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Chen, Bolin Wang, Jianxin Li, Min Wu, Fang-Xiang Identifying disease genes by integrating multiple data sources |
title | Identifying disease genes by integrating multiple data sources |
title_full | Identifying disease genes by integrating multiple data sources |
title_fullStr | Identifying disease genes by integrating multiple data sources |
title_full_unstemmed | Identifying disease genes by integrating multiple data sources |
title_short | Identifying disease genes by integrating multiple data sources |
title_sort | identifying disease genes by integrating multiple data sources |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4243092/ https://www.ncbi.nlm.nih.gov/pubmed/25350511 http://dx.doi.org/10.1186/1755-8794-7-S2-S2 |
work_keys_str_mv | AT chenbolin identifyingdiseasegenesbyintegratingmultipledatasources AT wangjianxin identifyingdiseasegenesbyintegratingmultipledatasources AT limin identifyingdiseasegenesbyintegratingmultipledatasources AT wufangxiang identifyingdiseasegenesbyintegratingmultipledatasources |