
Identifying disease genes by integrating multiple data sources

BACKGROUND: Multiple types of data are now available for identifying disease genes, including gene-disease associations, disease phenotype similarities, protein-protein interactions, pathways, and gene expression profiles. It is believed that integrating different kinds of biological data...


Bibliographic Details
Main Authors: Chen, Bolin; Wang, Jianxin; Li, Min; Wu, Fang-Xiang
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4243092/
https://www.ncbi.nlm.nih.gov/pubmed/25350511
http://dx.doi.org/10.1186/1755-8794-7-S2-S2
author Chen, Bolin
Wang, Jianxin
Li, Min
Wu, Fang-Xiang
collection PubMed
description BACKGROUND: Multiple types of data are now available for identifying disease genes, including gene-disease associations, disease phenotype similarities, protein-protein interactions, pathways, and gene expression profiles. It is believed that integrating different kinds of biological data is an effective way to identify disease genes. RESULTS: In this paper, we propose a multiple data integration method for identifying human disease genes, based on the theory of Markov random fields (MRF) and Bayesian analysis. The proposed method is flexible in incorporating different kinds of data and reliable in predicting candidate disease genes. CONCLUSIONS: Numerical experiments are carried out by integrating known gene-disease associations, protein complexes, protein-protein interactions, pathways, and gene expression profiles. Predictions are evaluated by the leave-one-out method. The proposed method achieves an AUC score of 0.743 when integrating all of these biological data in our experiments.
format Online
Article
Text
id pubmed-4243092
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4243092 2014-11-26 Identifying disease genes by integrating multiple data sources Chen, Bolin Wang, Jianxin Li, Min Wu, Fang-Xiang BMC Med Genomics Research BACKGROUND: Multiple types of data are now available for identifying disease genes, including gene-disease associations, disease phenotype similarities, protein-protein interactions, pathways, and gene expression profiles. It is believed that integrating different kinds of biological data is an effective way to identify disease genes. RESULTS: In this paper, we propose a multiple data integration method for identifying human disease genes, based on the theory of Markov random fields (MRF) and Bayesian analysis. The proposed method is flexible in incorporating different kinds of data and reliable in predicting candidate disease genes. CONCLUSIONS: Numerical experiments are carried out by integrating known gene-disease associations, protein complexes, protein-protein interactions, pathways, and gene expression profiles. Predictions are evaluated by the leave-one-out method. The proposed method achieves an AUC score of 0.743 when integrating all of these biological data in our experiments. BioMed Central 2014-10-22 /pmc/articles/PMC4243092/ /pubmed/25350511 http://dx.doi.org/10.1186/1755-8794-7-S2-S2 Text en Copyright © 2014 Chen et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Identifying disease genes by integrating multiple data sources
topic Research