
Prediction of Human Disease-Related Gene Clusters by Clustering Analysis

Because genes associated with similar diseases or disorders show an increased tendency for their protein products to interact through protein-protein interactions (PPI), clustering analysis is an efficient technique for predicting human disease-related gene cluste...


Bibliographic Details
Main Authors: Sun, Peng Gang, Gao, Lin, Han, Shan
Format: Text
Language: English
Published: Ivyspring International Publisher 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3030143/
https://www.ncbi.nlm.nih.gov/pubmed/21278917
_version_ 1782197260632719360
author Sun, Peng Gang
Gao, Lin
Han, Shan
collection PubMed
description Because genes associated with similar diseases or disorders show an increased tendency for their protein products to interact through protein-protein interactions (PPI), clustering analysis is an efficient technique for predicting human disease-related gene clusters/subnetworks. First, we used three clustering algorithms, the Markov cluster algorithm (MCL), Molecular complex detection (MCODE) and the Clique percolation method (CPM), to decompose the human PPI network into dense clusters as candidate disease-related clusters, and then proposed a log-likelihood model that integrates multiple lines of biological evidence to score these dense clusters. Finally, we identified the dense clusters with higher scores as disease-related clusters. The method's effectiveness was evaluated by leave-one-out cross-validation. Our method achieved a success rate of 98.59% and recovered the hidden disease-related clusters in 34.04% of cases when one known disease gene and all its gene-disease associations were removed. We found that the clusters produced by CPM outperformed those from MCL and MCODE as candidate disease-related clusters, with well-supported biological significance in the biological process, molecular function and cellular component categories of Gene Ontology (GO) and in expression across human tissues. We also found that most of the disease-related clusters consisted of tissue-specific genes that were highly expressed in only one or a few tissues, and a few were composed of housekeeping genes (maintenance genes) that were ubiquitously expressed in most tissues.
format Text
id pubmed-3030143
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Ivyspring International Publisher
record_format MEDLINE/PubMed
spelling pubmed-30301432011-01-28 Prediction of Human Disease-Related Gene Clusters by Clustering Analysis Sun, Peng Gang Gao, Lin Han, Shan Int J Biol Sci Research Paper Ivyspring International Publisher 2011-01-14 /pmc/articles/PMC3030143/ /pubmed/21278917 Text en © Ivyspring International Publisher.
This is an open-access article distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by-nc-nd/3.0/). Reproduction is permitted for personal, noncommercial use, provided that the article is in whole, unmodified, and properly cited.
title Prediction of Human Disease-Related Gene Clusters by Clustering Analysis
topic Research Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3030143/
https://www.ncbi.nlm.nih.gov/pubmed/21278917