Co-clustering phenome–genome for phenotype classification and disease gene discovery

Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype–gene relations, which provide valuable information for classifying diseases...

Bibliographic Details
Main authors: Hwang, TaeHyun, Atluri, Gowtham, Xie, MaoQiang, Dey, Sanjoy, Hong, Changjin, Kumar, Vipin, Kuang, Rui
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3479160/
https://www.ncbi.nlm.nih.gov/pubmed/22735708
http://dx.doi.org/10.1093/nar/gks615
author Hwang, TaeHyun
Atluri, Gowtham
Xie, MaoQiang
Dey, Sanjoy
Hong, Changjin
Kumar, Vipin
Kuang, Rui
collection PubMed
description Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype–gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes, and simultaneously detect associations between the detected phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype–gene association matrix under the prior knowledge from phenotype similarity network and protein–protein interaction network, supervised by the label information from known disease classes and biological pathways. In the experiments on disease phenotype–gene associations in OMIM and KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and Label Propagation in cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with human phenotype ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in the protein–protein interaction subnetworks. Extensive literature review also confirmed many new members of the disease classes and pathways as well as the predicted associations between disease phenotype classes and pathways.
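The co-clustering idea in the description can be illustrated with a minimal sketch of plain non-negative matrix tri-factorization, which approximates a phenotype–gene matrix X as F S G^T. This is not the authors' R-NMTF: the network regularization and label supervision mentioned in the abstract are omitted, and the function names, the toy matrix, the cluster counts and the use of standard multiplicative updates are all assumptions made for illustration.

```python
import numpy as np


def nmtf(X, k_rows, k_cols, n_iter=200, eps=1e-9, seed=0):
    """Unregularized tri-factorization X ~ F @ S @ G.T with non-negative factors.

    F (rows x k_rows) scores row clusters, G (cols x k_cols) scores column
    clusters, and S (k_rows x k_cols) links row clusters to column clusters.
    Hypothetical helper: it does NOT include the regularization terms of R-NMTF.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k_rows)) + eps
    S = rng.random((k_rows, k_cols)) + eps
    G = rng.random((m, k_cols)) + eps
    for _ in range(n_iter):
        # Multiplicative updates for the squared Frobenius error; each step
        # keeps the factors non-negative because it only multiplies by
        # non-negative ratios.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G


def reconstruction_error(X, F, S, G):
    """Frobenius norm of the residual X - F S G^T."""
    return float(np.linalg.norm(X - F @ S @ G.T))


# Toy association matrix (assumed data): 6 "phenotypes" x 8 "genes" in two blocks.
X = np.zeros((6, 8))
X[:3, :4] = 1.0
X[3:, 4:] = 1.0

F, S, G = nmtf(X, k_rows=2, k_cols=2)
row_clusters = F.argmax(axis=1)  # cluster index per phenotype
col_clusters = G.argmax(axis=1)  # cluster index per gene
print(row_clusters, col_clusters, reconstruction_error(X, F, S, G))
```

In this sketch, large entries of S indicate which row cluster is associated with which column cluster, which is the role the abstract assigns to the detected associations between phenotype clusters and gene clusters.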
format Online
Article
Text
id pubmed-3479160
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3479160 2012-10-24 Co-clustering phenome–genome for phenotype classification and disease gene discovery Hwang, TaeHyun; Atluri, Gowtham; Xie, MaoQiang; Dey, Sanjoy; Hong, Changjin; Kumar, Vipin; Kuang, Rui Nucleic Acids Res Methods Online Oxford University Press 2012-10 2012-06-25 /pmc/articles/PMC3479160/ /pubmed/22735708 http://dx.doi.org/10.1093/nar/gks615 Text en © The Author(s) 2012. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Co-clustering phenome–genome for phenotype classification and disease gene discovery
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3479160/
https://www.ncbi.nlm.nih.gov/pubmed/22735708
http://dx.doi.org/10.1093/nar/gks615