
Multiconstrained gene clustering based on generalized projections

BACKGROUND: Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple piec...

Bibliographic Details
Main Authors: Zeng, Jia, Zhu, Shanfeng, Liew, Alan Wee-Chung, Yan, Hong
Format: Text
Language: English
Published: BioMed Central 2010
Subjects: Methodology Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098054/
https://www.ncbi.nlm.nih.gov/pubmed/20356386
http://dx.doi.org/10.1186/1471-2105-11-164
author Zeng, Jia
Zhu, Shanfeng
Liew, Alan Wee-Chung
Yan, Hong
collection PubMed
description BACKGROUND: Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple constraints into an optimal clustering solution remains an unsolved problem. RESULTS: We propose a novel multiconstrained gene clustering (MGC) method within the generalized projection onto convex sets (POCS) framework used widely in image reconstruction. Each constraint is formulated as a corresponding set. The generalized projector iteratively projects the clustering solution onto these sets in order to find a consistent solution in the intersection set, which satisfies all constraints. Compared with previous MGC methods, POCS can integrate multiple constraints of different natures without distorting the original constraints. To evaluate the clustering solution, we also propose a new performance measure referred to as Gene Log Likelihood (GLL) that accounts for genes that have more than one function and hence belong to more than one cluster. Comparative experimental results show that our POCS-based gene clustering method outperforms current state-of-the-art MGC methods. CONCLUSIONS: The POCS-based MGC method can successfully combine multiple constraints of different natures for gene clustering. Also, the proposed GLL is an effective performance measure for soft clustering solutions.
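The iterative projection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' MGC method: it is the textbook alternating-projection scheme on two simple convex sets, and every name in it (`pocs`, `project_onto_sum_one`, `make_box_projector`, the example numbers) is a hypothetical chosen for illustration. It assumes a single gene with soft memberships over three clusters, one constraint that the memberships sum to 1, and a second, assumed constraint that prior knowledge requires at least 0.6 membership in the first cluster.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Projector = Callable[[Sequence[float]], Vector]


def project_onto_sum_one(x: Sequence[float]) -> Vector:
    """Euclidean projection onto the hyperplane {x : sum(x) = 1}."""
    shift = (sum(x) - 1.0) / len(x)
    return [v - shift for v in x]


def make_box_projector(lower: Sequence[float], upper: Sequence[float]) -> Projector:
    """Return the projection onto the box {x : lower <= x <= upper}."""
    def project(x: Sequence[float]) -> Vector:
        return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]
    return project


def pocs(x0: Sequence[float], projectors: Sequence[Projector],
         max_iter: int = 1000, tol: float = 1e-10) -> Vector:
    """Cycle through the projectors until one full cycle barely moves x."""
    x = list(x0)
    for _ in range(max_iter):
        previous = x
        for project in projectors:
            x = project(x)
        if max(abs(a - b) for a, b in zip(x, previous)) < tol:
            break
    return x


# Hypothetical example: initial soft memberships of one gene in three clusters.
initial = [0.2, 0.5, 0.3]
# Assumed prior constraint: membership in cluster 0 must be at least 0.6.
box = make_box_projector(lower=[0.6, 0.0, 0.0], upper=[1.0, 1.0, 1.0])

memberships = pocs(initial, [project_onto_sum_one, box])
print([round(v, 4) for v in memberships])
```

Each projector changes the current solution only as much as needed to satisfy its own constraint, so when the sets intersect the iterate settles on a point that satisfies both. The generalized projections named in the abstract extend this scheme beyond the convex sets used here.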
format Text
id pubmed-3098054
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3098054 2011-05-20 BMC Bioinformatics Methodology Article BioMed Central 2010-03-31 Text en Copyright ©2010 Zeng et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Multiconstrained gene clustering based on generalized projections
topic Methodology Article