
Revealing functionally coherent subsets using a spectral clustering and an information integration approach

BACKGROUND: Contemporary high-throughput analyses often produce lengthy lists of genes or proteins. It is desirable to divide the genes into functionally coherent subsets for further investigation, by integrating heterogeneous information regarding the genes. Here we report a principled approach for...


Bibliographic Details
Main authors: Richards, Adam J, Schwacke, John H, Rohrer, Bärbel, Cowart, L Ashley, Lu, Xinghua
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3542577/
https://www.ncbi.nlm.nih.gov/pubmed/23282411
http://dx.doi.org/10.1186/1752-0509-6-S3-S7
author Richards, Adam J
Schwacke, John H
Rohrer, Bärbel
Cowart, L Ashley
Lu, Xinghua
author_sort Richards, Adam J
collection PubMed
description BACKGROUND: Contemporary high-throughput analyses often produce lengthy lists of genes or proteins. It is desirable to divide the genes into functionally coherent subsets for further investigation, by integrating heterogeneous information regarding the genes. Here we report a principled approach for managing and integrating multiple data sources within the framework of graph-spectrum analysis in order to identify coherent gene subsets. RESULTS: We investigated several approaches to integrate information derived from different sources that reflect distinct aspects of gene functional relationships including: functional annotations of genes in the form of the Gene Ontology, co-mentioning of genes in the literature, and shared transcription factor binding sites among genes. Given a list of genes, we construct a graph containing the genes in each information space; then the graphs were kernel transformed so they could be integrated; finally functionally coherent subsets were identified using a spectral clustering algorithm. In a series of simulation experiments, known functionally coherent gene sets were mixed and recovered using our approach. CONCLUSIONS: The results indicate that spectral clustering approaches are capable of recovering coherent gene modules even under noisy conditions, and that information integration serves to further enhance this capability. When applied to a real-world data set, our methods revealed biologically sensible modules, and highlighted the importance of information integration. The implementation of the statistical model is provided under the GNU general public license, as an installable Python module, at: http://code.google.com/p/spectralmix.
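The pipeline summarized in the description (one graph per information source, a kernel transform, integration, then spectral clustering) can be illustrated with a minimal sketch. This is not the spectralmix implementation: the function names, the Gaussian kernel, the unweighted averaging of kernels, the two-way split, and the toy distance matrices are all assumptions made for illustration, using only the Python standard library.

```python
import math
import random

def gaussian_kernel(dist, sigma=1.0):
    """Map a symmetric distance matrix to affinities in (0, 1] (assumed kernel)."""
    return [[math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in row] for row in dist]

def integrate(kernels):
    """Element-wise unweighted mean of same-sized kernel matrices (an assumption)."""
    n = len(kernels[0])
    return [[sum(k[i][j] for k in kernels) / len(kernels) for j in range(n)]
            for i in range(n)]

def spectral_bipartition(w, iters=500, seed=0):
    """Two-way split by the sign of the second eigenvector of D^-1/2 W D^-1/2."""
    n = len(w)
    deg = [sum(row) for row in w]
    s = [math.sqrt(d) for d in deg]
    # Shifted normalized affinity (M + I) / 2, whose eigenvalues lie in [0, 1].
    m = [[0.5 * (w[i][j] / (s[i] * s[j]) + (1.0 if i == j else 0.0))
          for j in range(n)] for i in range(n)]
    # The leading eigenvector is proportional to sqrt(degree); deflate against it.
    norm = math.sqrt(sum(deg))
    v1 = [x / norm for x in s]
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(x, v1))
        x = [a - dot * b for a, b in zip(x, v1)]          # remove v1 component
        x = [sum(m[i][j] * x[j] for j in range(n)) for i in range(n)]
        nrm = math.sqrt(sum(a * a for a in x))
        x = [a / nrm for a in x]
    # Which side gets label 0 or 1 is arbitrary.
    return [0 if a < 0 else 1 for a in x]

# Hypothetical toy data: six genes, two planted modules {0, 1, 2} and {3, 4, 5}.
# Each source carries one spurious cross-module link (small distance).
dist_annotation = [
    [0.0, 0.5, 0.5, 2.0, 2.0, 2.0],
    [0.5, 0.0, 0.5, 2.0, 2.0, 2.0],
    [0.5, 0.5, 0.0, 2.0, 2.0, 0.5],
    [2.0, 2.0, 2.0, 0.0, 0.5, 0.5],
    [2.0, 2.0, 2.0, 0.5, 0.0, 0.5],
    [2.0, 2.0, 0.5, 0.5, 0.5, 0.0],
]
dist_literature = [
    [0.0, 0.6, 0.6, 2.0, 2.0, 2.0],
    [0.6, 0.0, 0.6, 2.0, 0.6, 2.0],
    [0.6, 0.6, 0.0, 2.0, 2.0, 2.0],
    [2.0, 2.0, 2.0, 0.0, 0.6, 0.6],
    [2.0, 0.6, 2.0, 0.6, 0.0, 0.6],
    [2.0, 2.0, 2.0, 0.6, 0.6, 0.0],
]

combined = integrate([gaussian_kernel(dist_annotation),
                      gaussian_kernel(dist_literature)])
labels = spectral_bipartition(combined)
print(labels)
```

In this toy setup, averaging the kernels dilutes each source's spurious link, so the split should follow the planted modules. A real gene list would typically call for more than two clusters and a data-driven choice of kernel width, neither of which this sketch attempts.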
format Online
Article
Text
id pubmed-3542577
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3542577 2013-01-11 Revealing functionally coherent subsets using a spectral clustering and an information integration approach Richards, Adam J Schwacke, John H Rohrer, Bärbel Cowart, L Ashley Lu, Xinghua BMC Syst Biol Research
BioMed Central 2012-12-17 /pmc/articles/PMC3542577/ /pubmed/23282411 http://dx.doi.org/10.1186/1752-0509-6-S3-S7 Text en Copyright ©2012 Richards et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Revealing functionally coherent subsets using a spectral clustering and an information integration approach
topic Research