
Hybrid coexpression link similarity graph clustering for mining biological modules from multiple gene expression datasets

BACKGROUND: Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the...


Bibliographic Details
Main Authors: Salem, Saeed; Ozcaglar, Cagri
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4151083/
https://www.ncbi.nlm.nih.gov/pubmed/25221624
http://dx.doi.org/10.1186/1756-0381-7-16
author Salem, Saeed
Ozcaglar, Cagri
collection PubMed
description BACKGROUND: Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. RESULTS: We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarity and the co-appearance similarity of the two links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways.
format Online
Article
Text
id pubmed-4151083
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4151083 2014-09-12 BioData Min Research BioMed Central 2014-08-18 Text en Copyright © 2014 Salem and Ozcaglar; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Hybrid coexpression link similarity graph clustering for mining biological modules from multiple gene expression datasets
topic Research
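
The description above says that the weight of an edge between two coexpression links is a linear combination of a topological similarity and a co-appearance similarity. The record does not give the exact formulas, so the following is only a minimal sketch under stated assumptions: co-appearance similarity is taken as the Jaccard overlap of the sets of datasets in which each link occurs, topological similarity as the Jaccard overlap of the neighborhoods of the non-shared endpoints of two links that share a gene, and `alpha` is a hypothetical mixing parameter. The function names are illustrative and do not come from the article.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_link_graph(datasets, alpha=0.5):
    """Build weighted edges between coexpression links.

    datasets: list of edge lists, one per expression dataset; each edge
              is a (gene, gene) pair.
    alpha:    weight of the topological term (hypothetical parameter).
    Returns {(link1, link2): weight} for link pairs with positive weight.
    """
    # Record the set of dataset indices in which each link appears.
    appears = {}
    for i, edges in enumerate(datasets):
        for u, v in edges:
            if u == v:
                continue  # ignore self-loops
            appears.setdefault(frozenset((u, v)), set()).add(i)

    # Neighborhoods in the union graph, each including the gene itself.
    nbrs = {}
    for link in appears:
        u, v = tuple(link)
        nbrs.setdefault(u, {u}).add(v)
        nbrs.setdefault(v, {v}).add(u)

    weights = {}
    for l1, l2 in combinations(sorted(appears, key=sorted), 2):
        shared = l1 & l2
        if len(shared) == 1:
            # Assumed topological similarity: neighborhood overlap of
            # the two endpoints that the links do not share.
            (a,) = l1 - shared
            (b,) = l2 - shared
            topo = jaccard(nbrs[a], nbrs[b])
        else:
            topo = 0.0
        # Assumed co-appearance similarity: overlap of dataset sets.
        coapp = jaccard(appears[l1], appears[l2])
        w = alpha * topo + (1 - alpha) * coapp
        if w > 0:
            weights[(tuple(sorted(l1)), tuple(sorted(l2)))] = w
    return weights


# Toy input: two datasets over genes A-D (made-up data).
toy = [
    [("A", "B"), ("B", "C")],
    [("A", "B"), ("B", "C"), ("C", "D")],
]
graph = hybrid_link_graph(toy, alpha=0.5)
```

The resulting weighted graph could then be passed to any graph clustering method to obtain link clusters; the clustering step is not shown here because the record does not specify which method the authors used.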