CLIC, a tool for expanding biological pathways based on co-expression across thousands of datasets

In recent years, there has been a huge rise in the number of publicly available transcriptional profiling datasets. These massive compendia comprise billions of measurements and provide a special opportunity to predict the function of unstudied genes based on co-expression with well-studied pathways....

Bibliographic Details
Main authors: Li, Yang, Jourdain, Alexis A., Calvo, Sarah E., Liu, Jun S., Mootha, Vamsi K.
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5546725/
https://www.ncbi.nlm.nih.gov/pubmed/28719601
http://dx.doi.org/10.1371/journal.pcbi.1005653
author Li, Yang
Jourdain, Alexis A.
Calvo, Sarah E.
Liu, Jun S.
Mootha, Vamsi K.
collection PubMed
description In recent years, there has been a huge rise in the number of publicly available transcriptional profiling datasets. These massive compendia comprise billions of measurements and provide a special opportunity to predict the function of unstudied genes based on co-expression with well-studied pathways. Such analyses are challenging, however, because biological pathways are modular and may exhibit co-expression only in specific contexts. To overcome these challenges, we introduce CLIC (CLustering by Inferred Co-expression). CLIC accepts as input a pathway consisting of two or more genes. It then uses a Bayesian partition model to partition the input gene set into coherent co-expressed modules (CEMs) while simultaneously assigning each dataset a posterior probability of supporting each CEM. CLIC then expands each CEM by scanning the transcriptome for additional co-expressed genes, quantified by an integrated log-likelihood ratio (LLR) score weighted for each dataset. As a byproduct, CLIC automatically learns the conditions (datasets) within which a CEM is operative. We implemented CLIC using a compendium of 1774 mouse microarray datasets (28628 microarrays) or 1887 human microarray datasets (45158 microarrays). CLIC analysis reveals that of 910 canonical biological pathways, 30% consist of strongly co-expressed gene modules for which new members are predicted. For example, CLIC predicts a functional connection between protein C7orf55 (FMC1) and the mitochondrial ATP synthase complex that we have experimentally validated. CLIC is freely available at www.gene-clic.org. We anticipate that CLIC will be valuable for revealing both new components of biological pathways and the conditions in which they are active.
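The expansion step described in the abstract (scoring each candidate gene by a dataset-weighted, integrated LLR) can be illustrated with a minimal sketch. This is not CLIC's actual model: the per-dataset score below is a stand-in (a signed Gaussian log-likelihood ratio for "correlated" versus "independent", computed from the mean Pearson correlation with the module genes), and the dataset weights are assumed to be given rather than learned by a Bayesian partition model. The names `pearson`, `signed_llr`, and `expand_module`, and all the toy data, are hypothetical.

```python
import math
from typing import Dict, List, Sequence

# One dataset: gene symbol -> expression values across that dataset's samples.
Dataset = Dict[str, List[float]]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def signed_llr(r: float, n_samples: int) -> float:
    """Stand-in per-dataset score (an assumption, not CLIC's formula):
    the Gaussian LLR -(n/2) * log(1 - r^2), signed by the correlation."""
    r = max(-0.999999, min(0.999999, r))  # keep the logarithm finite
    magnitude = -0.5 * n_samples * math.log(1.0 - r * r)
    return math.copysign(magnitude, r)


def expand_module(datasets: List[Dataset], weights: List[float],
                  module: List[str], candidates: List[str]) -> Dict[str, float]:
    """Sum the weighted per-dataset scores for each candidate gene.

    weights[i] plays the role of the posterior probability that
    datasets[i] supports the module; here it is simply supplied.
    """
    scores: Dict[str, float] = {}
    for gene in candidates:
        total = 0.0
        for data, weight in zip(datasets, weights):
            members = [m for m in module if m in data]
            if gene not in data or not members:
                continue  # this dataset cannot score this gene
            mean_r = sum(pearson(data[gene], data[m]) for m in members) / len(members)
            total += weight * signed_llr(mean_r, len(data[gene]))
        scores[gene] = total
    return scores


# Toy data: the module (A, B) is coherent in the first dataset only.
ds1 = {"A": [1, 2, 3, 4, 5], "B": [2, 4, 6, 8, 10.5],
       "X": [1.1, 2.0, 2.9, 4.2, 5.1], "Y": [3, 1, 4, 1, 5]}
ds2 = {"A": [5, 1, 4, 2, 3], "B": [1, 5, 2, 4, 3],
       "X": [2, 2, 3, 1, 2], "Y": [1, 2, 3, 4, 5]}

scores = expand_module([ds1, ds2], [0.9, 0.1], ["A", "B"], ["X", "Y"])
for gene, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{gene}\t{score:.2f}")
```

In this toy setup the hypothetical gene X, which tracks the module in the heavily weighted dataset, ranks above Y. Down-weighting datasets in which the module is not coherent is the design point the abstract emphasizes: a dataset with weight near zero contributes almost nothing to the integrated score.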
format Online
Article
Text
id pubmed-5546725
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5546725 2017-08-12 PLoS Comput Biol Research Article
Public Library of Science 2017-07-18 /pmc/articles/PMC5546725/ /pubmed/28719601 http://dx.doi.org/10.1371/journal.pcbi.1005653 Text en
© 2017 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title CLIC, a tool for expanding biological pathways based on co-expression across thousands of datasets
topic Research Article