NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis
BACKGROUND: High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of the high-throughput experimental results, such as differential expression gene sets derived from microarray or RNA-seq experiments, is sti...
Main authors: | Sun, Duanchen; Liu, Yinliang; Zhang, Xiang-Sun; Wu, Ling-Yun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5615262/ https://www.ncbi.nlm.nih.gov/pubmed/28950861 http://dx.doi.org/10.1186/s12918-017-0456-7 |
_version_ | 1783266551063904256 |
---|---|
author | Sun, Duanchen Liu, Yinliang Zhang, Xiang-Sun Wu, Ling-Yun |
author_sort | Sun, Duanchen |
collection | PubMed |
description | BACKGROUND: High-throughput experimental techniques have improved dramatically and been widely applied over the past decades. However, biological interpretation of high-throughput experimental results, such as differentially expressed gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in functional enrichment studies. The GO terms identified by current functional enrichment analysis tools often include direct parent or descendant terms in the GO hierarchical structure. Such highly redundant terms make it difficult for users to analyze the underlying biological processes. RESULTS: In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen outperformed existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms that were not directly linked with the active gene list for each disease were identified. According to the curated literature, these terms were closely related to the corresponding diseases. NetGen has been implemented in the R package CopTea, which is publicly available on GitHub (http://github.com/wulingyun/CopTea/). CONCLUSION: Our procedure leads to more reasonable and interpretable results of functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods and can help to explore the underlying pathogenesis of complex diseases. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-017-0456-7) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5615262 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5615262 2017-09-28 BMC Syst Biol Research BioMed Central 2017-09-21 /pmc/articles/PMC5615262/ /pubmed/28950861 http://dx.doi.org/10.1186/s12918-017-0456-7 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5615262/ https://www.ncbi.nlm.nih.gov/pubmed/28950861 http://dx.doi.org/10.1186/s12918-017-0456-7 |