NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis

BACKGROUND: High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of the high-throughput experimental results, such as differential expression gene sets derived from microarray or RNA-seq experiments, is sti...

Bibliographic Details
Main authors: Sun, Duanchen; Liu, Yinliang; Zhang, Xiang-Sun; Wu, Ling-Yun
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5615262/
https://www.ncbi.nlm.nih.gov/pubmed/28950861
http://dx.doi.org/10.1186/s12918-017-0456-7
_version_ 1783266551063904256
author Sun, Duanchen
Liu, Yinliang
Zhang, Xiang-Sun
Wu, Ling-Yun
author_facet Sun, Duanchen
Liu, Yinliang
Zhang, Xiang-Sun
Wu, Ling-Yun
author_sort Sun, Duanchen
collection PubMed
description BACKGROUND: High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of high-throughput experimental results, such as differentially expressed gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in functional enrichment studies. The GO terms identified by current functional enrichment analysis tools often include direct parent or descendant terms in the GO hierarchical structure. Such highly redundant terms make it difficult for users to analyze the underlying biological processes. RESULTS: In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen outperformed the existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms that were not directly linked with the active gene list for each disease were identified. These terms were found to be closely related to the corresponding diseases when checked against the curated literature. NetGen has been implemented in the R package CopTea, which is publicly available on GitHub (http://github.com/wulingyun/CopTea/). CONCLUSION: Our procedure leads to a more reasonable and interpretable result of the functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods and can help to explore the underlying pathogenesis of complex diseases. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-017-0456-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5615262
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5615262 2017-09-28 NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis Sun, Duanchen Liu, Yinliang Zhang, Xiang-Sun Wu, Ling-Yun BMC Syst Biol Research BACKGROUND: High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of high-throughput experimental results, such as differentially expressed gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in functional enrichment studies. The GO terms identified by current functional enrichment analysis tools often include direct parent or descendant terms in the GO hierarchical structure. Such highly redundant terms make it difficult for users to analyze the underlying biological processes. RESULTS: In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen outperformed the existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms that were not directly linked with the active gene list for each disease were identified. These terms were found to be closely related to the corresponding diseases when checked against the curated literature. NetGen has been implemented in the R package CopTea, which is publicly available on GitHub (http://github.com/wulingyun/CopTea/). CONCLUSION: Our procedure leads to a more reasonable and interpretable result of the functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods and can help to explore the underlying pathogenesis of complex diseases.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-017-0456-7) contains supplementary material, which is available to authorized users. BioMed Central 2017-09-21 /pmc/articles/PMC5615262/ /pubmed/28950861 http://dx.doi.org/10.1186/s12918-017-0456-7 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Sun, Duanchen
Liu, Yinliang
Zhang, Xiang-Sun
Wu, Ling-Yun
NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis
title NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis
title_full NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis
title_fullStr NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis
title_full_unstemmed NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis
title_short NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis
title_sort netgen: a novel network-based probabilistic generative model for gene set functional enrichment analysis
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5615262/
https://www.ncbi.nlm.nih.gov/pubmed/28950861
http://dx.doi.org/10.1186/s12918-017-0456-7
work_keys_str_mv AT sunduanchen netgenanovelnetworkbasedprobabilisticgenerativemodelforgenesetfunctionalenrichmentanalysis
AT liuyinliang netgenanovelnetworkbasedprobabilisticgenerativemodelforgenesetfunctionalenrichmentanalysis
AT zhangxiangsun netgenanovelnetworkbasedprobabilisticgenerativemodelforgenesetfunctionalenrichmentanalysis
AT wulingyun netgenanovelnetworkbasedprobabilisticgenerativemodelforgenesetfunctionalenrichmentanalysis