
LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights

Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher’s exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equiva...


Bibliographic Details
Main Authors: Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4707541/
https://www.ncbi.nlm.nih.gov/pubmed/26750448
http://dx.doi.org/10.1038/srep18871
_version_ 1782409332081557504
author Dong, Xinran
Hao, Yun
Wang, Xiao
Tian, Weidong
collection PubMed
description Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher’s exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO’s usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
format Online
Article
Text
id pubmed-4707541
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-4707541 2016-01-20 LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights Dong, Xinran Hao, Yun Wang, Xiao Tian, Weidong Sci Rep Article Nature Publishing Group 2016-01-11 /pmc/articles/PMC4707541/ /pubmed/26750448 http://dx.doi.org/10.1038/srep18871 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights
topic Article