LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights
Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher’s exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equiva...
Main Authors: | Dong, Xinran, Hao, Yun, Wang, Xiao, Tian, Weidong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4707541/ https://www.ncbi.nlm.nih.gov/pubmed/26750448 http://dx.doi.org/10.1038/srep18871 |
_version_ | 1782409332081557504 |
---|---|
author | Dong, Xinran Hao, Yun Wang, Xiao Tian, Weidong |
author_facet | Dong, Xinran Hao, Yun Wang, Xiao Tian, Weidong |
author_sort | Dong, Xinran |
collection | PubMed |
description | Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher’s exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO’s usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher. |
format | Online Article Text |
id | pubmed-4707541 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-47075412016-01-20 LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights Dong, Xinran Hao, Yun Wang, Xiao Tian, Weidong Sci Rep Article Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher’s exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO’s usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher. Nature Publishing Group 2016-01-11 /pmc/articles/PMC4707541/ /pubmed/26750448 http://dx.doi.org/10.1038/srep18871 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Article Dong, Xinran Hao, Yun Wang, Xiao Tian, Weidong LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights |
title | LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights |
title_full | LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights |
title_fullStr | LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights |
title_full_unstemmed | LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights |
title_short | LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights |
title_sort | lego: a novel method for gene set over-representation analysis by incorporating network-based gene weights |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4707541/ https://www.ncbi.nlm.nih.gov/pubmed/26750448 http://dx.doi.org/10.1038/srep18871 |
work_keys_str_mv | AT dongxinran legoanovelmethodforgenesetoverrepresentationanalysisbyincorporatingnetworkbasedgeneweights AT haoyun legoanovelmethodforgenesetoverrepresentationanalysisbyincorporatingnetworkbasedgeneweights AT wangxiao legoanovelmethodforgenesetoverrepresentationanalysisbyincorporatingnetworkbasedgeneweights AT tianweidong legoanovelmethodforgenesetoverrepresentationanalysisbyincorporatingnetworkbasedgeneweights |
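
The description field above names Fisher's exact test as the baseline that conventional ORA tools use. As a rough illustration of that baseline only, the sketch below computes the one-sided over-representation p-value as an upper-tail hypergeometric probability. The function name `ora_p_value` and its parameter names are hypothetical, chosen for this example. LEGO's network-based gene weighting is not reproduced here, since the record does not describe how the weights are computed.

```python
from math import comb


def ora_p_value(n_background, n_gene_set, n_query, n_overlap):
    """One-sided ORA p-value: P(X >= n_overlap) under the hypergeometric model.

    n_background: number of genes in the background (hypothetical name)
    n_gene_set:   number of background genes annotated to the gene set
    n_query:      number of genes in the query list
    n_overlap:    number of query genes that fall in the gene set
    """
    # The overlap can never exceed the smaller of the two lists.
    upper = min(n_gene_set, n_query)
    # Number of ways to draw the query list from the background.
    total = comb(n_background, n_query)
    # Count draws whose overlap is at least as large as the observed one.
    tail = sum(
        comb(n_gene_set, k) * comb(n_background - n_gene_set, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers, not taken from the paper: 10 background genes, a gene set
# of 5, a query list of 5, all of which fall in the gene set.
p = ora_p_value(10, 5, 5, 5)
print(p)
```

Every gene contributes equally to this count, which is the simplification the abstract criticizes: a gene's role within the pathway and its interactions with other genes do not enter the calculation.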