SCIA: A Novel Gene Set Analysis Applicable to Data With Different Characteristics
Gene set analysis is commonly used in functional enrichment and molecular pathway analyses. Most of the present methods are based on the competitive testing methods which assume each gene is independent of the others. However, the false discovery rates of competitive methods are amplified when they...
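Main Authors: | Li, Yiqun; Wu, Ying; Zhang, Xiaohan; Bai, Yunfan; Akthar, Luqman Muhammad; Lu, Xin; Shi, Ming; Zhao, Jianxiang; Jiang, Qinghua; Li, Yu |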
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Frontiers Media S.A.
2019
|
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6603225/ https://www.ncbi.nlm.nih.gov/pubmed/31293623 http://dx.doi.org/10.3389/fgene.2019.00598 |
_version_ | 1783431477255471104 |
---|---|
author | Li, Yiqun Wu, Ying Zhang, Xiaohan Bai, Yunfan Akthar, Luqman Muhammad Lu, Xin Shi, Ming Zhao, Jianxiang Jiang, Qinghua Li, Yu |
author_sort | Li, Yiqun |
collection | PubMed |
description | Gene set analysis is commonly used in functional enrichment and molecular pathway analyses. Most of the present methods are based on the competitive testing methods which assume each gene is independent of the others. However, the false discovery rates of competitive methods are amplified when they are applied to datasets with high inter-gene correlations. The self-contained testing methods could solve this problem, but there are other restrictions on data characteristics. Therefore, a statistically rigorous testing method applicable to different datasets with various complex characteristics is needed to obtain unbiased and comparable results. We propose a self-contained and competitive incorporated analysis (SCIA) to alleviate the bias caused by the limited application scope of existing gene set analysis methods. This is accomplished through a novel permutation strategy using a priori biological networks to selectively permute gene labels with different probabilities. In simulation studies, SCIA was compared with four representative analysis methods (GSEA, CAMERA, ROAST, and NES), and produced the best performance in both false discovery rate and sensitivity under most conditions with different parameter settings. Further, the KEGG pathway analysis on two real datasets of lung cancer showed that the results found by SCIA in both of the two datasets are much more than that of GSEA and most of them could be supported by literature. Overall, SCIA promisingly offers researchers more reliable and comparable results with different datasets. |
format | Online Article Text |
id | pubmed-6603225 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-66032252019-07-10 SCIA: A Novel Gene Set Analysis Applicable to Data With Different Characteristics Li, Yiqun Wu, Ying Zhang, Xiaohan Bai, Yunfan Akthar, Luqman Muhammad Lu, Xin Shi, Ming Zhao, Jianxiang Jiang, Qinghua Li, Yu Front Genet Genetics Gene set analysis is commonly used in functional enrichment and molecular pathway analyses. Most of the present methods are based on the competitive testing methods which assume each gene is independent of the others. However, the false discovery rates of competitive methods are amplified when they are applied to datasets with high inter-gene correlations. The self-contained testing methods could solve this problem, but there are other restrictions on data characteristics. Therefore, a statistically rigorous testing method applicable to different datasets with various complex characteristics is needed to obtain unbiased and comparable results. We propose a self-contained and competitive incorporated analysis (SCIA) to alleviate the bias caused by the limited application scope of existing gene set analysis methods. This is accomplished through a novel permutation strategy using a priori biological networks to selectively permute gene labels with different probabilities. In simulation studies, SCIA was compared with four representative analysis methods (GSEA, CAMERA, ROAST, and NES), and produced the best performance in both false discovery rate and sensitivity under most conditions with different parameter settings. Further, the KEGG pathway analysis on two real datasets of lung cancer showed that the results found by SCIA in both of the two datasets are much more than that of GSEA and most of them could be supported by literature. Overall, SCIA promisingly offers researchers more reliable and comparable results with different datasets. Frontiers Media S.A. 
2019-06-25 /pmc/articles/PMC6603225/ /pubmed/31293623 http://dx.doi.org/10.3389/fgene.2019.00598 Text en Copyright © 2019 Li, Wu, Zhang, Bai, Akthar, Lu, Shi, Zhao, Jiang and Li. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | SCIA: A Novel Gene Set Analysis Applicable to Data With Different Characteristics |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6603225/ https://www.ncbi.nlm.nih.gov/pubmed/31293623 http://dx.doi.org/10.3389/fgene.2019.00598 |