Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets
Identifying differentially expressed (DE) genes between cancer and normal tissues is of basic importance for studying cancer mechanisms. However, current methods, such as the commonly used Significance Analysis of Microarrays (SAM), are biased to genes with low expression levels. Recently, we propos...
Main authors: | Huang, Haiyan; Li, Xiangyu; Guo, You; Zhang, Yuncong; Deng, Xusheng; Chen, Lufei; Zhang, Jiahui; Guo, Zheng; Ao, Lu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5086981/ https://www.ncbi.nlm.nih.gov/pubmed/27796338 http://dx.doi.org/10.1038/srep36227 |
_version_ | 1782463846322012160 |
---|---|
author | Huang, Haiyan; Li, Xiangyu; Guo, You; Zhang, Yuncong; Deng, Xusheng; Chen, Lufei; Zhang, Jiahui; Guo, Zheng; Ao, Lu |
author_facet | Huang, Haiyan; Li, Xiangyu; Guo, You; Zhang, Yuncong; Deng, Xusheng; Chen, Lufei; Zhang, Jiahui; Guo, Zheng; Ao, Lu |
author_sort | Huang, Haiyan |
collection | PubMed |
description | Identifying differentially expressed (DE) genes between cancer and normal tissues is of basic importance for studying cancer mechanisms. However, current methods, such as the commonly used Significance Analysis of Microarrays (SAM), are biased to genes with low expression levels. Recently, we proposed an algorithm, named the pairwise difference (PD) algorithm, to identify highly expressed DE genes based on reproducibility evaluation of top-ranked expression differences between paired technical replicates of cells under two experimental conditions. In this study, we extended the application of the algorithm to the identification of DE genes between two types of tissue samples (biological replicates) based on several independent datasets or sub-datasets of a dataset, by constructing multiple paired average gene expression profiles for the two types of samples. Using multiple datasets for lung and esophageal cancers, we demonstrated that PD could identify many DE genes highly expressed in both cancer and normal tissues that tended to be missed by the commonly used SAM. These highly expressed DE genes, including many housekeeping genes, were significantly enriched in many conservative pathways, such as ribosome, proteasome, phagosome and TNF signaling pathways with important functional significances in oncogenesis. |
format | Online Article Text |
id | pubmed-5086981 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-50869812016-11-04 Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets Huang, Haiyan Li, Xiangyu Guo, You Zhang, Yuncong Deng, Xusheng Chen, Lufei Zhang, Jiahui Guo, Zheng Ao, Lu Sci Rep Article Identifying differentially expressed (DE) genes between cancer and normal tissues is of basic importance for studying cancer mechanisms. However, current methods, such as the commonly used Significance Analysis of Microarrays (SAM), are biased to genes with low expression levels. Recently, we proposed an algorithm, named the pairwise difference (PD) algorithm, to identify highly expressed DE genes based on reproducibility evaluation of top-ranked expression differences between paired technical replicates of cells under two experimental conditions. In this study, we extended the application of the algorithm to the identification of DE genes between two types of tissue samples (biological replicates) based on several independent datasets or sub-datasets of a dataset, by constructing multiple paired average gene expression profiles for the two types of samples. Using multiple datasets for lung and esophageal cancers, we demonstrated that PD could identify many DE genes highly expressed in both cancer and normal tissues that tended to be missed by the commonly used SAM. These highly expressed DE genes, including many housekeeping genes, were significantly enriched in many conservative pathways, such as ribosome, proteasome, phagosome and TNF signaling pathways with important functional significances in oncogenesis. Nature Publishing Group 2016-10-31 /pmc/articles/PMC5086981/ /pubmed/27796338 http://dx.doi.org/10.1038/srep36227 Text en Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Article Huang, Haiyan Li, Xiangyu Guo, You Zhang, Yuncong Deng, Xusheng Chen, Lufei Zhang, Jiahui Guo, Zheng Ao, Lu Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets |
title | Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5086981/ https://www.ncbi.nlm.nih.gov/pubmed/27796338 http://dx.doi.org/10.1038/srep36227 |
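
The description above says that DE genes are identified from the reproducibility of top-ranked expression differences across multiple paired average expression profiles. The record does not give the scoring or the statistical evaluation used by the PD algorithm, so the following is only a simplified illustration of that general idea, not the authors' method. It assumes that each dataset yields one cancer/normal pair of average profiles, that genes are ranked by the absolute difference of those averages, and that a gene counts as reproducible when it falls in the top `k` of every dataset with the same direction. The function names, the cutoff `k`, and the toy gene labels and values are all hypothetical.

```python
from statistics import mean

def paired_differences(cancer, normal):
    """Average each gene within a sample type, then return cancer minus normal."""
    return {g: mean(cancer[g]) - mean(normal[g]) for g in cancer if g in normal}

def top_ranked(diffs, k):
    """Return {gene: direction} for the k genes with the largest absolute difference.

    Direction is +1 (higher in cancer), -1 (lower in cancer), or 0 (no difference).
    """
    ranked = sorted(diffs, key=lambda g: abs(diffs[g]), reverse=True)[:k]
    return {g: (diffs[g] > 0) - (diffs[g] < 0) for g in ranked}

def reproducible_genes(datasets, k):
    """Keep genes that are top-ranked in every dataset with a consistent direction.

    `datasets` is a list of (cancer, normal) pairs; each element maps a gene
    label to a list of expression values. This all-or-nothing overlap rule is
    an assumption made for illustration only.
    """
    tops = [top_ranked(paired_differences(c, n), k) for c, n in datasets]
    shared = set(tops[0])
    for t in tops[1:]:
        shared &= set(t)
    return {g: tops[0][g] for g in shared if all(t[g] == tops[0][g] for t in tops)}

# Hypothetical toy data: two datasets, four genes, two samples per type.
dataset_1 = (
    {"gene_a": [12.0, 12.4], "gene_b": [10.0, 10.2], "gene_c": [2.0, 2.1], "gene_d": [8.0, 8.2]},
    {"gene_a": [10.0, 10.2], "gene_b": [9.0, 9.0], "gene_c": [1.9, 2.0], "gene_d": [8.1, 8.1]},
)
dataset_2 = (
    {"gene_a": [11.5, 11.7], "gene_b": [8.0, 8.2], "gene_c": [2.0, 2.0], "gene_d": [7.0, 7.0]},
    {"gene_a": [9.9, 10.1], "gene_b": [9.4, 9.6], "gene_c": [2.0, 2.0], "gene_d": [7.1, 7.1]},
)

result = reproducible_genes([dataset_1, dataset_2], k=2)
print(result)
```

In this toy input, `gene_b` is top-ranked in both datasets but changes direction between them, so only `gene_a` is retained. A real analysis would need the published procedure for choosing the ranking cutoff and assessing reproducibility, neither of which is stated in this record.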