Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets

Identifying differentially expressed (DE) genes between cancer and normal tissues is of basic importance for studying cancer mechanisms. However, current methods, such as the commonly used Significance Analysis of Microarrays (SAM), are biased toward genes with low expression levels. Recently, we propos...

Bibliographic Details
Main authors: Huang, Haiyan, Li, Xiangyu, Guo, You, Zhang, Yuncong, Deng, Xusheng, Chen, Lufei, Zhang, Jiahui, Guo, Zheng, Ao, Lu
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5086981/
https://www.ncbi.nlm.nih.gov/pubmed/27796338
http://dx.doi.org/10.1038/srep36227
_version_ 1782463846322012160
author Huang, Haiyan
Li, Xiangyu
Guo, You
Zhang, Yuncong
Deng, Xusheng
Chen, Lufei
Zhang, Jiahui
Guo, Zheng
Ao, Lu
author_facet Huang, Haiyan
Li, Xiangyu
Guo, You
Zhang, Yuncong
Deng, Xusheng
Chen, Lufei
Zhang, Jiahui
Guo, Zheng
Ao, Lu
author_sort Huang, Haiyan
collection PubMed
description Identifying differentially expressed (DE) genes between cancer and normal tissues is of basic importance for studying cancer mechanisms. However, current methods, such as the commonly used Significance Analysis of Microarrays (SAM), are biased toward genes with low expression levels. Recently, we proposed an algorithm, named the pairwise difference (PD) algorithm, to identify highly expressed DE genes based on reproducibility evaluation of top-ranked expression differences between paired technical replicates of cells under two experimental conditions. In this study, we extended the application of the algorithm to the identification of DE genes between two types of tissue samples (biological replicates) based on several independent datasets or sub-datasets of a dataset, by constructing multiple paired average gene expression profiles for the two types of samples. Using multiple datasets for lung and esophageal cancers, we demonstrated that PD could identify many DE genes highly expressed in both cancer and normal tissues that tended to be missed by the commonly used SAM. These highly expressed DE genes, including many housekeeping genes, were significantly enriched in many conserved pathways, such as the ribosome, proteasome, phagosome and TNF signaling pathways, which have important functional significance in oncogenesis.
format Online
Article
Text
id pubmed-5086981
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-50869812016-11-04 Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets Huang, Haiyan Li, Xiangyu Guo, You Zhang, Yuncong Deng, Xusheng Chen, Lufei Zhang, Jiahui Guo, Zheng Ao, Lu Sci Rep Article Identifying differentially expressed (DE) genes between cancer and normal tissues is of basic importance for studying cancer mechanisms. However, current methods, such as the commonly used Significance Analysis of Microarrays (SAM), are biased to genes with low expression levels. Recently, we proposed an algorithm, named the pairwise difference (PD) algorithm, to identify highly expressed DE genes based on reproducibility evaluation of top-ranked expression differences between paired technical replicates of cells under two experimental conditions. In this study, we extended the application of the algorithm to the identification of DE genes between two types of tissue samples (biological replicates) based on several independent datasets or sub-datasets of a dataset, by constructing multiple paired average gene expression profiles for the two types of samples. Using multiple datasets for lung and esophageal cancers, we demonstrated that PD could identify many DE genes highly expressed in both cancer and normal tissues that tended to be missed by the commonly used SAM. These highly expressed DE genes, including many housekeeping genes, were significantly enriched in many conservative pathways, such as ribosome, proteasome, phagosome and TNF signaling pathways with important functional significances in oncogenesis. Nature Publishing Group 2016-10-31 /pmc/articles/PMC5086981/ /pubmed/27796338 http://dx.doi.org/10.1038/srep36227 Text en Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
spellingShingle Article
Huang, Haiyan
Li, Xiangyu
Guo, You
Zhang, Yuncong
Deng, Xusheng
Chen, Lufei
Zhang, Jiahui
Guo, Zheng
Ao, Lu
Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets
title Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets
title_full Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets
title_fullStr Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets
title_full_unstemmed Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets
title_short Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets
title_sort identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5086981/
https://www.ncbi.nlm.nih.gov/pubmed/27796338
http://dx.doi.org/10.1038/srep36227