
Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification

One of the goals of cancer research is to identify a set of genes that cause or control disease progression. However, although multiple such gene sets were published, these are usually in very poor agreement with each other, and very few of the genes proved to be functional therapeutic targets. Furt...


Bibliographic Details
Main Author: Shimoni, Yishai
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5839591/
https://www.ncbi.nlm.nih.gov/pubmed/29470520
http://dx.doi.org/10.1371/journal.pcbi.1006026
_version_ 1783304438351396864
author Shimoni, Yishai
author_facet Shimoni, Yishai
author_sort Shimoni, Yishai
collection PubMed
description One of the goals of cancer research is to identify a set of genes that cause or control disease progression. However, although multiple such gene sets have been published, they are usually in very poor agreement with each other, and very few of the genes have proved to be functional therapeutic targets. Furthermore, recent findings from a breast cancer gene-expression cohort showed that sets of genes selected randomly can be used to predict survival with a much higher probability than expected. These results imply that many of the genes identified in breast cancer gene expression analysis may not be causal of cancer progression, even though they can still be highly predictive of prognosis. We performed a similar analysis on all the cancer types available in The Cancer Genome Atlas (TCGA), namely, estimating the predictive power of random gene sets for survival. Our work shows that most cancer types exhibit the property that random selections of genes are more predictive of survival than expected. In contrast to previous work, this property is not removed by using a proliferation signature, which implies that proliferation may not always be the confounder that drives this property. We suggest one possible solution in the form of data-driven sub-classification to reduce this property significantly. Our results suggest that the predictive power of random gene sets may be used to identify the existence of sub-classes in the data, and thus may allow better understanding of patient stratification. Furthermore, by reducing the observed bias, this may allow more direct identification of biologically relevant, and potentially causal, genes.
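The idea of "estimating the predictive power of random gene sets for survival" can be illustrated with a minimal sketch. This record does not state the paper's exact procedure, so everything below is an assumption made for illustration: the gene-set score is the mean expression of the chosen genes, patients are split at the median score, the two groups are compared with a log-rank test, and the function names `logrank_p` and `random_set_hit_rate` are hypothetical.

```python
import math
import random


def logrank_p(times, events, groups):
    """Two-group log-rank test; returns the chi-square (1 df) p-value.

    times  : follow-up time per patient
    events : 1/True if the event (death) was observed, 0/False if censored
    groups : True/False group label per patient
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    o_minus_e = 0.0  # observed minus expected events in the True group
    var = 0.0        # hypergeometric variance summed over event times
    for t in event_times:
        n = sum(1 for x in times if x >= t)  # at risk overall
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g)
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e and g)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0  # no information to separate the groups
    chi2 = o_minus_e ** 2 / var
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def random_set_hit_rate(expr, times, events, set_size, n_sets,
                        alpha=0.05, seed=0):
    """Fraction of random gene sets whose median split has log-rank p < alpha.

    expr : dict mapping gene name -> list of per-patient expression values
    (assumed layout, chosen only to keep the sketch dependency-free).
    """
    rng = random.Random(seed)
    genes = list(expr)
    n_patients = len(times)
    hits = 0
    for _ in range(n_sets):
        chosen = rng.sample(genes, set_size)
        # Assumed score: mean expression of the chosen genes per patient.
        score = [sum(expr[g][i] for g in chosen) / set_size
                 for i in range(n_patients)]
        cutoff = sorted(score)[n_patients // 2]
        groups = [s >= cutoff for s in score]
        if logrank_p(times, events, groups) < alpha:
            hits += 1
    return hits / n_sets
```

If gene expression carried no hidden structure related to survival, the returned fraction would be expected to stay close to `alpha`; a markedly larger fraction is the kind of excess the abstract describes.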
format Online
Article
Text
id pubmed-5839591
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5839591 2018-03-23 Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification Shimoni, Yishai PLoS Comput Biol Research Article
Public Library of Science 2018-02-22 /pmc/articles/PMC5839591/ /pubmed/29470520 http://dx.doi.org/10.1371/journal.pcbi.1006026 Text en © 2018 Yishai Shimoni http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Shimoni, Yishai
Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification
title Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5839591/
https://www.ncbi.nlm.nih.gov/pubmed/29470520
http://dx.doi.org/10.1371/journal.pcbi.1006026