
Identifying common transcriptome signatures of cancer by interpreting deep learning models

BACKGROUND: Cancer is a set of diseases characterized by unchecked cell proliferation and invasion of surrounding tissues. The many genes that have been genetically associated with cancer or shown to directly contribute to oncogenesis vary widely between tumor types, but common gene signatures that...


Bibliographic Details
Main authors: Jha, Anupama; Quesnel-Vallières, Mathieu; Wang, David; Thomas-Tikhonenko, Andrei; Lynch, Kristen W; Barash, Yoseph
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9112525/
https://www.ncbi.nlm.nih.gov/pubmed/35581644
http://dx.doi.org/10.1186/s13059-022-02681-3
author Jha, Anupama
Quesnel-Vallières, Mathieu
Wang, David
Thomas-Tikhonenko, Andrei
Lynch, Kristen W
Barash, Yoseph
author_sort Jha, Anupama
collection PubMed
description BACKGROUND: Cancer is a set of diseases characterized by unchecked cell proliferation and invasion of surrounding tissues. The many genes that have been genetically associated with cancer or shown to directly contribute to oncogenesis vary widely between tumor types, but common gene signatures that relate to core cancer pathways have also been identified. It is not clear, however, whether there exist additional sets of genes or transcriptomic features that are less well known in cancer biology but that are also commonly deregulated across several cancer types. RESULTS: Here, we agnostically identify transcriptomic features that are commonly shared between cancer types using 13,461 RNA-seq samples from 19 normal tissue types and 18 solid tumor types to train three feed-forward neural networks, based either on protein-coding gene expression, lncRNA expression, or splice junction use, to distinguish between normal and tumor samples. All three models recognize transcriptome signatures that are consistent across tumors. Analysis of attribution values extracted from our models reveals that genes that are commonly altered in cancer by expression or splicing variations are under strong evolutionary and selective constraints. Importantly, we find that genes composing our cancer transcriptome signatures are not frequently affected by mutations or genomic alterations and that their functions differ widely from the genes genetically associated with cancer. CONCLUSIONS: Our results highlighted that deregulation of RNA-processing genes and aberrant splicing are pervasive features on which core cancer pathways might converge across a large array of solid tumor types. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s13059-022-02681-3).
format Online
Article
Text
id pubmed-9112525
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9112525 2022-05-18 Identifying common transcriptome signatures of cancer by interpreting deep learning models Jha, Anupama Quesnel-Vallières, Mathieu Wang, David Thomas-Tikhonenko, Andrei Lynch, Kristen W Barash, Yoseph Genome Biol Research BACKGROUND: Cancer is a set of diseases characterized by unchecked cell proliferation and invasion of surrounding tissues. The many genes that have been genetically associated with cancer or shown to directly contribute to oncogenesis vary widely between tumor types, but common gene signatures that relate to core cancer pathways have also been identified. It is not clear, however, whether there exist additional sets of genes or transcriptomic features that are less well known in cancer biology but that are also commonly deregulated across several cancer types. RESULTS: Here, we agnostically identify transcriptomic features that are commonly shared between cancer types using 13,461 RNA-seq samples from 19 normal tissue types and 18 solid tumor types to train three feed-forward neural networks, based either on protein-coding gene expression, lncRNA expression, or splice junction use, to distinguish between normal and tumor samples. All three models recognize transcriptome signatures that are consistent across tumors. Analysis of attribution values extracted from our models reveals that genes that are commonly altered in cancer by expression or splicing variations are under strong evolutionary and selective constraints. Importantly, we find that genes composing our cancer transcriptome signatures are not frequently affected by mutations or genomic alterations and that their functions differ widely from the genes genetically associated with cancer. CONCLUSIONS: Our results highlighted that deregulation of RNA-processing genes and aberrant splicing are pervasive features on which core cancer pathways might converge across a large array of solid tumor types. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s13059-022-02681-3). BioMed Central 2022-05-17 /pmc/articles/PMC9112525/ /pubmed/35581644 http://dx.doi.org/10.1186/s13059-022-02681-3 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Identifying common transcriptome signatures of cancer by interpreting deep learning models
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9112525/
https://www.ncbi.nlm.nih.gov/pubmed/35581644
http://dx.doi.org/10.1186/s13059-022-02681-3