Long-read transcriptome sequencing reveals abundant promoter diversity in distinct molecular subtypes of gastric cancer

Bibliographic Details
Main Authors: Huang, Kie Kyon; Huang, Jiawen; Wu, Jeanie Kar Leng; Lee, Minghui; Tay, Su Ting; Kumar, Vikrant; Ramnarayanan, Kalpana; Padmanabhan, Nisha; Xu, Chang; Tan, Angie Lay Keng; Chan, Charlene; Kappei, Dennis; Göke, Jonathan; Tan, Patrick
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7821541/
https://www.ncbi.nlm.nih.gov/pubmed/33482911
http://dx.doi.org/10.1186/s13059-021-02261-x
author Huang, Kie Kyon
Huang, Jiawen
Wu, Jeanie Kar Leng
Lee, Minghui
Tay, Su Ting
Kumar, Vikrant
Ramnarayanan, Kalpana
Padmanabhan, Nisha
Xu, Chang
Tan, Angie Lay Keng
Chan, Charlene
Kappei, Dennis
Göke, Jonathan
Tan, Patrick
collection PubMed
description BACKGROUND: Deregulated gene expression is a hallmark of cancer; however, most studies to date have analyzed short-read RNA sequencing data, which has inherent limitations. Here, we combine PacBio long-read isoform sequencing (Iso-Seq) and Illumina paired-end short-read RNA sequencing to comprehensively survey the transcriptome of gastric cancer (GC), a leading cause of global cancer mortality. RESULTS: We performed full-length transcriptome analysis across 10 GC cell lines covering four major GC molecular subtypes (chromosomal unstable, Epstein-Barr positive, genome stable and microsatellite unstable). We identify 60,239 non-redundant full-length transcripts, of which >66% are novel compared to current transcriptome databases. Novel isoforms are more likely to be cell line- and subtype-specific and to be expressed at lower levels, with a larger number of exons and longer isoform/coding sequence lengths. Most novel isoforms utilize an alternate first exon and, compared to other alternative splicing categories, are expressed at higher levels and exhibit higher variability. Collectively, we observe alternate promoter usage in 25% of detected genes, with the majority (84.2%) of known/novel promoter pairs exhibiting potential changes in their coding sequences. Mapping these alternate promoters to TCGA GC samples, we identify several cancer-associated isoforms, including novel variants of oncogenes. Tumor-specific transcript isoforms tend to alter protein coding sequences to a larger extent than other isoforms. Analysis of outcome data suggests that novel isoforms may impart additional prognostic information. CONCLUSIONS: Our results provide a rich resource of full-length transcriptome data for deeper studies of GC and other gastrointestinal malignancies. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-021-02261-x.
format Online
Article
Text
id pubmed-7821541
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7821541 2021-01-25 Genome Biol Research BioMed Central 2021-01-22 /pmc/articles/PMC7821541/ /pubmed/33482911 http://dx.doi.org/10.1186/s13059-021-02261-x Text en © The Author(s) 2021 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Long-read transcriptome sequencing reveals abundant promoter diversity in distinct molecular subtypes of gastric cancer
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7821541/
https://www.ncbi.nlm.nih.gov/pubmed/33482911
http://dx.doi.org/10.1186/s13059-021-02261-x