
Complete genomic and transcriptional landscape analysis using third-generation sequencing: a case study of Saccharomyces cerevisiae CEN.PK113-7D


Bibliographic Details
Main Authors: Jenjaroenpun, Piroon, Wongsurawat, Thidathip, Pereira, Rui, Patumcharoenpol, Preecha, Ussery, David W, Nielsen, Jens, Nookaew, Intawat
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5909453/
https://www.ncbi.nlm.nih.gov/pubmed/29346625
http://dx.doi.org/10.1093/nar/gky014
author Jenjaroenpun, Piroon
Wongsurawat, Thidathip
Pereira, Rui
Patumcharoenpol, Preecha
Ussery, David W
Nielsen, Jens
Nookaew, Intawat
collection PubMed
description Completing eukaryal genomes can be a difficult task because of the highly repetitive sequences along the chromosomes and the short read lengths of second-generation sequencing. Saccharomyces cerevisiae strain CEN.PK113-7D, widely used as a model organism and a cell factory, was selected for this study to demonstrate the superior capability of very long sequence reads for de novo genome assembly. We generated long reads using two common third-generation sequencing technologies, Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio), and used short reads obtained by Illumina sequencing for error correction. Assembly of the reads derived from all three technologies resulted in complete sequences for all 16 yeast chromosomes, as well as the mitochondrial chromosome, in one step. Further, we identified three types of DNA methylation (5mC, 4mC and 6mA). Comparison between the reference strain S288C and strain CEN.PK113-7D identified chromosomal rearrangements against a background of similar gene content between the two strains. We identified full-length transcripts through ONT direct RNA sequencing technology. This allows the identification of transcriptional landscapes, including the 5′ and 3′ untranslated regions (UTRs), as well as the quantification of differential gene expression. About 91% of the predicted transcripts could be consistently detected across biological replicates grown on either glucose or ethanol. Direct RNA sequencing identified many polyadenylated non-coding RNAs, rRNAs, telomere-RNA, long non-coding RNA and antisense RNA. This work demonstrates a strategy to obtain complete genome sequences and transcriptional landscapes that can be applied to other eukaryal organisms.
format Online
Article
Text
id pubmed-5909453
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5909453 2018-04-24 Complete genomic and transcriptional landscape analysis using third-generation sequencing: a case study of Saccharomyces cerevisiae CEN.PK113-7D Jenjaroenpun, Piroon Wongsurawat, Thidathip Pereira, Rui Patumcharoenpol, Preecha Ussery, David W Nielsen, Jens Nookaew, Intawat Nucleic Acids Res Methods Online Oxford University Press 2018-04-20 2018-01-13 /pmc/articles/PMC5909453/ /pubmed/29346625 http://dx.doi.org/10.1093/nar/gky014 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Complete genomic and transcriptional landscape analysis using third-generation sequencing: a case study of Saccharomyces cerevisiae CEN.PK113-7D
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5909453/
https://www.ncbi.nlm.nih.gov/pubmed/29346625
http://dx.doi.org/10.1093/nar/gky014