
UNAGI: an automated pipeline for nanopore full-length cDNA sequencing uncovers novel transcripts and isoforms in yeast

Sequencing the entire RNA molecule leads to a better understanding of the transcriptome architecture. SMARTer (Switching Mechanism at 5′-End of RNA Template) is a technology aimed at generating full-length cDNA from low amounts of mRNA for sequencing by short-read sequencers such as those from Illum...


Bibliographic Details
Main authors: Al kadi, Mohamad, Jung, Nicolas, Ito, Shingo, Kameoka, Shoichiro, Hishida, Takashi, Motooka, Daisuke, Nakamura, Shota, Iida, Tetsuya, Okuzaki, Daisuke
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7283198/
https://www.ncbi.nlm.nih.gov/pubmed/31955296
http://dx.doi.org/10.1007/s10142-020-00732-1
_version_ 1783544251472150528
author Al kadi, Mohamad
Jung, Nicolas
Ito, Shingo
Kameoka, Shoichiro
Hishida, Takashi
Motooka, Daisuke
Nakamura, Shota
Iida, Tetsuya
Okuzaki, Daisuke
author_sort Al kadi, Mohamad
collection PubMed
description Sequencing the entire RNA molecule leads to a better understanding of the transcriptome architecture. SMARTer (Switching Mechanism at 5′-End of RNA Template) is a technology aimed at generating full-length cDNA from low amounts of mRNA for sequencing by short-read sequencers such as those from Illumina. However, short-read sequencing such as Illumina technology includes fragmentation that results in bias and information loss. Here, we built a pipeline, UNAGI or UNAnnotated Gene Identifier, to process long reads obtained with nanopore sequencing and compared this pipeline with the standard Illumina pipeline by studying the Saccharomyces cerevisiae transcriptome in full-length cDNA samples generated from two different biological samples: haploid and diploid cells. Additionally, we processed the long reads with another long-read tool, FLAIR. Our strand-aware method revealed significant differential gene expression that was masked in Illumina data by antisense transcripts. Our pipeline, UNAGI, outperformed the Illumina pipeline and FLAIR in transcript reconstruction (sensitivity and specificity of 80% and 40% for UNAGI vs. 18% and 34% for the Illumina pipeline and 79% and 32% for FLAIR). Moreover, UNAGI discovered 3877 unannotated transcripts, including 1282 intergenic transcripts, while the Illumina pipeline discovered only 238 unannotated transcripts. For isoform profiling, UNAGI also outperformed the Illumina pipeline and FLAIR in terms of sensitivity (91% vs. 82% and 63%, respectively). However, the low accuracy of nanopore sequencing narrowed the gap in specificity with the Illumina pipeline (70% vs. 63%), while the gap with FLAIR remained very large (70% vs. 0.02%). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s10142-020-00732-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-7283198
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Springer Berlin Heidelberg
record_format MEDLINE/PubMed
spelling pubmed-7283198 2020-06-15 Funct Integr Genomics Original Article Springer Berlin Heidelberg 2020-01-18 2020 /pmc/articles/PMC7283198/ /pubmed/31955296 http://dx.doi.org/10.1007/s10142-020-00732-1 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title UNAGI: an automated pipeline for nanopore full-length cDNA sequencing uncovers novel transcripts and isoforms in yeast
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7283198/
https://www.ncbi.nlm.nih.gov/pubmed/31955296
http://dx.doi.org/10.1007/s10142-020-00732-1