UNAGI: an automated pipeline for nanopore full-length cDNA sequencing uncovers novel transcripts and isoforms in yeast
Sequencing the entire RNA molecule leads to a better understanding of the transcriptome architecture. SMARTer (Switching Mechanism at 5′-End of RNA Template) is a technology aimed at generating full-length cDNA from low amounts of mRNA for sequencing by short-read sequencers such as those from Illum...
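Main authors: | Al kadi, Mohamad; Jung, Nicolas; Ito, Shingo; Kameoka, Shoichiro; Hishida, Takashi; Motooka, Daisuke; Nakamura, Shota; Iida, Tetsuya; Okuzaki, Daisuke |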
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Berlin Heidelberg, 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7283198/ https://www.ncbi.nlm.nih.gov/pubmed/31955296 http://dx.doi.org/10.1007/s10142-020-00732-1 |
_version_ | 1783544251472150528 |
---|---|
author | Al kadi, Mohamad Jung, Nicolas Ito, Shingo Kameoka, Shoichiro Hishida, Takashi Motooka, Daisuke Nakamura, Shota Iida, Tetsuya Okuzaki, Daisuke |
author_facet | Al kadi, Mohamad Jung, Nicolas Ito, Shingo Kameoka, Shoichiro Hishida, Takashi Motooka, Daisuke Nakamura, Shota Iida, Tetsuya Okuzaki, Daisuke |
author_sort | Al kadi, Mohamad |
collection | PubMed |
description | Sequencing the entire RNA molecule leads to a better understanding of the transcriptome architecture. SMARTer (Switching Mechanism at 5′-End of RNA Template) is a technology aimed at generating full-length cDNA from low amounts of mRNA for sequencing by short-read sequencers such as those from Illumina. However, short-read sequencing such as Illumina technology includes fragmentation that results in bias and information loss. Here, we built a pipeline, UNAGI or UNAnnotated Gene Identifier, to process long reads obtained with nanopore sequencing and compared this pipeline with the standard Illumina pipeline by studying the Saccharomyces cerevisiae transcriptome in full-length cDNA samples generated from two different biological samples: haploid and diploid cells. Additionally, we processed the long reads with another long-read tool, FLAIR. Our strand-aware method revealed significant differential gene expression that was masked in Illumina data by antisense transcripts. Our pipeline, UNAGI, outperformed the Illumina pipeline and FLAIR in transcript reconstruction (sensitivity and specificity of 80% and 40% vs. 18% and 34% for the Illumina pipeline and 79% and 32% for FLAIR). Moreover, UNAGI discovered 3877 unannotated transcripts, including 1282 intergenic transcripts, while the Illumina pipeline discovered only 238 unannotated transcripts. For isoform profiling, UNAGI also outperformed the Illumina pipeline and FLAIR in terms of sensitivity (91% vs. 82% and 63%, respectively). However, the low accuracy of nanopore sequencing led to a narrower gap in specificity with the Illumina pipeline (70% vs. 63%) and to a huge gap with FLAIR (70% vs. 0.02%). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s10142-020-00732-1) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-7283198 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Springer Berlin Heidelberg |
record_format | MEDLINE/PubMed |
spelling | pubmed-72831982020-06-15 UNAGI: an automated pipeline for nanopore full-length cDNA sequencing uncovers novel transcripts and isoforms in yeast Al kadi, Mohamad Jung, Nicolas Ito, Shingo Kameoka, Shoichiro Hishida, Takashi Motooka, Daisuke Nakamura, Shota Iida, Tetsuya Okuzaki, Daisuke Funct Integr Genomics Original Article Sequencing the entire RNA molecule leads to a better understanding of the transcriptome architecture. SMARTer (Switching Mechanism at 5′-End of RNA Template) is a technology aimed at generating full-length cDNA from low amounts of mRNA for sequencing by short-read sequencers such as those from Illumina. However, short read sequencing such as Illumina technology includes fragmentation that results in bias and information loss. Here, we built a pipeline, UNAGI or UNAnnotated Gene Identifier, to process long reads obtained with nanopore sequencing and compared this pipeline with the standard Illumina pipeline by studying the Saccharomyces cerevisiae transcriptome in full-length cDNA samples generated from two different biological samples: haploid and diploid cells. Additionally, we processed the long reads with another long read tool, FLAIR. Our strand-aware method revealed significant differential gene expression that was masked in Illumina data by antisense transcripts. Our pipeline, UNAGI, outperformed the Illumina pipeline and FLAIR in transcript reconstruction (sensitivity and specificity of 80% and 40% vs. 18% and 34% and 79% and 32%, respectively). Moreover, UNAGI discovered 3877 unannotated transcripts including 1282 intergenic transcripts while the Illumina pipeline discovered only 238 unannotated transcripts. For isoforms profiling, UNAGI also outperformed the Illumina pipeline and FLAIR in terms of sensitivity (91% vs. 82% and 63%, respectively). But the low accuracy of nanopore sequencing led to a closer gap in terms of specificity with Illumina pipeline (70% vs. 63%) and to a huge gap with FLAIR (70% vs 0.02%). 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s10142-020-00732-1) contains supplementary material, which is available to authorized users. Springer Berlin Heidelberg 2020-01-18 2020 /pmc/articles/PMC7283198/ /pubmed/31955296 http://dx.doi.org/10.1007/s10142-020-00732-1 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Original Article Al kadi, Mohamad Jung, Nicolas Ito, Shingo Kameoka, Shoichiro Hishida, Takashi Motooka, Daisuke Nakamura, Shota Iida, Tetsuya Okuzaki, Daisuke UNAGI: an automated pipeline for nanopore full-length cDNA sequencing uncovers novel transcripts and isoforms in yeast |
title | UNAGI: an automated pipeline for nanopore full-length cDNA sequencing uncovers novel transcripts and isoforms in yeast |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7283198/ https://www.ncbi.nlm.nih.gov/pubmed/31955296 http://dx.doi.org/10.1007/s10142-020-00732-1 |