
Sequencing of individual barcoded cDNAs using Pacific Biosciences and Oxford Nanopore Technologies reveals platform-specific error patterns

Long-read transcriptomics require understanding error sources inherent to technologies. Current approaches cannot compare methods for an individual RNA molecule. Here, we present a novel platform-comparison method that combines barcoding strategies and long-read sequencing to sequence cDNA copies re...


Bibliographic Details
Main Authors: Mikheenko, Alla; Prjibelski, Andrey D.; Joglekar, Anoushka; Tilgner, Hagen U.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8997348/
https://www.ncbi.nlm.nih.gov/pubmed/35301264
http://dx.doi.org/10.1101/gr.276405.121
author Mikheenko, Alla
Prjibelski, Andrey D.
Joglekar, Anoushka
Tilgner, Hagen U.
author_sort Mikheenko, Alla
collection PubMed
description Long-read transcriptomics require understanding error sources inherent to technologies. Current approaches cannot compare methods for an individual RNA molecule. Here, we present a novel platform-comparison method that combines barcoding strategies and long-read sequencing to sequence cDNA copies representing an individual RNA molecule on both Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT). We compare these long-read pairs in terms of sequence content and isoform patterns. Although individual read pairs show high similarity, we find differences in (1) aligned length, (2) transcription start site (TSS), (3) polyadenylation site (poly(A)-site) assignment, and (4) exon–intron structures. Overall, 25% of read pairs disagree on either TSS, poly(A)-site, or splice site. Intron-chain disagreement typically arises from alignment errors of microexons and complicated splice sites. Our single-molecule technology comparison reveals that inconsistencies are often caused by sequencing error–induced inaccurate ONT alignments, especially to downstream GUNNGU donor motifs. However, annotation-disagreeing upstream shifts in NAGNAG acceptors in ONT are often confirmed by PacBio and are thus likely real. In both barcoded and nonbarcoded ONT reads, we find that intron number and proximity of GU/AGs better predict inconsistencies with the annotation than read quality alone. We summarize these findings in an annotation-based algorithm for spliced alignment correction that improves subsequent transcript construction with ONT reads.
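The description above mentions an annotation-based algorithm for spliced alignment correction but gives no details of it. The sketch below is not the authors' algorithm. It only illustrates the general idea of snapping a read's intron boundaries to nearby annotated splice sites. The function names `snap_to_annotation` and `correct_introns`, the `max_shift` window of 4 bases, and the forward-strand coordinate convention (intron start = donor, intron end = acceptor) are all assumptions made for illustration.

```python
from bisect import bisect_left


def snap_to_annotation(pos, annotated, max_shift=4):
    """Return the annotated coordinate nearest to pos if it lies within
    max_shift bases; otherwise return pos unchanged.

    `annotated` must be a sorted list of integer coordinates.
    `max_shift` is a hypothetical tolerance, not a value from the paper.
    """
    i = bisect_left(annotated, pos)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = annotated[max(i - 1, 0):i + 1]
    if not candidates:
        return pos
    best = min(candidates, key=lambda a: abs(a - pos))
    return best if abs(best - pos) <= max_shift else pos


def correct_introns(introns, donors, acceptors, max_shift=4):
    """Snap each (start, end) intron of a read to annotated splice sites.

    Assumes forward-strand coordinates: intron start is the donor side,
    intron end is the acceptor side. Boundaries with no annotated site
    within max_shift are left as aligned.
    """
    corrected = []
    for start, end in introns:
        new_start = snap_to_annotation(start, donors, max_shift)
        new_end = snap_to_annotation(end, acceptors, max_shift)
        # Fall back to the original intron if snapping would collapse it.
        if new_start >= new_end:
            new_start, new_end = start, end
        corrected.append((new_start, new_end))
    return corrected


if __name__ == "__main__":
    read_introns = [(102, 498), (700, 950)]
    print(correct_introns(read_introns, donors=[100, 703], acceptors=[500, 900]))
```

A fixed window like this would also overwrite small shifts that the description reports as likely real (for example, upstream shifts at NAGNAG acceptors confirmed by PacBio), so a practical correction step would need additional evidence before snapping.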
format Online
Article
Text
id pubmed-8997348
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-8997348 2022-04-22 Genome Res Method
Cold Spring Harbor Laboratory Press 2022-04 /pmc/articles/PMC8997348/ /pubmed/35301264 http://dx.doi.org/10.1101/gr.276405.121 Text en
© 2022 Mikheenko et al.; Published by Cold Spring Harbor Laboratory Press. This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title Sequencing of individual barcoded cDNAs using Pacific Biosciences and Oxford Nanopore Technologies reveals platform-specific error patterns
topic Method