
Robust and scalable barcoding for massively parallel long-read sequencing

Nucleic-acid barcoding is an enabling technique for many applications, but its use remains limited in emerging long-read sequencing technologies with intrinsically low raw accuracy. Here, we apply so-called NS-watermark barcodes, whose error correction capability was previously validated in silico,...


Bibliographic Details
Main authors: Ezpeleta, Joaquín, Garcia Labari, Ignacio, Villanova, Gabriela Vanina, Bulacio, Pilar, Lavista-Llanos, Sofía, Posner, Victoria, Krsticevic, Flavia, Arranz, Silvia, Tapia, Elizabeth
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9090787/
https://www.ncbi.nlm.nih.gov/pubmed/35538127
http://dx.doi.org/10.1038/s41598-022-11656-0
_version_ 1784704800741392384
author Ezpeleta, Joaquín
Garcia Labari, Ignacio
Villanova, Gabriela Vanina
Bulacio, Pilar
Lavista-Llanos, Sofía
Posner, Victoria
Krsticevic, Flavia
Arranz, Silvia
Tapia, Elizabeth
collection PubMed
description Nucleic-acid barcoding is an enabling technique for many applications, but its use remains limited in emerging long-read sequencing technologies with intrinsically low raw accuracy. Here, we apply so-called NS-watermark barcodes, whose error correction capability was previously validated in silico, in a proof of concept where we synthesize 3840 NS-watermark barcodes and use them to asymmetrically tag and simultaneously sequence amplicons from two evolutionarily distant species (namely Bordetella pertussis and Drosophila mojavensis) on the ONT MinION platform. To our knowledge, this is the largest number of distinct, non-random tags ever sequenced in parallel and the first report of microarray-based synthesis as a source for large oligonucleotide pools for barcoding. We recovered the identity of more than 86% of the barcodes, with a crosstalk rate of 0.17% (i.e., one misassignment every 584 reads). This falls in the range of the index hopping rate of established, high-accuracy Illumina sequencing, despite the increased number of tags and the relatively low accuracy of both microarray-based synthesis and long-read sequencing. The robustness of NS-watermark barcodes, together with their scalable design and compatibility with low-cost massive synthesis, makes them promising for present and future sequencing applications requiring massive labeling, such as long-read single-cell RNA-Seq.
format Online
Article
Text
id pubmed-9090787
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9090787 2022-05-12 Robust and scalable barcoding for massively parallel long-read sequencing Ezpeleta, Joaquín Garcia Labari, Ignacio Villanova, Gabriela Vanina Bulacio, Pilar Lavista-Llanos, Sofía Posner, Victoria Krsticevic, Flavia Arranz, Silvia Tapia, Elizabeth Sci Rep Article
Nature Publishing Group UK 2022-05-10 /pmc/articles/PMC9090787/ /pubmed/35538127 http://dx.doi.org/10.1038/s41598-022-11656-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Robust and scalable barcoding for massively parallel long-read sequencing
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9090787/
https://www.ncbi.nlm.nih.gov/pubmed/35538127
http://dx.doi.org/10.1038/s41598-022-11656-0