QuickDeconvolution: fast and scalable deconvolution of linked-read sequencing data

Bibliographic Details
Main authors: Faure, Roland; Lavenier, Dominique
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710601/
https://www.ncbi.nlm.nih.gov/pubmed/36699389
http://dx.doi.org/10.1093/bioadv/vbac068
author Faure, Roland
Lavenier, Dominique
collection PubMed
description MOTIVATION: Recently introduced, linked-read technologies, such as the 10× chromium system, use microfluidics to tag multiple short reads from the same long fragment (50–200 kb) with a small sequence, called a barcode. They are inexpensive and easy to prepare, combining the accuracy of short-read sequencing with the long-range information of barcodes. The same barcode can be used for several different fragments, which complicates the analyses. RESULTS: We present QuickDeconvolution (QD), a new software for deconvolving a set of reads sharing a barcode, i.e. separating the reads from the different fragments. QD only takes sequencing data as input, without the need for a reference genome. We show that QD outperforms existing software in terms of accuracy, speed and scalability, making it capable of deconvolving previously inaccessible data sets. In particular, we demonstrate here the first example in the literature of a successfully deconvoluted animal sequencing dataset, a 33-Gb Drosophila melanogaster dataset. We show that the taxonomic assignment of linked reads can be improved by deconvoluting reads with QD before taxonomic classification. AVAILABILITY AND IMPLEMENTATION: Code and instructions are available on https://github.com/RolandFaure/QuickDeconvolution. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
format Online
Article
Text
id pubmed-9710601
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9710601 2023-01-24 QuickDeconvolution: fast and scalable deconvolution of linked-read sequencing data Faure, Roland Lavenier, Dominique Bioinform Adv Original Paper Oxford University Press 2022-09-26 /pmc/articles/PMC9710601/ /pubmed/36699389 http://dx.doi.org/10.1093/bioadv/vbac068 Text en © The Author(s) 2022. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title QuickDeconvolution: fast and scalable deconvolution of linked-read sequencing data
topic Original Paper