
LRez: a C++ API and toolkit for analyzing and managing Linked-Reads data

MOTIVATION: Linked-Reads technologies combine both the high quality and low cost of short-reads sequencing and long-range information, through the use of barcodes tagging reads which originate from a common long DNA molecule. This technology has been employed in a broad range of applications includi...


Bibliographic Details
Main Authors: Morisse, Pierre; Lemaitre, Claire; Legeai, Fabrice
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710615/
https://www.ncbi.nlm.nih.gov/pubmed/36700107
http://dx.doi.org/10.1093/bioadv/vbab022
author Morisse, Pierre
Lemaitre, Claire
Legeai, Fabrice
collection PubMed
description MOTIVATION: Linked-Reads technologies combine the high quality and low cost of short-read sequencing with long-range information, through the use of barcodes that tag reads originating from a common long DNA molecule. This technology has been employed in a broad range of applications including genome assembly, phasing and scaffolding, as well as structural variant calling. However, to date, no tool or API dedicated to the manipulation of Linked-Reads data exists. RESULTS: We introduce LRez, a C++ API and toolkit that allows easy management of Linked-Reads data. LRez includes functionalities for computing numbers of common barcodes between genomic regions, extracting barcodes from BAM files, and indexing and querying BAM, FASTQ and gzipped FASTQ files to quickly fetch all reads or alignments containing a given barcode. LRez is compatible with a wide range of Linked-Reads sequencing technologies, and can thus be used in any tool or pipeline requiring barcode processing or indexing, in order to improve its performance. AVAILABILITY AND IMPLEMENTATION: LRez is implemented in C++, supported on Unix-based platforms and available under the AGPL-3.0 License at https://github.com/morispi/LRez, and as a bioconda module. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
format Online
Article
Text
id pubmed-9710615
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9710615 2023-01-24 LRez: a C++ API and toolkit for analyzing and managing Linked-Reads data. Morisse, Pierre; Lemaitre, Claire; Legeai, Fabrice. Bioinform Adv. Application Note. Oxford University Press 2021-09-25 /pmc/articles/PMC9710615/ /pubmed/36700107 http://dx.doi.org/10.1093/bioadv/vbab022 Text en © The Author(s) 2021. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title LRez: a C++ API and toolkit for analyzing and managing Linked-Reads data
topic Application Note