LRez: a C++ API and toolkit for analyzing and managing Linked-Reads data
MOTIVATION: Linked-Reads technologies combine both the high quality and low cost of short-reads sequencing and long-range information, through the use of barcodes tagging reads which originate from a common long DNA molecule. This technology has been employed in a broad range of applications includi...
Main authors: | Morisse, Pierre; Lemaitre, Claire; Legeai, Fabrice |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Application Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710615/ https://www.ncbi.nlm.nih.gov/pubmed/36700107 http://dx.doi.org/10.1093/bioadv/vbab022 |
_version_ | 1784841404603695104 |
---|---|
author | Morisse, Pierre Lemaitre, Claire Legeai, Fabrice |
author_facet | Morisse, Pierre Lemaitre, Claire Legeai, Fabrice |
author_sort | Morisse, Pierre |
collection | PubMed |
description | MOTIVATION: Linked-Reads technologies combine both the high quality and low cost of short-reads sequencing and long-range information, through the use of barcodes tagging reads which originate from a common long DNA molecule. This technology has been employed in a broad range of applications including genome assembly, phasing and scaffolding, as well as structural variant calling. However, to date, no tool or API dedicated to the manipulation of Linked-Reads data exist. RESULTS: We introduce LRez, a C++ API and toolkit that allows easy management of Linked-Reads data. LRez includes various functionalities, for computing numbers of common barcodes between genomic regions, extracting barcodes from BAM files, as well as indexing and querying BAM, FASTQ and gzipped FASTQ files to quickly fetch all reads or alignments containing a given barcode. LRez is compatible with a wide range of Linked-Reads sequencing technologies, and can thus be used in any tool or pipeline requiring barcode processing or indexing, in order to improve their performances. AVAILABILITY AND IMPLEMENTATION: LRez is implemented in C++, supported on Unix-based platforms and available under AGPL-3.0 License at https://github.com/morispi/LRez, and as a bioconda module. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. |
format | Online Article Text |
id | pubmed-9710615 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-97106152023-01-24 LRez: a C++ API and toolkit for analyzing and managing Linked-Reads data Morisse, Pierre Lemaitre, Claire Legeai, Fabrice Bioinform Adv Application Note MOTIVATION: Linked-Reads technologies combine both the high quality and low cost of short-reads sequencing and long-range information, through the use of barcodes tagging reads which originate from a common long DNA molecule. This technology has been employed in a broad range of applications including genome assembly, phasing and scaffolding, as well as structural variant calling. However, to date, no tool or API dedicated to the manipulation of Linked-Reads data exist. RESULTS: We introduce LRez, a C++ API and toolkit that allows easy management of Linked-Reads data. LRez includes various functionalities, for computing numbers of common barcodes between genomic regions, extracting barcodes from BAM files, as well as indexing and querying BAM, FASTQ and gzipped FASTQ files to quickly fetch all reads or alignments containing a given barcode. LRez is compatible with a wide range of Linked-Reads sequencing technologies, and can thus be used in any tool or pipeline requiring barcode processing or indexing, in order to improve their performances. AVAILABILITY AND IMPLEMENTATION: LRez is implemented in C++, supported on Unix-based platforms and available under AGPL-3.0 License at https://github.com/morispi/LRez, and as a bioconda module. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. Oxford University Press 2021-09-25 /pmc/articles/PMC9710615/ /pubmed/36700107 http://dx.doi.org/10.1093/bioadv/vbab022 Text en © The Author(s) 2021. Published by Oxford University Press. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Application Note Morisse, Pierre Lemaitre, Claire Legeai, Fabrice LRez: a C++ API and toolkit for analyzing and managing Linked-Reads data |
title | LRez: a C++ API and toolkit for analyzing and managing Linked-Reads data |
title_full | LRez: a C++ API and toolkit for analyzing and managing Linked-Reads data |
title_fullStr | LRez: a C++ API and toolkit for analyzing and managing Linked-Reads data |
title_full_unstemmed | LRez: a C++ API and toolkit for analyzing and managing Linked-Reads data |
title_short | LRez: a C++ API and toolkit for analyzing and managing Linked-Reads data |
title_sort | lrez: a c++ api and toolkit for analyzing and managing linked-reads data |
topic | Application Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710615/ https://www.ncbi.nlm.nih.gov/pubmed/36700107 http://dx.doi.org/10.1093/bioadv/vbab022 |
work_keys_str_mv | AT morissepierre lrezacapiandtoolkitforanalyzingandmanaginglinkedreadsdata AT lemaitreclaire lrezacapiandtoolkitforanalyzingandmanaginglinkedreadsdata AT legeaifabrice lrezacapiandtoolkitforanalyzingandmanaginglinkedreadsdata |