Cargando…

MTG-Link: leveraging barcode information from linked-reads to assemble specific loci

BACKGROUND: Local assembly with short and long reads has proven to be very useful in many applications: reconstruction of the sequence of a locus of interest, gap-filling in draft assemblies, as well as alternative allele reconstruction of large Structural Variants. Whereas linked-read technologies...

Descripción completa

Detalles Bibliográficos
Autores principales: Guichard, Anne, Legeai, Fabrice, Tagu, Denis, Lemaitre, Claire
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10347852/
https://www.ncbi.nlm.nih.gov/pubmed/37452278
http://dx.doi.org/10.1186/s12859-023-05395-w
Descripción
Sumario:BACKGROUND: Local assembly with short and long reads has proven to be very useful in many applications: reconstruction of the sequence of a locus of interest, gap-filling in draft assemblies, as well as alternative allele reconstruction of large Structural Variants. Whereas linked-read technologies have a great potential to assemble specific loci as they provide long-range information while maintaining the power and accuracy of short-read sequencing, there is a lack of local assembly tools for linked-read data. RESULTS: We present MTG-Link, a novel local assembly tool dedicated to linked-reads. The originality of the method lies in its read subsampling step which takes advantage of the barcode information contained in linked-reads mapped in flanking regions. We validated our approach on several datasets from different linked-read technologies. We show that MTG-Link is able to assemble successfully large sequences, up to dozens of Kb. We also demonstrate that the read subsampling step of MTG-Link considerably improves the local assembly of specific loci compared to other existing short-read local assembly tools. Furthermore, MTG-Link was able to fully characterize large insertion variants and deletion breakpoints in a human genome and to reconstruct dark regions in clinically-relevant human genes. It also improved the contiguity of a 1.3 Mb locus of biological interest in several individual genomes of the mimetic butterfly Heliconius numata. CONCLUSIONS: MTG-Link is an efficient local assembly tool designed for different linked-read sequencing technologies. MTG-Link source code is available at https://github.com/anne-gcd/MTG-Link and as a Bioconda package. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05395-w.