Vulcan: Improved long-read mapping and structural variant calling via dual-mode alignment

BACKGROUND: Long-read sequencing has enabled unprecedented surveys of structural variation across the entire human genome. To maximize the potential of long-read sequencing in this context, novel mapping methods have emerged that have primarily focused on either speed or accuracy. Various heuristics...

Bibliographic Details
Main Authors:	Fu, Yilei, Mahmoud, Medhat, Muraliraman, Viginesh Vaibhav, Sedlazeck, Fritz J, Treangen, Todd J
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2021
Subjects:	Technical Note
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8463296/ https://www.ncbi.nlm.nih.gov/pubmed/34561697 http://dx.doi.org/10.1093/gigascience/giab063

_version_	1784572372406239232
author	Fu, Yilei Mahmoud, Medhat Muraliraman, Viginesh Vaibhav Sedlazeck, Fritz J Treangen, Todd J
author_facet	Fu, Yilei Mahmoud, Medhat Muraliraman, Viginesh Vaibhav Sedlazeck, Fritz J Treangen, Todd J
author_sort	Fu, Yilei
collection	PubMed
description	BACKGROUND: Long-read sequencing has enabled unprecedented surveys of structural variation across the entire human genome. To maximize the potential of long-read sequencing in this context, novel mapping methods have emerged that have primarily focused on either speed or accuracy. Various heuristics and scoring schemas have been implemented in widely used read mappers (minimap2 and NGMLR) to optimize for speed or accuracy, which have variable performance across different genomic regions and for specific structural variants. Our hypothesis is that constraining read mapping to the use of a single gap penalty across distinct mutational hot spots reduces read alignment accuracy and impedes structural variant detection. FINDINGS: We tested our hypothesis by implementing a read-mapping pipeline called Vulcan that uses two distinct gap penalty modes, which we refer to as dual-mode alignment. The high-level idea is that Vulcan leverages the computed normalized edit distance of the mapped reads via minimap2 to identify poorly aligned reads and realigns them using the more accurate yet computationally more expensive long-read mapper (NGMLR). In support of our hypothesis, we show that Vulcan improves the alignments for Oxford Nanopore Technology long reads for both simulated and real datasets. These improvements, in turn, lead to improved accuracy for structural variant calling performance on human genome datasets compared to either of the read-mapping methods alone. CONCLUSIONS: Vulcan is the first long-read mapping framework that combines two distinct gap penalty modes for improved structural variant recall and precision. Vulcan is open-source and available under the MIT License at https://gitlab.com/treangenlab/vulcan.
format	Online Article Text
id	pubmed-8463296
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-8463296 2021-09-27 Vulcan: Improved long-read mapping and structural variant calling via dual-mode alignment Fu, Yilei Mahmoud, Medhat Muraliraman, Viginesh Vaibhav Sedlazeck, Fritz J Treangen, Todd J Gigascience Technical Note BACKGROUND: Long-read sequencing has enabled unprecedented surveys of structural variation across the entire human genome. To maximize the potential of long-read sequencing in this context, novel mapping methods have emerged that have primarily focused on either speed or accuracy. Various heuristics and scoring schemas have been implemented in widely used read mappers (minimap2 and NGMLR) to optimize for speed or accuracy, which have variable performance across different genomic regions and for specific structural variants. Our hypothesis is that constraining read mapping to the use of a single gap penalty across distinct mutational hot spots reduces read alignment accuracy and impedes structural variant detection. FINDINGS: We tested our hypothesis by implementing a read-mapping pipeline called Vulcan that uses two distinct gap penalty modes, which we refer to as dual-mode alignment. The high-level idea is that Vulcan leverages the computed normalized edit distance of the mapped reads via minimap2 to identify poorly aligned reads and realigns them using the more accurate yet computationally more expensive long-read mapper (NGMLR). In support of our hypothesis, we show that Vulcan improves the alignments for Oxford Nanopore Technology long reads for both simulated and real datasets. These improvements, in turn, lead to improved accuracy for structural variant calling performance on human genome datasets compared to either of the read-mapping methods alone. CONCLUSIONS: Vulcan is the first long-read mapping framework that combines two distinct gap penalty modes for improved structural variant recall and precision. Vulcan is open-source and available under the MIT License at https://gitlab.com/treangenlab/vulcan.
Oxford University Press 2021-09-24 /pmc/articles/PMC8463296/ /pubmed/34561697 http://dx.doi.org/10.1093/gigascience/giab063 Text en © The Author(s) 2021. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Technical Note Fu, Yilei Mahmoud, Medhat Muraliraman, Viginesh Vaibhav Sedlazeck, Fritz J Treangen, Todd J Vulcan: Improved long-read mapping and structural variant calling via dual-mode alignment
title	Vulcan: Improved long-read mapping and structural variant calling via dual-mode alignment
title_sort	vulcan: improved long-read mapping and structural variant calling via dual-mode alignment
topic	Technical Note
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8463296/ https://www.ncbi.nlm.nih.gov/pubmed/34561697 http://dx.doi.org/10.1093/gigascience/giab063
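
The dual-mode idea in the description field (flag reads that minimap2 aligned poorly, then realign only those with NGMLR) can be illustrated with a minimal Python sketch. This is not Vulcan's code. It assumes that normalized edit distance means edit distance divided by aligned length, and it uses a fixed, hypothetical cutoff; the names `MappedRead`, `normalized_edit_distance`, and `split_reads` are invented for illustration and do not appear in the record.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MappedRead:
    """One read as reported by a first-pass mapper (hypothetical container)."""
    name: str
    edit_distance: int   # assumed to come from e.g. the NM tag of a SAM record
    aligned_length: int  # number of aligned bases


def normalized_edit_distance(read: MappedRead) -> float:
    """Edit distance per aligned base (assumed definition).

    Reads with no aligned bases are treated as maximally distant.
    """
    if read.aligned_length <= 0:
        return 1.0
    return read.edit_distance / read.aligned_length


def split_reads(
    reads: List[MappedRead], cutoff: float
) -> Tuple[List[MappedRead], List[MappedRead]]:
    """Partition reads into (keep, realign) by normalized edit distance.

    Reads strictly above the cutoff are the candidates that a second,
    slower mapper would realign.
    """
    keep: List[MappedRead] = []
    realign: List[MappedRead] = []
    for read in reads:
        if normalized_edit_distance(read) > cutoff:
            realign.append(read)
        else:
            keep.append(read)
    return keep, realign


# Toy data: values are made up, not taken from the paper.
reads = [
    MappedRead("read_a", edit_distance=50, aligned_length=10000),
    MappedRead("read_b", edit_distance=2400, aligned_length=8000),
    MappedRead("read_c", edit_distance=300, aligned_length=12000),
]
keep, realign = split_reads(reads, cutoff=0.2)  # cutoff is a hypothetical value
```

In this toy run only `read_b` exceeds the cutoff, so it alone would be passed to the slower mapper while the other two keep their first-pass alignments. How the cutoff is actually chosen is not stated in the record.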
