
Fast read alignment with incorporation of known genomic variants

BACKGROUND: Many genetic variants have been reported from sequencing projects due to decreasing experimental costs. Compared to the current typical paradigm, read mapping incorporating existing variants can improve the performance of subsequent analysis. This method is supposed to map sequencing rea...


Bibliographic Details
Main Authors:	Guo, Hongzhe, Liu, Bo, Guan, Dengfeng, Fu, Yilei, Wang, Yadong
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2019
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6921400/ https://www.ncbi.nlm.nih.gov/pubmed/31856811 http://dx.doi.org/10.1186/s12911-019-0960-3

_version_	1783481153240432640
author	Guo, Hongzhe Liu, Bo Guan, Dengfeng Fu, Yilei Wang, Yadong
collection	PubMed
description	BACKGROUND: Many genetic variants have been reported from sequencing projects due to decreasing experimental costs. Compared to the current typical paradigm, read mapping incorporating existing variants can improve the performance of subsequent analysis. This method is supposed to map sequencing reads efficiently to a graphical index with a reference genome and known variation to increase alignment quality and variant calling accuracy. However, storing and indexing various types of variation require costly RAM space. METHODS: Aligning reads to a graph model-based index including the whole set of variants is ultimately an NP-hard problem in theory. Here, we propose a variation-aware read alignment algorithm (VARA), which generates the alignment between read and multiple genomic sequences simultaneously utilizing the schema of the Landau-Vishkin algorithm. VARA dynamically extracts regional variants to construct a pseudo tree-based structure on-the-fly for seed extension without loading the whole genome variation into memory space. RESULTS: We developed the novel high-throughput sequencing read aligner deBGA-VARA by integrating VARA into deBGA. The deBGA-VARA is benchmarked both on simulated reads and the NA12878 sequencing dataset. The experimental results demonstrate that read alignment incorporating genetic variation knowledge can achieve high sensitivity and accuracy. CONCLUSIONS: Due to its efficiency, VARA provides a promising solution for further improvement of variant calling while maintaining small memory footprints. The deBGA-VARA is available at: https://github.com/hitbc/deBGA-VARA.
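The description above names the Landau-Vishkin algorithm as the scheme VARA builds on. As a rough illustration only, the following Python sketch shows the core idea of that family of methods: for each error count e it tracks the furthest row reached on every diagonal, so only O(k) diagonals are visited per error level. This is not the deBGA-VARA implementation. The function names `bounded_edit_distance` and `best_over_candidates` are hypothetical, the extension step uses plain character comparison rather than the suffix-tree lookups of the original algorithm, and the candidate sequences (reference plus variant-carrying alternatives) are scored one at a time rather than simultaneously as the abstract describes.

```python
def bounded_edit_distance(read, ref, k):
    """Return the edit distance between read and ref if it is <= k, else None.

    Diagonal d holds cells (i, i + d); i indexes read, i + d indexes ref.
    """
    m, n = len(read), len(ref)
    target = n - m  # diagonal that contains the end cell (m, n)
    if abs(target) > k:
        return None

    def extend(i, d):
        # Slide down diagonal d for free while characters match.
        while i < m and i + d < n and read[i] == ref[i + d]:
            i += 1
        return i

    prev = {0: extend(0, 0)}  # furthest row per diagonal with 0 errors
    if target == 0 and prev[0] == m:
        return 0

    for e in range(1, k + 1):
        cur = {}
        for d in range(-e, e + 1):
            lo, hi = max(0, -d), min(m, n - d)  # valid rows on diagonal d
            if hi < lo:
                continue
            cands = []
            if d in prev:
                cands.append(prev[d] + 1)      # substitution
            if d + 1 in prev:
                cands.append(prev[d + 1] + 1)  # extra base in the read
            if d - 1 in prev:
                cands.append(prev[d - 1])      # extra base in the reference
            if not cands:
                continue
            i = extend(min(max(cands), hi), d)
            cur[d] = i
            if d == target and i == m:
                return e
        prev = cur
    return None


def best_over_candidates(read, candidates, k):
    """Score the read against each candidate sequence independently.

    Returns (distance, index) of the best candidate within k errors, or None.
    """
    scored = []
    for idx, seq in enumerate(candidates):
        dist = bounded_edit_distance(read, seq, k)
        if dist is not None:
            scored.append((dist, idx))
    return min(scored) if scored else None
```

In this sketch, a read that carries a known alternative allele scores better against the variant-carrying candidate than against the plain reference, which is the intuition behind incorporating known variants during seed extension.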
format	Online Article Text
id	pubmed-6921400
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6921400 2019-12-30 Fast read alignment with incorporation of known genomic variants Guo, Hongzhe Liu, Bo Guan, Dengfeng Fu, Yilei Wang, Yadong BMC Med Inform Decis Mak Research BioMed Central 2019-12-19 /pmc/articles/PMC6921400/ /pubmed/31856811 http://dx.doi.org/10.1186/s12911-019-0960-3 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Fast read alignment with incorporation of known genomic variants
topic	Research
