
Fast alignment of reads to a variation graph with application to SNP detection

Sequencing technologies have provided the basis of most modern genome sequencing studies due to their high base-level accuracy and relatively low cost. One of the most demanding steps is mapping reads to the human reference genome. The reliance on a single reference human genome could introduce substant...


Bibliographic Details
Main authors: Monsu, Maurilio; Comin, Matteo
Format: Online Article Text
Language: English
Published: De Gruyter 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8709736/
https://www.ncbi.nlm.nih.gov/pubmed/34783230
http://dx.doi.org/10.1515/jib-2021-0032
_version_ 1784623007605456896
author Monsu, Maurilio
Comin, Matteo
collection PubMed
description Sequencing technologies have provided the basis of most modern genome sequencing studies due to their high base-level accuracy and relatively low cost. One of the most demanding steps is mapping reads to the human reference genome. The reliance on a single reference human genome could introduce substantial biases in downstream analyses. Pangenomic graph reference representations offer an attractive approach for storing genetic variations. Moreover, it is possible to include known variants in the reference in order to make read mapping, variant calling, and genotyping variant-aware. Only recently has a framework for variation graphs, vg [Garrison E, Adam MN, Siren J, et al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat Biotechnol 2018;36:875–9], improved variation-aware alignment and variant calling in general. The major bottleneck of vg is the high cost of mapping reads to a variation graph. In this paper we study the problem of SNP calling on a variation graph and present a fast read alignment tool named VG SNP-Aware. VG SNP-Aware is able to align reads exactly to a variation graph and detect SNPs based on these aligned reads. The results show that VG SNP-Aware can efficiently map reads to a variation graph with a speedup of 40× with respect to vg and similar accuracy on SNP detection.
format Online
Article
Text
id pubmed-8709736
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher De Gruyter
record_format MEDLINE/PubMed
spelling pubmed-8709736 2022-01-20 Fast alignment of reads to a variation graph with application to SNP detection Monsu, Maurilio Comin, Matteo J Integr Bioinform Article De Gruyter 2021-11-16 /pmc/articles/PMC8709736/ /pubmed/34783230 http://dx.doi.org/10.1515/jib-2021-0032 Text en © 2021 Maurilio Monsu and Matteo Comin published by De Gruyter, Berlin/Boston https://creativecommons.org/licenses/by/4.0/ This work is licensed under the Creative Commons Attribution 4.0 International License.
title Fast alignment of reads to a variation graph with application to SNP detection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8709736/
https://www.ncbi.nlm.nih.gov/pubmed/34783230
http://dx.doi.org/10.1515/jib-2021-0032