
Assessing graph-based read mappers against a baseline approach highlights strengths and weaknesses of current methods

BACKGROUND: Graph-based reference genomes have become popular as they allow read mapping and follow-up analyses in settings where the exact haplotypes underlying a high-throughput sequencing experiment are not precisely known. Two recent papers show that mapping to graph-based reference genomes can...


Bibliographic Details
Main authors: Grytten, Ivar; Rand, Knut D.; Nederbragt, Alexander J.; Sandve, Geir K.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7132971/
https://www.ncbi.nlm.nih.gov/pubmed/32252628
http://dx.doi.org/10.1186/s12864-020-6685-y
_version_ 1783517538070560768
author Grytten, Ivar
Rand, Knut D.
Nederbragt, Alexander J.
Sandve, Geir K.
author_sort Grytten, Ivar
collection PubMed
description BACKGROUND: Graph-based reference genomes have become popular as they allow read mapping and follow-up analyses in settings where the exact haplotypes underlying a high-throughput sequencing experiment are not precisely known. Two recent papers show that mapping to graph-based reference genomes can improve accuracy as compared to methods using linear references. Both of these methods index the sequences for most paths up to a certain length in the graph in order to enable direct mapping of reads containing common variants. However, the combinatorial explosion of possible paths through nearby variants also leads to a huge search space and an increased chance of false positive alignments to highly variable regions. RESULTS: We here assess three prominent graph-based read mappers against a hybrid baseline approach that combines an initial path determination with a tuned linear read mapping method. We show, using a previously proposed benchmark, that this simple approach is able to improve overall accuracy of read-mapping to graph-based reference genomes. CONCLUSIONS: Our method is implemented in a tool Two-step Graph Mapper, which is available at https://github.com/uio-bmi/two_step_graph_mapper along with data and scripts for reproducing the experiments. Our method highlights characteristics of the current generation of graph-based read mappers and shows potential for improvement for future graph-based read mappers.
format Online
Article
Text
id pubmed-7132971
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7132971 2020-04-11 Assessing graph-based read mappers against a baseline approach highlights strengths and weaknesses of current methods Grytten, Ivar Rand, Knut D. Nederbragt, Alexander J. Sandve, Geir K. BMC Genomics Methodology Article BACKGROUND: Graph-based reference genomes have become popular as they allow read mapping and follow-up analyses in settings where the exact haplotypes underlying a high-throughput sequencing experiment are not precisely known. Two recent papers show that mapping to graph-based reference genomes can improve accuracy as compared to methods using linear references. Both of these methods index the sequences for most paths up to a certain length in the graph in order to enable direct mapping of reads containing common variants. However, the combinatorial explosion of possible paths through nearby variants also leads to a huge search space and an increased chance of false positive alignments to highly variable regions. RESULTS: We here assess three prominent graph-based read mappers against a hybrid baseline approach that combines an initial path determination with a tuned linear read mapping method. We show, using a previously proposed benchmark, that this simple approach is able to improve overall accuracy of read-mapping to graph-based reference genomes. CONCLUSIONS: Our method is implemented in a tool Two-step Graph Mapper, which is available at https://github.com/uio-bmi/two_step_graph_mapper along with data and scripts for reproducing the experiments. Our method highlights characteristics of the current generation of graph-based read mappers and shows potential for improvement for future graph-based read mappers.
BioMed Central 2020-04-06 /pmc/articles/PMC7132971/ /pubmed/32252628 http://dx.doi.org/10.1186/s12864-020-6685-y Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology Article
Grytten, Ivar
Rand, Knut D.
Nederbragt, Alexander J.
Sandve, Geir K.
Assessing graph-based read mappers against a baseline approach highlights strengths and weaknesses of current methods
title Assessing graph-based read mappers against a baseline approach highlights strengths and weaknesses of current methods
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7132971/
https://www.ncbi.nlm.nih.gov/pubmed/32252628
http://dx.doi.org/10.1186/s12864-020-6685-y