Assessing graph-based read mappers against a baseline approach highlights strengths and weaknesses of current methods
BACKGROUND: Graph-based reference genomes have become popular as they allow read mapping and follow-up analyses in settings where the exact haplotypes underlying a high-throughput sequencing experiment are not precisely known. Two recent papers show that mapping to graph-based reference genomes can...
Main authors: | Grytten, Ivar; Rand, Knut D.; Nederbragt, Alexander J.; Sandve, Geir K. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7132971/ https://www.ncbi.nlm.nih.gov/pubmed/32252628 http://dx.doi.org/10.1186/s12864-020-6685-y |
_version_ | 1783517538070560768 |
---|---|
author | Grytten, Ivar Rand, Knut D. Nederbragt, Alexander J. Sandve, Geir K. |
author_facet | Grytten, Ivar Rand, Knut D. Nederbragt, Alexander J. Sandve, Geir K. |
author_sort | Grytten, Ivar |
collection | PubMed |
description | BACKGROUND: Graph-based reference genomes have become popular as they allow read mapping and follow-up analyses in settings where the exact haplotypes underlying a high-throughput sequencing experiment are not precisely known. Two recent papers show that mapping to graph-based reference genomes can improve accuracy as compared to methods using linear references. Both of these methods index the sequences for most paths up to a certain length in the graph in order to enable direct mapping of reads containing common variants. However, the combinatorial explosion of possible paths through nearby variants also leads to a huge search space and an increased chance of false positive alignments to highly variable regions. RESULTS: We here assess three prominent graph-based read mappers against a hybrid baseline approach that combines an initial path determination with a tuned linear read mapping method. We show, using a previously proposed benchmark, that this simple approach is able to improve overall accuracy of read-mapping to graph-based reference genomes. CONCLUSIONS: Our method is implemented in a tool, Two-step Graph Mapper, which is available at https://github.com/uio-bmi/two_step_graph_mapper along with data and scripts for reproducing the experiments. Our method highlights characteristics of the current generation of graph-based read mappers and shows potential for improvement for future graph-based read mappers. |
format | Online Article Text |
id | pubmed-7132971 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7132971 2020-04-11 Assessing graph-based read mappers against a baseline approach highlights strengths and weaknesses of current methods Grytten, Ivar Rand, Knut D. Nederbragt, Alexander J. Sandve, Geir K. BMC Genomics Methodology Article BACKGROUND: Graph-based reference genomes have become popular as they allow read mapping and follow-up analyses in settings where the exact haplotypes underlying a high-throughput sequencing experiment are not precisely known. Two recent papers show that mapping to graph-based reference genomes can improve accuracy as compared to methods using linear references. Both of these methods index the sequences for most paths up to a certain length in the graph in order to enable direct mapping of reads containing common variants. However, the combinatorial explosion of possible paths through nearby variants also leads to a huge search space and an increased chance of false positive alignments to highly variable regions. RESULTS: We here assess three prominent graph-based read mappers against a hybrid baseline approach that combines an initial path determination with a tuned linear read mapping method. We show, using a previously proposed benchmark, that this simple approach is able to improve overall accuracy of read-mapping to graph-based reference genomes. CONCLUSIONS: Our method is implemented in a tool, Two-step Graph Mapper, which is available at https://github.com/uio-bmi/two_step_graph_mapper along with data and scripts for reproducing the experiments. Our method highlights characteristics of the current generation of graph-based read mappers and shows potential for improvement for future graph-based read mappers. 
BioMed Central 2020-04-06 /pmc/articles/PMC7132971/ /pubmed/32252628 http://dx.doi.org/10.1186/s12864-020-6685-y Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Assessing graph-based read mappers against a baseline approach highlights strengths and weaknesses of current methods |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7132971/ https://www.ncbi.nlm.nih.gov/pubmed/32252628 http://dx.doi.org/10.1186/s12864-020-6685-y |