
Rapid and precise alignment of raw reads against redundant databases with KMA

BACKGROUND: As the cost of sequencing has declined, clinical diagnostics based on next generation sequencing (NGS) have become reality. Diagnostics based on sequencing will require rapid and precise mapping against redundant databases because some of the most important determinants, such as antimicr...


Bibliographic Details
Main authors: Clausen, Philip T. L. C., Aarestrup, Frank M., Lund, Ole
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6116485/
https://www.ncbi.nlm.nih.gov/pubmed/30157759
http://dx.doi.org/10.1186/s12859-018-2336-6
_version_ 1783351617634959360
author Clausen, Philip T. L. C.
Aarestrup, Frank M.
Lund, Ole
collection PubMed
description BACKGROUND: As the cost of sequencing has declined, clinical diagnostics based on next generation sequencing (NGS) have become reality. Diagnostics based on sequencing will require rapid and precise mapping against redundant databases because some of the most important determinants, such as antimicrobial resistance and core genome multilocus sequence typing (MLST) alleles, are highly similar to one another. In order to facilitate this, a novel mapping method, KMA (k-mer alignment), was designed. KMA is able to map raw reads directly against redundant databases, and it scales well for large redundant databases. KMA uses k-mer seeding to speed up mapping and the Needleman-Wunsch algorithm to accurately align extensions from k-mer seeds. Multi-mapping reads are resolved using a novel sorting scheme (ConClave scheme), ensuring an accurate selection of templates. RESULTS: The functionality of KMA was compared with SRST2, MGmapper, BWA-MEM, Bowtie2, Minimap2 and Salmon, using both simulated data and a dataset of Escherichia coli mapped against resistance genes and core genome MLST alleles. KMA outperforms current methods with respect to both accuracy and speed, while using a comparable amount of memory. CONCLUSION: With KMA, it was possible to map raw reads directly against redundant databases with high accuracy, speed and memory efficiency. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2336-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6116485
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6116485 2018-10-02 Rapid and precise alignment of raw reads against redundant databases with KMA Clausen, Philip T. L. C. Aarestrup, Frank M. Lund, Ole BMC Bioinformatics Research Article BioMed Central 2018-08-29 /pmc/articles/PMC6116485/ /pubmed/30157759 http://dx.doi.org/10.1186/s12859-018-2336-6 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Rapid and precise alignment of raw reads against redundant databases with KMA
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6116485/
https://www.ncbi.nlm.nih.gov/pubmed/30157759
http://dx.doi.org/10.1186/s12859-018-2336-6
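The description field states that KMA uses k-mer seeding to speed up mapping and the Needleman-Wunsch algorithm to align extensions from the seeds. The sketch below illustrates only that general two-step idea on toy data. It is not KMA's implementation: the function names, the scoring values (match 1, mismatch -1, gap -1), and the toy templates are assumptions made for illustration, and the ConClave scheme for resolving multi-mapping reads is not modelled.

```python
from collections import defaultdict


def build_kmer_index(templates, k):
    """Map each k-mer to the set of template names that contain it."""
    index = defaultdict(set)
    for name, seq in templates.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def seed_candidates(read, index, k):
    """Count, per template, how many k-mers of the read occur in it."""
    hits = defaultdict(int)
    for i in range(len(read) - k + 1):
        for name in index.get(read[i:i + k], ()):
            hits[name] += 1
    return dict(hits)


def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score via dynamic programming (linear gap cost).

    The scoring values are illustrative assumptions, not KMA's defaults.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Toy, hypothetical templates: two highly similar "alleles".
templates = {"geneA": "ACGTACGTAC", "geneB": "ACGTTCGTAC"}
index = build_kmer_index(templates, k=4)
hits = seed_candidates("ACGTACG", index, k=4)
best = max(hits, key=hits.get)
```

In this toy case the seed counts alone favour `geneA`. With near-identical templates the counts can tie, which is where an alignment score such as the one from `needleman_wunsch` would be used to separate candidates.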