MentaLiST – A fast MLST caller for large MLST schemes

MLST (multi-locus sequence typing) is a classic technique for genotyping bacteria, widely applied for pathogen outbreak surveillance. Traditionally, MLST is based on identifying sequence types from a small number of housekeeping genes. With the increasing availability of whole-genome sequencing data...

Bibliographic Details
Main authors: Feijao, Pedro; Yao, Hua-Ting; Fornika, Dan; Gardy, Jennifer; Hsiao, William; Chauve, Cedric; Chindelevitch, Leonid
Format: Online Article Text
Language: English
Published: Microbiology Society 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5857373/
https://www.ncbi.nlm.nih.gov/pubmed/29319471
http://dx.doi.org/10.1099/mgen.0.000146
author Feijao, Pedro
Yao, Hua-Ting
Fornika, Dan
Gardy, Jennifer
Hsiao, William
Chauve, Cedric
Chindelevitch, Leonid
author_sort Feijao, Pedro
collection PubMed
description MLST (multi-locus sequence typing) is a classic technique for genotyping bacteria, widely applied for pathogen outbreak surveillance. Traditionally, MLST is based on identifying sequence types from a small number of housekeeping genes. With the increasing availability of whole-genome sequencing data, MLST methods have evolved towards larger typing schemes, based on a few hundred genes [core genome MLST (cgMLST)] to a few thousand genes [whole genome MLST (wgMLST)]. Such large-scale MLST schemes have been shown to provide a finer resolution and are increasingly used in various contexts such as hospital outbreaks or foodborne pathogen outbreaks. This methodological shift raises new computational challenges, especially given the large size of the schemes involved. Very few available MLST callers are currently capable of dealing with large MLST schemes. We introduce MentaLiST, a new MLST caller, based on a k-mer voting algorithm and written in the Julia language, specifically designed and implemented to handle large typing schemes. We test it on real and simulated data to show that MentaLiST is faster than any other available MLST caller while providing the same or better accuracy, and is capable of dealing with MLST schemes with up to thousands of genes while requiring limited computational resources. MentaLiST source code and easy installation instructions using a Conda package are available at https://github.com/WGS-TB/MentaLiST.
format Online
Article
Text
id pubmed-5857373
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Microbiology Society
record_format MEDLINE/PubMed
spelling pubmed-5857373 2018-05-31 MentaLiST – A fast MLST caller for large MLST schemes. Feijao, Pedro; Yao, Hua-Ting; Fornika, Dan; Gardy, Jennifer; Hsiao, William; Chauve, Cedric; Chindelevitch, Leonid. Microb Genom. Methods Paper.
Microbiology Society 2018-01-10 /pmc/articles/PMC5857373/ /pubmed/29319471 http://dx.doi.org/10.1099/mgen.0.000146 Text en http://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.
title MentaLiST – A fast MLST caller for large MLST schemes
topic Methods Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5857373/
https://www.ncbi.nlm.nih.gov/pubmed/29319471
http://dx.doi.org/10.1099/mgen.0.000146