MentaLiST – A fast MLST caller for large MLST schemes
MLST (multi-locus sequence typing) is a classic technique for genotyping bacteria, widely applied for pathogen outbreak surveillance. Traditionally, MLST is based on identifying sequence types from a small number of housekeeping genes. With the increasing availability of whole-genome sequencing data...
Main authors: | Feijao, Pedro; Yao, Hua-Ting; Fornika, Dan; Gardy, Jennifer; Hsiao, William; Chauve, Cedric; Chindelevitch, Leonid |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Microbiology Society 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5857373/ https://www.ncbi.nlm.nih.gov/pubmed/29319471 http://dx.doi.org/10.1099/mgen.0.000146 |
_version_ | 1783307459417341952 |
---|---|
author | Feijao, Pedro Yao, Hua-Ting Fornika, Dan Gardy, Jennifer Hsiao, William Chauve, Cedric Chindelevitch, Leonid |
author_sort | Feijao, Pedro |
collection | PubMed |
description | MLST (multi-locus sequence typing) is a classic technique for genotyping bacteria, widely applied for pathogen outbreak surveillance. Traditionally, MLST is based on identifying sequence types from a small number of housekeeping genes. With the increasing availability of whole-genome sequencing data, MLST methods have evolved towards larger typing schemes, based on a few hundred genes [core genome MLST (cgMLST)] to a few thousand genes [whole genome MLST (wgMLST)]. Such large-scale MLST schemes have been shown to provide a finer resolution and are increasingly used in various contexts such as hospital outbreaks or foodborne pathogen outbreaks. This methodological shift raises new computational challenges, especially given the large size of the schemes involved. Very few available MLST callers are currently capable of dealing with large MLST schemes. We introduce MentaLiST, a new MLST caller, based on a k-mer voting algorithm and written in the Julia language, specifically designed and implemented to handle large typing schemes. We test it on real and simulated data to show that MentaLiST is faster than any other available MLST caller while providing the same or better accuracy, and is capable of dealing with MLST schemes with up to thousands of genes while requiring limited computational resources. MentaLiST source code and easy installation instructions using a Conda package are available at https://github.com/WGS-TB/MentaLiST. |
format | Online Article Text |
id | pubmed-5857373 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Microbiology Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-5857373 2018-05-31 MentaLiST – A fast MLST caller for large MLST schemes Feijao, Pedro Yao, Hua-Ting Fornika, Dan Gardy, Jennifer Hsiao, William Chauve, Cedric Chindelevitch, Leonid Microb Genom Methods Paper Microbiology Society 2018-01-10 /pmc/articles/PMC5857373/ /pubmed/29319471 http://dx.doi.org/10.1099/mgen.0.000146 Text en http://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited. |
title | MentaLiST – A fast MLST caller for large MLST schemes |
topic | Methods Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5857373/ https://www.ncbi.nlm.nih.gov/pubmed/29319471 http://dx.doi.org/10.1099/mgen.0.000146 |