
MLSTar: automatic multilocus sequence typing of bacterial genomes in R

Multilocus sequence typing (MLST) is a standard tool in population genetics and bacterial epidemiology that assesses the genetic variation present in a reduced number of housekeeping genes (typically seven) along the genome. This methodology assigns arbitrary integer identifiers to genetic variation...


Bibliographic Details
Main Authors: Ferrés, Ignacio; Iraola, Gregorio
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6005169/
https://www.ncbi.nlm.nih.gov/pubmed/29922519
http://dx.doi.org/10.7717/peerj.5098
author Ferrés, Ignacio
Iraola, Gregorio
collection PubMed
description Multilocus sequence typing (MLST) is a standard tool in population genetics and bacterial epidemiology that assesses the genetic variation present in a reduced number of housekeeping genes (typically seven) along the genome. This methodology assigns arbitrary integer identifiers to genetic variations at these loci which allows us to efficiently compare bacterial isolates using allele-based methods. Now, the increasing availability of whole-genome sequences for hundreds to thousands of strains from the same bacterial species has allowed us to apply and extend MLST schemes by automatic extraction of allele information from the genomes. The PubMLST database is the most comprehensive resource of described schemes available for a wide variety of species. Here we present MLSTar as the first R package that allows us to (i) connect with the PubMLST database to select a target scheme, (ii) screen a desired set of genomes to assign alleles and sequence types, and (iii) interact with other widely used R packages to analyze and produce graphical representations of the data. We applied MLSTar to analyze more than 2,500 bacterial genomes from different species, showing great accuracy, and comparable performance with previously published command-line tools. MLSTar can be freely downloaded from http://github.com/iferres/MLSTar.
format Online
Article
Text
id pubmed-6005169
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-6005169 2018-06-19 MLSTar: automatic multilocus sequence typing of bacterial genomes in R; Ferrés, Ignacio; Iraola, Gregorio; PeerJ; Bioinformatics; PeerJ Inc. 2018-06-15 /pmc/articles/PMC6005169/ /pubmed/29922519 http://dx.doi.org/10.7717/peerj.5098 Text en © 2018 Ferrés and Iraola http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title MLSTar: automatic multilocus sequence typing of bacterial genomes in R
topic Bioinformatics
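
The allele-numbering idea summarized in the description (arbitrary integer identifiers per locus, combined into a sequence type) can be illustrated with a minimal Python sketch. This is not MLSTar's code or API, and it does not query PubMLST: the locus names, sequences, profile table, and the functions `assign_allele` and `type_isolate` are all hypothetical, and the toy scheme uses three loci rather than the typical seven.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy scheme with three loci (real schemes typically use seven).
LOCI = ("locusA", "locusB", "locusC")

# Made-up allele tables: sequence -> arbitrary integer identifier, per locus.
ALLELES: Dict[str, Dict[str, int]] = {
    "locusA": {"ATGC": 1, "ATGA": 2},
    "locusB": {"GGCC": 1, "GGCT": 2},
    "locusC": {"TTAA": 1},
}

# Made-up profile table: tuple of allele numbers -> sequence type (ST).
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (2, 1, 1): 2,
}


def assign_allele(locus: str, sequence: str) -> int:
    """Return the allele number for an exact match, registering unseen sequences."""
    table = ALLELES[locus]
    if sequence not in table:
        # A new variant simply receives the next unused integer.
        table[sequence] = max(table.values(), default=0) + 1
    return table[sequence]


def type_isolate(genes: Dict[str, str]) -> Tuple[Tuple[int, ...], Optional[int]]:
    """Build the allele profile of one isolate and look up its ST (None if unknown)."""
    profile = tuple(assign_allele(locus, genes[locus]) for locus in LOCI)
    return profile, PROFILES.get(profile)


if __name__ == "__main__":
    known = {"locusA": "ATGA", "locusB": "GGCC", "locusC": "TTAA"}
    profile, st = type_isolate(known)
    print(profile, st)  # prints: (2, 1, 1) 2
```

Because isolates are reduced to short tuples of integers, comparing two of them amounts to comparing their profiles position by position, which is what makes allele-based methods cheap relative to full sequence alignment. A real tool would also have to locate each locus in the genome (for example by sequence search), a step this sketch leaves out.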