
snpTree - a web-server to identify and construct SNP trees from whole genome sequence data

BACKGROUND: Advances in whole genome sequencing (WGS) and its decreasing cost will soon make this technology available for routine infectious disease epidemiology. In epidemiological studies, outbreak isolates have very little diversity and require extensive genomic analysis to different...


Bibliographic Details
Main Authors: Leekitcharoenphon, Pimlapas; Kaas, Rolf S; Thomsen, Martin Christen Frølund; Friis, Carsten; Rasmussen, Simon; Aarestrup, Frank M
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3521233/
https://www.ncbi.nlm.nih.gov/pubmed/23281601
http://dx.doi.org/10.1186/1471-2164-13-S7-S6
_version_ 1782252911005270016
author Leekitcharoenphon, Pimlapas
Kaas, Rolf S
Thomsen, Martin Christen Frølund
Friis, Carsten
Rasmussen, Simon
Aarestrup, Frank M
author_facet Leekitcharoenphon, Pimlapas
Kaas, Rolf S
Thomsen, Martin Christen Frølund
Friis, Carsten
Rasmussen, Simon
Aarestrup, Frank M
author_sort Leekitcharoenphon, Pimlapas
collection PubMed
description BACKGROUND: Advances in whole genome sequencing (WGS) and its decreasing cost will soon make this technology available for routine infectious disease epidemiology. In epidemiological studies, outbreak isolates have very little diversity and require extensive genomic analysis to differentiate and classify isolates. One successful and broadly used method is the analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs, each with various options and cut-off values. Furthermore, all current methods require bioinformatics skills. Thus, we lack a standard, simple and automatic tool to determine SNPs and construct phylogenetic trees from WGS data. RESULTS: Here we introduce snpTree, a server for automatic online SNP analysis. The tool combines several SNP analysis suites with Perl and Python scripts. snpTree can identify SNPs and construct phylogenetic trees from WGS reads as well as from assembled genomes or contigs. WGS data in FASTQ format are aligned to reference genomes by BWA, while contigs in FASTA format are processed by Nucmer. SNPs are concatenated based on their position on the reference genome, and a tree is constructed from the concatenated SNPs using FastTree and a Perl script. The online server was implemented in HTML, Java and Python. The server was evaluated using four published bacterial WGS data sets (V. cholerae, S. aureus CC398, S. Typhimurium and M. tuberculosis). The evaluation results for the first three cases were consistent and concordant for both raw reads and assembled genomes. For the fourth data set (M. tuberculosis), the original publication involved extensive filtering of SNPs, which could not be repeated using snpTree. CONCLUSIONS: The snpTree server is an easy-to-use option for rapid, standardised and automatic SNP analysis in epidemiological studies, including for users with limited bioinformatics experience. The web server is freely accessible at http://www.cbs.dtu.dk/services/snpTree-1.0/.
format Online
Article
Text
id pubmed-3521233
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3521233 2012-12-14 snpTree - a web-server to identify and construct SNP trees from whole genome sequence data Leekitcharoenphon, Pimlapas Kaas, Rolf S Thomsen, Martin Christen Frølund Friis, Carsten Rasmussen, Simon Aarestrup, Frank M BMC Genomics Proceedings BioMed Central 2012-12-07 /pmc/articles/PMC3521233/ /pubmed/23281601 http://dx.doi.org/10.1186/1471-2164-13-S7-S6 Text en Copyright ©2012 Leekitcharoenphon et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Leekitcharoenphon, Pimlapas
Kaas, Rolf S
Thomsen, Martin Christen Frølund
Friis, Carsten
Rasmussen, Simon
Aarestrup, Frank M
snpTree - a web-server to identify and construct SNP trees from whole genome sequence data
title snpTree - a web-server to identify and construct SNP trees from whole genome sequence data
title_full snpTree - a web-server to identify and construct SNP trees from whole genome sequence data
title_fullStr snpTree - a web-server to identify and construct SNP trees from whole genome sequence data
title_full_unstemmed snpTree - a web-server to identify and construct SNP trees from whole genome sequence data
title_short snpTree - a web-server to identify and construct SNP trees from whole genome sequence data
title_sort snptree - a web-server to identify and construct snp trees from whole genome sequence data
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3521233/
https://www.ncbi.nlm.nih.gov/pubmed/23281601
http://dx.doi.org/10.1186/1471-2164-13-S7-S6
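
The abstract in this record says that SNPs are concatenated by their position on the reference genome before a tree is built. The following is a minimal sketch of that concatenation step only, not the snpTree implementation. The function names, the toy reference and the SNP calls are hypothetical, and it assumes that an isolate with no call at a position carries the reference base, which the record does not state.

```python
from typing import Dict


def concatenate_snps(reference: str, calls: Dict[str, Dict[int, str]]) -> Dict[str, str]:
    """Build one pseudo-sequence per isolate from the variable positions only.

    `calls` maps an isolate name to {1-based reference position: observed base}.
    Assumption: a missing call means the isolate matches the reference there.
    """
    # Every position where at least one isolate has a SNP, in reference order.
    positions = sorted({pos for snps in calls.values() for pos in snps})

    # The reference row holds the reference base at each variable position.
    alignment = {"reference": "".join(reference[p - 1] for p in positions)}
    for isolate, snps in calls.items():
        alignment[isolate] = "".join(snps.get(p, reference[p - 1]) for p in positions)
    return alignment


def to_fasta(alignment: Dict[str, str]) -> str:
    """Serialise the concatenated SNP alignment as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in alignment.items())


# Toy data (hypothetical): a 10 bp reference and two isolates.
reference = "ACGTACGTAC"
calls = {
    "isolate_A": {2: "T", 7: "A"},
    "isolate_B": {7: "A", 9: "G"},
}

alignment = concatenate_snps(reference, calls)
fasta_text = to_fasta(alignment)
print(fasta_text)
```

The resulting FASTA text is the kind of input a tree builder such as FastTree, named in the abstract, takes. The record does not give the exact invocation or the SNP filtering settings, so they are left out here.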