
BPGA- an ultra-fast pan-genome analysis pipeline


Bibliographic Details
Main Authors: Chaudhari, Narendrakumar M., Gupta, Vinod Kumar, Dutta, Chitra
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4829868/
https://www.ncbi.nlm.nih.gov/pubmed/27071527
http://dx.doi.org/10.1038/srep24373
author Chaudhari, Narendrakumar M.
Gupta, Vinod Kumar
Dutta, Chitra
collection PubMed
description Recent advances in ultra-high-throughput sequencing technology and metagenomics have led to a paradigm shift in microbial genomics from few genome comparisons to large-scale pan-genome studies at different scales of phylogenetic resolution. Pan-genome studies provide a framework for estimating the genomic diversity of the dataset, determining core (conserved), accessory (dispensable) and unique (strain-specific) gene pool of a species, tracing horizontal gene-flux across strains and providing insight into species evolution. The existing pan genome software tools suffer from various limitations like limited datasets, difficult installation/requirements, inadequate functional features etc. Here we present an ultra-fast computational pipeline BPGA (Bacterial Pan Genome Analysis tool) with seven functional modules. In addition to the routine pan genome analyses, BPGA introduces a number of novel features for downstream analyses like core/pan/MLST (Multi Locus Sequence Typing) phylogeny, exclusive presence/absence of genes in specific strains, subset analysis, atypical G + C content analysis and KEGG & COG mapping of core, accessory and unique genes. Other notable features include minimum running prerequisites, freedom to select the gene clustering method, ultra-fast execution, user friendly command line interface and high-quality graphics outputs. The performance of BPGA has been evaluated using a dataset of complete genome sequences of 28 Streptococcus pyogenes strains.
format Online
Article
Text
id pubmed-4829868
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-4829868 2016-04-19 BPGA- an ultra-fast pan-genome analysis pipeline Chaudhari, Narendrakumar M. Gupta, Vinod Kumar Dutta, Chitra Sci Rep Article Nature Publishing Group 2016-04-13 /pmc/articles/PMC4829868/ /pubmed/27071527 http://dx.doi.org/10.1038/srep24373 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title BPGA- an ultra-fast pan-genome analysis pipeline
topic Article