BPGA- an ultra-fast pan-genome analysis pipeline
Recent advances in ultra-high-throughput sequencing technology and metagenomics have led to a paradigm shift in microbial genomics from few genome comparisons to large-scale pan-genome studies at different scales of phylogenetic resolution. Pan-genome studies provide a framework for estimating the g...
Main authors: | Chaudhari, Narendrakumar M., Gupta, Vinod Kumar, Dutta, Chitra |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4829868/ https://www.ncbi.nlm.nih.gov/pubmed/27071527 http://dx.doi.org/10.1038/srep24373 |
_version_ | 1782426812949725184 |
---|---|
author | Chaudhari, Narendrakumar M. Gupta, Vinod Kumar Dutta, Chitra |
author_sort | Chaudhari, Narendrakumar M. |
collection | PubMed |
description | Recent advances in ultra-high-throughput sequencing technology and metagenomics have led to a paradigm shift in microbial genomics from few genome comparisons to large-scale pan-genome studies at different scales of phylogenetic resolution. Pan-genome studies provide a framework for estimating the genomic diversity of the dataset, determining core (conserved), accessory (dispensable) and unique (strain-specific) gene pool of a species, tracing horizontal gene-flux across strains and providing insight into species evolution. The existing pan genome software tools suffer from various limitations like limited datasets, difficult installation/requirements, inadequate functional features etc. Here we present an ultra-fast computational pipeline BPGA (Bacterial Pan Genome Analysis tool) with seven functional modules. In addition to the routine pan genome analyses, BPGA introduces a number of novel features for downstream analyses like core/pan/MLST (Multi Locus Sequence Typing) phylogeny, exclusive presence/absence of genes in specific strains, subset analysis, atypical G + C content analysis and KEGG & COG mapping of core, accessory and unique genes. Other notable features include minimum running prerequisites, freedom to select the gene clustering method, ultra-fast execution, user friendly command line interface and high-quality graphics outputs. The performance of BPGA has been evaluated using a dataset of complete genome sequences of 28 Streptococcus pyogenes strains. |
format | Online Article Text |
id | pubmed-4829868 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-4829868 2016-04-19 BPGA- an ultra-fast pan-genome analysis pipeline Chaudhari, Narendrakumar M. Gupta, Vinod Kumar Dutta, Chitra Sci Rep Article Nature Publishing Group 2016-04-13 /pmc/articles/PMC4829868/ /pubmed/27071527 http://dx.doi.org/10.1038/srep24373 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
title | BPGA- an ultra-fast pan-genome analysis pipeline |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4829868/ https://www.ncbi.nlm.nih.gov/pubmed/27071527 http://dx.doi.org/10.1038/srep24373 |