cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data
Summary: Huge amount of metagenomic sequence data have been produced as a result of the rapidly increasing efforts worldwide in studying microbial communities as a whole. Most, if not all, sequenced metagenomes are complex mixtures of chromosomal and plasmid sequence fragments from multiple organism...
Main authors: | Zhou, Fengfeng; Xu, Ying |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2010 |
Subjects: | Applications Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2916713/ https://www.ncbi.nlm.nih.gov/pubmed/20538725 http://dx.doi.org/10.1093/bioinformatics/btq299 |
_version_ | 1782185013099364352 |
---|---|
author | Zhou, Fengfeng Xu, Ying |
author_facet | Zhou, Fengfeng Xu, Ying |
author_sort | Zhou, Fengfeng |
collection | PubMed |
description | Summary: Huge amount of metagenomic sequence data have been produced as a result of the rapidly increasing efforts worldwide in studying microbial communities as a whole. Most, if not all, sequenced metagenomes are complex mixtures of chromosomal and plasmid sequence fragments from multiple organisms, possibly from different kingdoms. Computational methods for prediction of genomic elements such as genes are significantly different for chromosomes and plasmids, hence raising the need for separation of chromosomal from plasmid sequences in a metagenome. We present a program for classification of a metagenome set into chromosomal and plasmid sequences, based on their distinguishing pentamer frequencies. On a large training set consisting of all the sequenced prokaryotic chromosomes and plasmids, the program achieves ∼92% in classification accuracy. On a large set of simulated metagenomes with sequence lengths ranging from 300 bp to 100 kbp, the program has classification accuracy from 64.45% to 88.75%. On a large independent test set, the program achieves 88.29% classification accuracy. Availability: The program has been implemented as a standalone prediction program, cBar, which is available at http://csbl.bmb.uga.edu/~ffzhou/cBar Contact: xyn@bmb.uga.edu Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Text |
id | pubmed-2916713 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2916713 2010-08-06 cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data Zhou, Fengfeng Xu, Ying Bioinformatics Applications Note Summary: Huge amount of metagenomic sequence data have been produced as a result of the rapidly increasing efforts worldwide in studying microbial communities as a whole. Most, if not all, sequenced metagenomes are complex mixtures of chromosomal and plasmid sequence fragments from multiple organisms, possibly from different kingdoms. Computational methods for prediction of genomic elements such as genes are significantly different for chromosomes and plasmids, hence raising the need for separation of chromosomal from plasmid sequences in a metagenome. We present a program for classification of a metagenome set into chromosomal and plasmid sequences, based on their distinguishing pentamer frequencies. On a large training set consisting of all the sequenced prokaryotic chromosomes and plasmids, the program achieves ∼92% in classification accuracy. On a large set of simulated metagenomes with sequence lengths ranging from 300 bp to 100 kbp, the program has classification accuracy from 64.45% to 88.75%. On a large independent test set, the program achieves 88.29% classification accuracy. Availability: The program has been implemented as a standalone prediction program, cBar, which is available at http://csbl.bmb.uga.edu/~ffzhou/cBar Contact: xyn@bmb.uga.edu Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2010-08-15 2010-08-02 /pmc/articles/PMC2916713/ /pubmed/20538725 http://dx.doi.org/10.1093/bioinformatics/btq299 Text en © The Author(s) 2010. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Note Zhou, Fengfeng Xu, Ying cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data |
title | cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data |
title_full | cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data |
title_fullStr | cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data |
title_full_unstemmed | cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data |
title_short | cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data |
title_sort | cbar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data |
topic | Applications Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2916713/ https://www.ncbi.nlm.nih.gov/pubmed/20538725 http://dx.doi.org/10.1093/bioinformatics/btq299 |
work_keys_str_mv | AT zhoufengfeng cbaracomputerprogramtodistinguishplasmidderivedfromchromosomederivedsequencefragmentsinmetagenomicsdata AT xuying cbaracomputerprogramtodistinguishplasmidderivedfromchromosomederivedsequencefragmentsinmetagenomicsdata |
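
The description above says that cBar separates chromosomal from plasmid fragments "based on their distinguishing pentamer frequencies". As an illustration only, the following minimal Python sketch shows how a pentamer (5-mer) frequency vector could be computed for one sequence fragment. The function name `pentamer_frequencies`, the overlapping-window counting, and the choice to skip windows containing ambiguous bases are assumptions made for this example; the record does not describe how cBar itself computes or normalizes these frequencies, nor which classifier it uses.

```python
from collections import Counter
from itertools import product

K = 5  # pentamers, as named in the description
ALPHABET = "ACGT"
# All 4**5 = 1024 possible pentamers, in a fixed lexicographic order.
PENTAMERS = ["".join(p) for p in product(ALPHABET, repeat=K)]


def pentamer_frequencies(sequence: str) -> list[float]:
    """Return the relative frequency of each of the 1024 pentamers.

    Hypothetical helper for illustration; not taken from cBar.
    """
    seq = sequence.upper()
    # Count every overlapping window of length K.
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    # Assumption: windows with ambiguous bases (e.g. N) are ignored.
    total = sum(counts[p] for p in PENTAMERS)
    if total == 0:
        return [0.0] * len(PENTAMERS)
    return [counts[p] / total for p in PENTAMERS]
```

A vector of this kind, computed per fragment, could then be passed to any supervised classifier trained on labeled chromosome and plasmid sequences.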