
cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data

Summary: Huge amounts of metagenomic sequence data have been produced as a result of the rapidly increasing efforts worldwide in studying microbial communities as a whole. Most, if not all, sequenced metagenomes are complex mixtures of chromosomal and plasmid sequence fragments from multiple organism...


Bibliographic Details
Main Authors: Zhou, Fengfeng; Xu, Ying
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2916713/
https://www.ncbi.nlm.nih.gov/pubmed/20538725
http://dx.doi.org/10.1093/bioinformatics/btq299
author Zhou, Fengfeng
Xu, Ying
collection PubMed
description Summary: Huge amounts of metagenomic sequence data have been produced as a result of the rapidly increasing efforts worldwide in studying microbial communities as a whole. Most, if not all, sequenced metagenomes are complex mixtures of chromosomal and plasmid sequence fragments from multiple organisms, possibly from different kingdoms. Computational methods for prediction of genomic elements such as genes are significantly different for chromosomes and plasmids, hence raising the need for separation of chromosomal from plasmid sequences in a metagenome. We present a program for classification of a metagenome set into chromosomal and plasmid sequences, based on their distinguishing pentamer frequencies. On a large training set consisting of all the sequenced prokaryotic chromosomes and plasmids, the program achieves ∼92% classification accuracy. On a large set of simulated metagenomes with sequence lengths ranging from 300 bp to 100 kbp, the program has classification accuracy from 64.45% to 88.75%. On a large independent test set, the program achieves 88.29% classification accuracy. Availability: The program has been implemented as a standalone prediction program, cBar, which is available at http://csbl.bmb.uga.edu/~ffzhou/cBar Contact: xyn@bmb.uga.edu Supplementary information: Supplementary data are available at Bioinformatics online.
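The description above says the classification is based on pentamer frequencies. As a rough illustration of that kind of feature, the sketch below turns one DNA fragment into a 1024-dimensional vector of relative pentamer (5-mer) frequencies. It is not cBar's code: the function name `pentamer_frequencies`, the handling of ambiguous bases, and the single-strand counting are all assumptions made for this example, and the classifier that would consume the vector is not shown.

```python
from collections import Counter
from itertools import product

K = 5                      # pentamers, as named in the abstract
ALPHABET = "ACGT"
# All 4**5 = 1024 possible pentamers in a fixed (lexicographic) order.
KMERS = ["".join(p) for p in product(ALPHABET, repeat=K)]


def pentamer_frequencies(seq):
    """Return relative pentamer frequencies of `seq` as a list of 1024 floats.

    Hypothetical helper for illustration only. Windows containing any
    character other than A, C, G, T (for example N) are skipped, and only
    the given strand is counted.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - K + 1):
        kmer = seq[i:i + K]
        if all(base in ALPHABET for base in kmer):
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        # Fragment shorter than 5 bp or with no unambiguous window.
        return [0.0] * len(KMERS)
    return [counts[kmer] / total for kmer in KMERS]


if __name__ == "__main__":
    vector = pentamer_frequencies("ACGTACGTAC")
    print(len(vector), sum(vector))
```

A vector like this could be fed to any standard classifier trained on labeled chromosomal and plasmid sequences; which classifier the program itself uses is not stated in this record.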
format Text
id pubmed-2916713
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2916713 2010-08-06 Bioinformatics Applications Note Oxford University Press 2010-08-15 2010-08-02 /pmc/articles/PMC2916713/ /pubmed/20538725 Text en © The Author(s) 2010. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data
topic Applications Note