
ANCAC: amino acid, nucleotide, and codon analysis of COGs – a tool for sequence bias analysis in microbial orthologs

BACKGROUND: The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. By definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function...


Bibliographic Details
Main Authors: Meiler, Arno; Klinger, Claudia; Kaufmann, Michael
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3468366/
https://www.ncbi.nlm.nih.gov/pubmed/22958836
http://dx.doi.org/10.1186/1471-2105-13-223
author Meiler, Arno
Klinger, Claudia
Kaufmann, Michael
collection PubMed
description BACKGROUND: The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. By definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid sequence and its encoding nucleotide sequence, resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins.
RESULTS: Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC content in a species- or function-specific context. With respect to amino acids, we confirm the cognate bias hypothesis, at least in part, using ANCAC’s NUCOCOG dataset, the largest one available for that purpose thus far.
CONCLUSIONS: Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining of sequence bias. To our knowledge, it is thereby the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context that does not require substantial programming skills.
format Online
Article
Text
id pubmed-3468366
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3468366 2012-10-11 BMC Bioinformatics Software BioMed Central 2012-09-08 /pmc/articles/PMC3468366/ /pubmed/22958836 http://dx.doi.org/10.1186/1471-2105-13-223 Text en Copyright ©2012 Meiler et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title ANCAC: amino acid, nucleotide, and codon analysis of COGs – a tool for sequence bias analysis in microbial orthologs
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3468366/
https://www.ncbi.nlm.nih.gov/pubmed/22958836
http://dx.doi.org/10.1186/1471-2105-13-223