
Extension of the COG and arCOG databases by amino acid and nucleotide sequences

BACKGROUND: The current versions of the COG and arCOG databases, both excellent frameworks for studies in comparative and functional genomics, do not contain the nucleotide sequences corresponding to their protein or protein domain entries. RESULTS: Using sequence information obtained from GenBank f...


Bibliographic Details
Main authors: Meereis, Florian, Kaufmann, Michael
Format: Text
Language: English
Published: BioMed Central 2008
Subjects: Database
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2588464/
https://www.ncbi.nlm.nih.gov/pubmed/19014535
http://dx.doi.org/10.1186/1471-2105-9-479
author Meereis, Florian
Kaufmann, Michael
collection PubMed
description BACKGROUND: The current versions of the COG and arCOG databases, both excellent frameworks for studies in comparative and functional genomics, do not contain the nucleotide sequences corresponding to their protein or protein domain entries. RESULTS: Using sequence information obtained from GenBank flat files covering the completely sequenced genomes of the COG and arCOG databases, we constructed NUCOCOG (nucleotide sequences containing COG databases), an extended version that includes all nucleotide sequences as well as the amino acid sequences originally used to construct the current COG and arCOG databases. We make available three comprehensive single XML files containing the complete databases, including all sequence information. In addition, we provide a web interface for browsing the NUCOCOG database and retrieving sequences. The database is accessible at . CONCLUSION: NUCOCOG makes it possible to analyze any sequence-related property in the context of the COG and arCOG framework simply by applying a scripting language such as Perl to a single, albeit large, XML document.
format Text
id pubmed-2588464
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2588464 2008-11-27 BMC Bioinformatics Database BioMed Central 2008-11-13 Text en Copyright © 2008 Meereis and Kaufmann; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Database
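
The abstract's conclusion, that a sequence-related property can be analyzed by running a script over one large XML document, might look roughly like the following. The abstract names Perl; this sketch uses Python instead. The element names (entry, cog, nucleotide) are hypothetical placeholders, since this record does not describe the actual NUCOCOG schema, and GC content is chosen only as an example of a sequence-related property. The file is streamed rather than loaded whole because the abstract describes the document as large.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical element names: the real NUCOCOG schema is not given in
# this record and may differ.
ENTRY_TAG = "entry"
COG_TAG = "cog"
NUC_TAG = "nucleotide"


def gc_content_by_cog(source):
    """Stream an XML document and return the pooled GC fraction per COG id."""
    totals = defaultdict(lambda: [0, 0])  # cog id -> [GC count, total length]
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != ENTRY_TAG:
            continue
        cog_id = elem.findtext(COG_TAG, default="").strip()
        seq = elem.findtext(NUC_TAG, default="").strip().upper()
        if cog_id and seq:
            totals[cog_id][0] += seq.count("G") + seq.count("C")
            totals[cog_id][1] += len(seq)
        # Drop the entry's children and text once it has been processed.
        elem.clear()
    return {cog: gc / length for cog, (gc, length) in totals.items()}


# Tiny in-memory stand-in for the real file (made-up sequences).
SAMPLE = b"""<database>
  <entry><cog>COG0001</cog><nucleotide>ATGCGC</nucleotide></entry>
  <entry><cog>COG0001</cog><nucleotide>ATAT</nucleotide></entry>
  <entry><cog>COG0002</cog><nucleotide>GGCC</nucleotide></entry>
</database>"""

result = gc_content_by_cog(io.BytesIO(SAMPLE))
for cog, fraction in sorted(result.items()):
    print(cog, round(fraction, 3))
```

For a real file, a path or an open binary file object can be passed in place of the in-memory sample; the tag constants would need to be adjusted to whatever the downloaded XML actually uses.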