KEGG as a reference resource for gene and protein annotation

KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an integrated database resource for biological interpretation of genome sequences and other high-throughput data. Molecular functions of genes and proteins are associated with ortholog groups and stored in the KEGG Orthology (KO) database....

Bibliographic Details
Main authors: Kanehisa, Minoru; Sato, Yoko; Kawashima, Masayuki; Furumichi, Miho; Tanabe, Mao
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702792/
https://www.ncbi.nlm.nih.gov/pubmed/26476454
http://dx.doi.org/10.1093/nar/gkv1070
author Kanehisa, Minoru
Sato, Yoko
Kawashima, Masayuki
Furumichi, Miho
Tanabe, Mao
collection PubMed
description KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an integrated database resource for biological interpretation of genome sequences and other high-throughput data. Molecular functions of genes and proteins are associated with ortholog groups and stored in the KEGG Orthology (KO) database. The KEGG pathway maps, BRITE hierarchies and KEGG modules are developed as networks of KO nodes, representing high-level functions of the cell and the organism. Currently, more than 4000 complete genomes are annotated with KOs in the KEGG GENES database, which can be used as a reference data set for KO assignment and subsequent reconstruction of KEGG pathways and other molecular networks. As an annotation resource, the following improvements have been made. First, each KO record is re-examined and associated with protein sequence data used in experiments of functional characterization. Second, the GENES database now includes viruses, plasmids, and the addendum category for functionally characterized proteins that are not represented in complete genomes. Third, new automatic annotation servers, BlastKOALA and GhostKOALA, are made available utilizing the non-redundant pangenome data set generated from the GENES database. As a resource for translational bioinformatics, various data sets are created for antimicrobial resistance and drug interaction networks.
format Online
Article
Text
id pubmed-4702792
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4702792 2016-01-07 KEGG as a reference resource for gene and protein annotation Kanehisa, Minoru Sato, Yoko Kawashima, Masayuki Furumichi, Miho Tanabe, Mao Nucleic Acids Res Database Issue Oxford University Press 2016-01-04 2015-10-17 /pmc/articles/PMC4702792/ /pubmed/26476454 http://dx.doi.org/10.1093/nar/gkv1070 Text en © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title KEGG as a reference resource for gene and protein annotation
topic Database Issue