HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot

The growth in the number of completely sequenced microbial genomes (bacterial and archaeal) has generated a need for a procedure that provides UniProtKB/Swiss-Prot-quality annotation to as many protein sequences as possible. We have devised a semi-automated system, HAMAP (High-quality Automated and...

Bibliographic Details
Main authors: Lima, Tania; Auchincloss, Andrea H.; Coudert, Elisabeth; Keller, Guillaume; Michoud, Karine; Rivoire, Catherine; Bulliard, Virginie; de Castro, Edouard; Lachaize, Corinne; Baratin, Delphine; Phan, Isabelle; Bougueleret, Lydie; Bairoch, Amos
Format: Text
Language: English
Published: Oxford University Press, 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686602/
https://www.ncbi.nlm.nih.gov/pubmed/18849571
http://dx.doi.org/10.1093/nar/gkn661
author Lima, Tania
Auchincloss, Andrea H.
Coudert, Elisabeth
Keller, Guillaume
Michoud, Karine
Rivoire, Catherine
Bulliard, Virginie
de Castro, Edouard
Lachaize, Corinne
Baratin, Delphine
Phan, Isabelle
Bougueleret, Lydie
Bairoch, Amos
collection PubMed
description The growth in the number of completely sequenced microbial genomes (bacterial and archaeal) has generated a need for a procedure that provides UniProtKB/Swiss-Prot-quality annotation to as many protein sequences as possible. We have devised a semi-automated system, HAMAP (High-quality Automated and Manual Annotation of microbial Proteomes), that uses manually built annotation templates for protein families to propagate annotation to all members of manually defined protein families, using very strict criteria. The HAMAP system is composed of two databases, the proteome database and the family database, and of an automatic annotation pipeline. The proteome database comprises biological and sequence information for each completely sequenced microbial proteome, and it offers several tools for CDS searches, BLAST options and retrieval of specific sets of proteins. The family database currently comprises more than 1500 manually curated protein families and their annotation templates that are used to annotate proteins that belong to one of the HAMAP families. On the HAMAP website, individual sequences as well as whole genomes can be scanned against all HAMAP families. The system provides warnings for the absence of conserved amino acid residues, unusual sequence length, etc. Thanks to the implementation of HAMAP, more than 200 000 microbial proteins have been fully annotated in UniProtKB/Swiss-Prot (HAMAP website: http://www.expasy.org/sprot/hamap).
format Text
id pubmed-2686602
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2686602 2009-06-15 Nucleic Acids Res Articles Oxford University Press 2009-01 2008-10-11 /pmc/articles/PMC2686602/ /pubmed/18849571 http://dx.doi.org/10.1093/nar/gkn661 Text en © 2008 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686602/
https://www.ncbi.nlm.nih.gov/pubmed/18849571
http://dx.doi.org/10.1093/nar/gkn661