Gene3D: Multi-domain annotations for protein sequence and comparative genome analysis

Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of protein domain structure annotations for protein sequences. Domains are predicted using a library of profile HMMs from 2738 CATH superfamilies. Gene3D assigns domain annotations to Ensembl and UniProt sequence sets including >6000 cellular...

Bibliographic Details
Main authors: Lees, Jonathan G., Lee, David, Studer, Romain A., Dawson, Natalie L., Sillitoe, Ian, Das, Sayoni, Yeats, Corin, Dessailly, Benoit H., Rentzsch, Robert, Orengo, Christine A.
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965083/
https://www.ncbi.nlm.nih.gov/pubmed/24270792
http://dx.doi.org/10.1093/nar/gkt1205
_version_ 1782479294766776320
author Lees, Jonathan G.
Lee, David
Studer, Romain A.
Dawson, Natalie L.
Sillitoe, Ian
Das, Sayoni
Yeats, Corin
Dessailly, Benoit H.
Rentzsch, Robert
Orengo, Christine A.
author_sort Lees, Jonathan G.
collection PubMed
description Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of protein domain structure annotations for protein sequences. Domains are predicted using a library of profile HMMs from 2738 CATH superfamilies. Gene3D assigns domain annotations to Ensembl and UniProt sequence sets including >6000 cellular genomes and >20 million unique protein sequences. This represents an increase of 45% in the number of protein sequences since our last publication. Thanks to improvements in the underlying data and pipeline, we see large increases in the domain coverage of sequences. We have expanded this coverage by integrating Pfam and SUPERFAMILY domain annotations, and we now resolve domain overlaps to provide highly comprehensive composite multi-domain architectures. To make these data more accessible for comparative genome analyses, we have developed novel search algorithms for searching genomes to identify related multi-domain architectures. In addition to providing domain family annotations, we have now developed a pipeline for 3D homology modelling of domains in Gene3D. This has been applied to the human genome and will be rolled out to other major organisms over the next year.
format Online
Article
Text
id pubmed-3965083
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-39650832014-03-25 Gene3D: Multi-domain annotations for protein sequence and comparative genome analysis Lees, Jonathan G. Lee, David Studer, Romain A. Dawson, Natalie L. Sillitoe, Ian Das, Sayoni Yeats, Corin Dessailly, Benoit H. Rentzsch, Robert Orengo, Christine A. Nucleic Acids Res II. Protein sequence and structure, motifs and domains Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of protein domain structure annotations for protein sequences. Domains are predicted using a library of profile HMMs from 2738 CATH superfamilies. Gene3D assigns domain annotations to Ensembl and UniProt sequence sets including >6000 cellular genomes and >20 million unique protein sequences. This represents an increase of 45% in the number of protein sequences since our last publication. Thanks to improvements in the underlying data and pipeline, we see large increases in the domain coverage of sequences. We have expanded this coverage by integrating Pfam and SUPERFAMILY domain annotations, and we now resolve domain overlaps to provide highly comprehensive composite multi-domain architectures. To make these data more accessible for comparative genome analyses, we have developed novel search algorithms for searching genomes to identify related multi-domain architectures. In addition to providing domain family annotations, we have now developed a pipeline for 3D homology modelling of domains in Gene3D. This has been applied to the human genome and will be rolled out to other major organisms over the next year. Oxford University Press 2014-01-01 2013-11-21 /pmc/articles/PMC3965083/ /pubmed/24270792 http://dx.doi.org/10.1093/nar/gkt1205 Text en © The Author(s) 2013. Published by Oxford University Press. 
http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle II. Protein sequence and structure, motifs and domains
Lees, Jonathan G.
Lee, David
Studer, Romain A.
Dawson, Natalie L.
Sillitoe, Ian
Das, Sayoni
Yeats, Corin
Dessailly, Benoit H.
Rentzsch, Robert
Orengo, Christine A.
Gene3D: Multi-domain annotations for protein sequence and comparative genome analysis
title Gene3D: Multi-domain annotations for protein sequence and comparative genome analysis
topic II. Protein sequence and structure, motifs and domains
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965083/
https://www.ncbi.nlm.nih.gov/pubmed/24270792
http://dx.doi.org/10.1093/nar/gkt1205