Gene3D: Multi-domain annotations for protein sequence and comparative genome analysis
Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of protein domain structure annotations for protein sequences. Domains are predicted using a library of profile HMMs from 2738 CATH superfamilies. Gene3D assigns domain annotations to Ensembl and UniProt sequence sets including >6000 cellular...
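Main authors: | Lees, Jonathan G.; Lee, David; Studer, Romain A.; Dawson, Natalie L.; Sillitoe, Ian; Das, Sayoni; Yeats, Corin; Dessailly, Benoit H.; Rentzsch, Robert; Orengo, Christine A. |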
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2014 |
Subjects: | II. Protein sequence and structure, motifs and domains |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965083/ https://www.ncbi.nlm.nih.gov/pubmed/24270792 http://dx.doi.org/10.1093/nar/gkt1205 |
_version_ | 1782479294766776320 |
---|---|
author | Lees, Jonathan G.; Lee, David; Studer, Romain A.; Dawson, Natalie L.; Sillitoe, Ian; Das, Sayoni; Yeats, Corin; Dessailly, Benoit H.; Rentzsch, Robert; Orengo, Christine A. |
author_sort | Lees, Jonathan G. |
collection | PubMed |
description | Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of protein domain structure annotations for protein sequences. Domains are predicted using a library of profile HMMs from 2738 CATH superfamilies. Gene3D assigns domain annotations to Ensembl and UniProt sequence sets including >6000 cellular genomes and >20 million unique protein sequences. This represents an increase of 45% in the number of protein sequences since our last publication. Thanks to improvements in the underlying data and pipeline, we see large increases in the domain coverage of sequences. We have expanded this coverage by integrating Pfam and SUPERFAMILY domain annotations, and we now resolve domain overlaps to provide highly comprehensive composite multi-domain architectures. To make these data more accessible for comparative genome analyses, we have developed novel search algorithms for searching genomes to identify related multi-domain architectures. In addition to providing domain family annotations, we have now developed a pipeline for 3D homology modelling of domains in Gene3D. This has been applied to the human genome and will be rolled out to other major organisms over the next year. |
format | Online Article Text |
id | pubmed-3965083 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3965083 2014-03-25 Gene3D: Multi-domain annotations for protein sequence and comparative genome analysis. Lees, Jonathan G.; Lee, David; Studer, Romain A.; Dawson, Natalie L.; Sillitoe, Ian; Das, Sayoni; Yeats, Corin; Dessailly, Benoit H.; Rentzsch, Robert; Orengo, Christine A. Nucleic Acids Res. II. Protein sequence and structure, motifs and domains. Oxford University Press 2014-01-01 2013-11-21 /pmc/articles/PMC3965083/ /pubmed/24270792 http://dx.doi.org/10.1093/nar/gkt1205 Text en © The Author(s) 2013. Published by Oxford University Press. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Gene3D: Multi-domain annotations for protein sequence and comparative genome analysis |
topic | II. Protein sequence and structure, motifs and domains |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965083/ https://www.ncbi.nlm.nih.gov/pubmed/24270792 http://dx.doi.org/10.1093/nar/gkt1205 |