Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models

The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications...

Bibliographic Details
Main authors: Xie, Lei; Bourne, Philip E
Format: Text
Language: English
Published: Public Library of Science 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1188274/
https://www.ncbi.nlm.nih.gov/pubmed/16118666
http://dx.doi.org/10.1371/journal.pcbi.0010031
author Xie, Lei
Bourne, Philip E
author_facet Xie, Lei
Bourne, Philip E
author_sort Xie, Lei
collection PubMed
description The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the “most wanted list” that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.
format Text
id pubmed-1188274
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-1188274 2005-09-12 Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models Xie, Lei Bourne, Philip E PLoS Comput Biol Research Article Public Library of Science 2005-08 2005-08-19 /pmc/articles/PMC1188274/ /pubmed/16118666 http://dx.doi.org/10.1371/journal.pcbi.0010031 Text en Copyright: © 2005 Xie and Bourne. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models
topic Research Article