
An estimated 5% of new protein structures solved today represent a new Pfam family

High-resolution structural knowledge is key to understanding how proteins function at the molecular level. The number of entries in the Protein Data Bank (PDB), the repository of all publicly available protein structures, continues to increase, with more than 8000 structures released in 2012 alone....

Bibliographic Details
Main authors: Mistry, Jaina; Kloppmann, Edda; Rost, Burkhard; Punta, Marco
Format: Online Article Text
Language: English
Published: International Union of Crystallography 2013
Subjects: Research Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3817691/
https://www.ncbi.nlm.nih.gov/pubmed/24189229
http://dx.doi.org/10.1107/S0907444913027157
author Mistry, Jaina
Kloppmann, Edda
Rost, Burkhard
Punta, Marco
collection PubMed
description High-resolution structural knowledge is key to understanding how proteins function at the molecular level. The number of entries in the Protein Data Bank (PDB), the repository of all publicly available protein structures, continues to increase, with more than 8000 structures released in 2012 alone. The authors of this article have studied how structural coverage of the protein-sequence space has changed over time by monitoring the number of Pfam families that acquired their first representative structure each year from 1976 to 2012. Twenty years ago, for every 100 new PDB entries released, an estimated 20 Pfam families acquired their first structure. By 2012, this decreased to only about five families per 100 structures. The reasons behind the slower pace at which previously uncharacterized families are being structurally covered were investigated. It was found that although more than 50% of current Pfam families are still without a structural representative, this set is enriched in families that are small, functionally uncharacterized or rich in problem features such as intrinsically disordered and transmembrane regions. While these are important constraints, the reasons why it may not yet be time to give up the pursuit of a targeted but more comprehensive structural coverage of the protein-sequence space are discussed.
format Online
Article
Text
id pubmed-3817691
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher International Union of Crystallography
record_format MEDLINE/PubMed
spelling pubmed-3817691 2013-11-06 Acta Crystallogr D Biol Crystallogr Research Papers International Union of Crystallography 2013-11-01 2013-10-12 /pmc/articles/PMC3817691/ /pubmed/24189229 http://dx.doi.org/10.1107/S0907444913027157 Text en © Mistry et al. 2013 http://creativecommons.org/licenses/by/2.0/uk/ This is an open-access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
title An estimated 5% of new protein structures solved today represent a new Pfam family
topic Research Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3817691/
https://www.ncbi.nlm.nih.gov/pubmed/24189229
http://dx.doi.org/10.1107/S0907444913027157