The Pfam protein families database: towards a more sustainable future
In the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB...
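Main Authors: | Finn, Robert D.; Coggill, Penelope; Eberhardt, Ruth Y.; Eddy, Sean R.; Mistry, Jaina; Mitchell, Alex L.; Potter, Simon C.; Punta, Marco; Qureshi, Matloob; Sangrador-Vegas, Amaia; Salazar, Gustavo A.; Tate, John; Bateman, Alex |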
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2016 |
Subjects: | Database Issue |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702930/ https://www.ncbi.nlm.nih.gov/pubmed/26673716 http://dx.doi.org/10.1093/nar/gkv1344 |
_version_ | 1782408682970021888 |
---|---|
author | Finn, Robert D. Coggill, Penelope Eberhardt, Ruth Y. Eddy, Sean R. Mistry, Jaina Mitchell, Alex L. Potter, Simon C. Punta, Marco Qureshi, Matloob Sangrador-Vegas, Amaia Salazar, Gustavo A. Tate, John Bateman, Alex |
author_sort | Finn, Robert D. |
collection | PubMed |
description | In the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences and species reported on the website restricted to this smaller set. Building families on reference proteomes sequences brings greater stability, which decreases the amount of manual curation required to maintain them. It also reduces the number of sequences displayed on the website, whilst still providing access to many important model organisms. Matches to the full UniProtKB database are, however, still available and Pfam annotations for individual UniProtKB sequences can still be retrieved. Some Pfam entries (1.6%) which have no matches to reference proteomes remain; we are working with UniProt to see if sequences from them can be incorporated into reference proteomes. Pfam-B, the automatically-generated supplement to Pfam, has been removed. The current release (Pfam 29.0) includes 16 295 entries and 559 clans. The facility to view the relationship between families within a clan has been improved by the introduction of a new tool. |
format | Online Article Text |
id | pubmed-4702930 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-47029302016-01-07 The Pfam protein families database: towards a more sustainable future Finn, Robert D. Coggill, Penelope Eberhardt, Ruth Y. Eddy, Sean R. Mistry, Jaina Mitchell, Alex L. Potter, Simon C. Punta, Marco Qureshi, Matloob Sangrador-Vegas, Amaia Salazar, Gustavo A. Tate, John Bateman, Alex Nucleic Acids Res Database Issue In the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences and species reported on the website restricted to this smaller set. Building families on reference proteomes sequences brings greater stability, which decreases the amount of manual curation required to maintain them. It also reduces the number of sequences displayed on the website, whilst still providing access to many important model organisms. Matches to the full UniProtKB database are, however, still available and Pfam annotations for individual UniProtKB sequences can still be retrieved. Some Pfam entries (1.6%) which have no matches to reference proteomes remain; we are working with UniProt to see if sequences from them can be incorporated into reference proteomes. Pfam-B, the automatically-generated supplement to Pfam, has been removed. The current release (Pfam 29.0) includes 16 295 entries and 559 clans. The facility to view the relationship between families within a clan has been improved by the introduction of a new tool. Oxford University Press 2016-01-04 2015-12-15 /pmc/articles/PMC4702930/ /pubmed/26673716 http://dx.doi.org/10.1093/nar/gkv1344 Text en © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. 
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | The Pfam protein families database: towards a more sustainable future |
topic | Database Issue |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702930/ https://www.ncbi.nlm.nih.gov/pubmed/26673716 http://dx.doi.org/10.1093/nar/gkv1344 |