Pseudofam: the pseudogene families database
Pseudofam (http://pseudofam.pseudogene.org) is a database of pseudogene families based on the protein families from the Pfam database. It provides resources for analyzing the family structure of pseudogenes including query tools, statistical summaries and sequence alignments. The current version of...
Main authors: | Lam, Hugo Y. K.; Khurana, Ekta; Fang, Gang; Cayting, Philip; Carriero, Nicholas; Cheung, Kei-Hoi; Gerstein, Mark B. |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2009 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686518/ https://www.ncbi.nlm.nih.gov/pubmed/18957444 http://dx.doi.org/10.1093/nar/gkn758 |
_version_ | 1782167426215968768 |
---|---|
author | Lam, Hugo Y. K. Khurana, Ekta Fang, Gang Cayting, Philip Carriero, Nicholas Cheung, Kei-Hoi Gerstein, Mark B. |
author_facet | Lam, Hugo Y. K. Khurana, Ekta Fang, Gang Cayting, Philip Carriero, Nicholas Cheung, Kei-Hoi Gerstein, Mark B. |
author_sort | Lam, Hugo Y. K. |
collection | PubMed |
description | Pseudofam (http://pseudofam.pseudogene.org) is a database of pseudogene families based on the protein families from the Pfam database. It provides resources for analyzing the family structure of pseudogenes including query tools, statistical summaries and sequence alignments. The current version of Pseudofam contains more than 125 000 pseudogenes identified from 10 eukaryotic genomes and aligned within nearly 3000 families (approximately one-third of the total families in PfamA). Pseudofam uses a large-scale parallelized homology search algorithm (implemented as an extension of the PseudoPipe pipeline) to identify pseudogenes. Each identified pseudogene is assigned to its parent protein family and subsequently aligned to each other by transferring the parent domain alignments from the Pfam family. Pseudogenes are also given additional annotation based on an ontology, reflecting their mode of creation and subsequent history. In particular, our annotation highlights the association of pseudogene families with genomic features, such as segmental duplications. In addition, pseudogene families are associated with key statistics, which identify outlier families with an unusual degree of pseudogenization. The statistics also show how the number of genes and pseudogenes in families correlates across different species. Overall, they highlight the fact that housekeeping families tend to be enriched with a large number of pseudogenes. |
format | Text |
id | pubmed-2686518 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2686518 2009-06-15 Pseudofam: the pseudogene families database Lam, Hugo Y. K. Khurana, Ekta Fang, Gang Cayting, Philip Carriero, Nicholas Cheung, Kei-Hoi Gerstein, Mark B. Nucleic Acids Res Articles Oxford University Press 2009-01 2008-10-28 /pmc/articles/PMC2686518/ /pubmed/18957444 http://dx.doi.org/10.1093/nar/gkn758 Text en Published by Oxford University Press 2008 http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Articles Lam, Hugo Y. K. Khurana, Ekta Fang, Gang Cayting, Philip Carriero, Nicholas Cheung, Kei-Hoi Gerstein, Mark B. Pseudofam: the pseudogene families database |
title | Pseudofam: the pseudogene families database |
title_full | Pseudofam: the pseudogene families database |
title_fullStr | Pseudofam: the pseudogene families database |
title_full_unstemmed | Pseudofam: the pseudogene families database |
title_short | Pseudofam: the pseudogene families database |
title_sort | pseudofam: the pseudogene families database |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686518/ https://www.ncbi.nlm.nih.gov/pubmed/18957444 http://dx.doi.org/10.1093/nar/gkn758 |