Identification and characterization of subfamily-specific signatures in a large protein superfamily by a hidden Markov model approach
BACKGROUND: Most profile and motif databases strive to classify protein sequences into a broad spectrum of protein families. The next step of such database studies should include the development of classification systems capable of distinguishing between subfamilies within a structurally and functio...
Main authors: | Truong, Kevin, Ikura, Mitsuhiko |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2002 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC65048/ https://www.ncbi.nlm.nih.gov/pubmed/11818024 http://dx.doi.org/10.1186/1471-2105-3-1 |
_version_ | 1782120158470340608 |
---|---|
author | Truong, Kevin Ikura, Mitsuhiko |
author_sort | Truong, Kevin |
collection | PubMed |
description | BACKGROUND: Most profile and motif databases strive to classify protein sequences into a broad spectrum of protein families. The next step of such database studies should include the development of classification systems capable of distinguishing between subfamilies within a structurally and functionally diverse superfamily. This would be helpful in elucidating sequence-structure-function relationships of proteins. RESULTS: Here, we present a method to assign sequences to subfamilies by employing hidden Markov models (HMMs) to find windows of residues that are distinct among subfamilies (called signatures). The method starts with a multiple sequence alignment (MSA) of the subfamily. Then, we build an HMM database representing all sliding windows of the MSA of a fixed size. Finally, we construct an HMM histogram of the matches of each sliding window in the entire superfamily. To illustrate the efficacy of the method, we have applied the analysis to find subfamily signatures in two well-studied superfamilies: the cadherin and the EF-hand protein superfamilies. As a corollary, the HMM histograms of the analyzed subfamilies revealed information about their Ca(2+) binding sites and loops. CONCLUSIONS: The method is used to create HMM databases to diagnose subfamilies of protein superfamilies that complement broad profile and motif databases such as BLOCKS, PROSITE, Pfam, SMART, PRINTS and InterPro. |
format | Text |
id | pubmed-65048 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2002 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-65048 2002-01-31 Identification and characterization of subfamily-specific signatures in a large protein superfamily by a hidden Markov model approach Truong, Kevin Ikura, Mitsuhiko BMC Bioinformatics Methodology article BACKGROUND: Most profile and motif databases strive to classify protein sequences into a broad spectrum of protein families. The next step of such database studies should include the development of classification systems capable of distinguishing between subfamilies within a structurally and functionally diverse superfamily. This would be helpful in elucidating sequence-structure-function relationships of proteins. RESULTS: Here, we present a method to assign sequences to subfamilies by employing hidden Markov models (HMMs) to find windows of residues that are distinct among subfamilies (called signatures). The method starts with a multiple sequence alignment (MSA) of the subfamily. Then, we build an HMM database representing all sliding windows of the MSA of a fixed size. Finally, we construct an HMM histogram of the matches of each sliding window in the entire superfamily. To illustrate the efficacy of the method, we have applied the analysis to find subfamily signatures in two well-studied superfamilies: the cadherin and the EF-hand protein superfamilies. As a corollary, the HMM histograms of the analyzed subfamilies revealed information about their Ca(2+) binding sites and loops. CONCLUSIONS: The method is used to create HMM databases to diagnose subfamilies of protein superfamilies that complement broad profile and motif databases such as BLOCKS, PROSITE, Pfam, SMART, PRINTS and InterPro. BioMed Central 2002-01-10 /pmc/articles/PMC65048/ /pubmed/11818024 http://dx.doi.org/10.1186/1471-2105-3-1 Text en Copyright ©2002 Truong and Ikura; licensee BioMed Central Ltd.
This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL. |
spellingShingle | Methodology article Truong, Kevin Ikura, Mitsuhiko Identification and characterization of subfamily-specific signatures in a large protein superfamily by a hidden Markov model approach |
title | Identification and characterization of subfamily-specific signatures in a large protein superfamily by a hidden Markov model approach |
topic | Methodology article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC65048/ https://www.ncbi.nlm.nih.gov/pubmed/11818024 http://dx.doi.org/10.1186/1471-2105-3-1 |
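
The three steps named in the description (subfamily MSA, a model per fixed-size sliding window, a histogram of matches across the superfamily) can be sketched roughly as below. This is only an illustration under loud assumptions: the record does not say which HMM software or scoring cutoff the authors used, so the sketch swaps in a simple ungapped log-odds profile in place of a true HMM, and every name in it (`sliding_windows`, `build_profile`, `best_score`, `window_histogram`, `threshold`, `pseudocount`) is hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def sliding_windows(msa, size):
    """Yield (start, rows) for every window of `size` alignment columns."""
    length = len(msa[0])
    for start in range(length - size + 1):
        yield start, [row[start:start + size] for row in msa]


def build_profile(window_rows, pseudocount=1.0):
    """Per-column log-odds scores against a uniform background.

    Simplified stand-in for a window HMM (assumption): no insert or
    delete states, and gap characters are simply ignored.
    """
    background = 1.0 / len(ALPHABET)
    profile = []
    for col in range(len(window_rows[0])):
        counts = Counter(r[col] for r in window_rows if r[col] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        profile.append({
            aa: math.log((counts[aa] + pseudocount) / total / background)
            for aa in ALPHABET
        })
    return profile


def best_score(profile, sequence):
    """Best ungapped score of the profile at any offset in `sequence`."""
    size = len(profile)
    best = float("-inf")
    for i in range(len(sequence) - size + 1):
        # Residues outside ALPHABET contribute a neutral score of 0.
        score = sum(profile[j].get(sequence[i + j], 0.0) for j in range(size))
        best = max(best, score)
    return best


def window_histogram(subfamily_msa, superfamily_seqs, size, threshold):
    """Map each window start to the number of superfamily sequences it hits."""
    histogram = {}
    for start, rows in sliding_windows(subfamily_msa, size):
        profile = build_profile(rows)
        histogram[start] = sum(
            1 for seq in superfamily_seqs
            if best_score(profile, seq) >= threshold
        )
    return histogram


if __name__ == "__main__":
    toy_msa = ["ACDE", "ACDE"]              # toy subfamily alignment
    toy_superfamily = ["ACDE", "ACWW", "WWWW"]
    print(window_histogram(toy_msa, toy_superfamily, size=2, threshold=1.0))
```

In a histogram of this kind, windows that match few sequences outside the subfamily would be the candidate signatures, while windows matching most of the superfamily reflect shared features. The window size and threshold here are arbitrary toy values, not values taken from the article.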