Using 3D Hidden Markov Models that explicitly represent spatial coordinates to model and compare protein structures

BACKGROUND: Hidden Markov Models (HMMs) have proven very useful in computational biology for such applications as sequence pattern matching, gene-finding, and structure prediction. Thus far, however, they have been confined to representing 1D sequence (or the aspects of structure that could be repre...

Bibliographic Details
Main Authors: Alexandrov, Vadim, Gerstein, Mark
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC344530/
https://www.ncbi.nlm.nih.gov/pubmed/14715091
http://dx.doi.org/10.1186/1471-2105-5-2
author Alexandrov, Vadim
Gerstein, Mark
collection PubMed
description BACKGROUND: Hidden Markov Models (HMMs) have proven very useful in computational biology for applications such as sequence pattern matching, gene finding, and structure prediction. Thus far, however, they have been confined to representing 1D sequences (or those aspects of structure that can be represented by character strings). RESULTS: We develop an HMM formalism that explicitly uses 3D coordinates in its match states. The match states are modeled by 3D Gaussian distributions centered on the mean coordinate position of each alpha carbon in a large structural alignment. The transition probabilities depend on the spread of the neighboring match states and on the number of gaps found in the structural alignment. We also develop methods for aligning query structures against 3D HMMs and scoring the result probabilistically. For 1D HMMs these tasks are accomplished by the Viterbi and forward algorithms. However, these algorithms will not work in unmodified form for the 3D problem because of the non-local nature of structural alignment, so we develop extensions of them for the 3D case. Several applications of 3D HMMs to protein structure classification are reported. A good separation of scores for different fold families suggests that the described construct is useful for protein structure analysis. CONCLUSION: We have created a rigorous 3D HMM representation for protein structures and implemented a complete set of routines for building 3D HMMs in C and Perl. The code is freely available from , and at this site we also have a simple prototype server to demonstrate the features of the described approach.
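The idea of a match state modeled by a 3D Gaussian can be illustrated with a short sketch. This is not the authors' C/Perl code. It assumes the structures are already superimposed in a common frame, that the alignment has no gaps, and that each Gaussian is isotropic (one variance per position); the function names and the `min_var` floor are hypothetical choices made for illustration. Transition probabilities and the extended Viterbi and forward algorithms are not covered.

```python
import math

def build_match_states(aligned_structures, min_var=0.01):
    """Build one (mean, variance) match state per aligned position.

    aligned_structures: list of structures, each a list of (x, y, z)
    alpha-carbon coordinates. Assumption: all structures have the same
    length and are already superimposed (no gaps).
    min_var: hypothetical floor that keeps the variance strictly positive.
    """
    n_struct = len(aligned_structures)
    n_pos = len(aligned_structures[0])
    states = []
    for i in range(n_pos):
        points = [s[i] for s in aligned_structures]
        # Mean coordinate of this alpha carbon across the alignment.
        mean = tuple(sum(p[k] for p in points) / n_struct for k in range(3))
        # Mean squared distance to the mean, split evenly over 3 axes
        # (isotropic Gaussian assumption).
        msd = sum(
            sum((p[k] - mean[k]) ** 2 for k in range(3)) for p in points
        ) / n_struct
        var = max(msd / 3.0, min_var)
        states.append((mean, var))
    return states

def log_emission(state, point):
    """Log density of an isotropic 3D Gaussian at the given point."""
    mean, var = state
    d2 = sum((point[k] - mean[k]) ** 2 for k in range(3))
    return -1.5 * math.log(2.0 * math.pi * var) - d2 / (2.0 * var)

def score_query(states, query):
    """Sum of log emissions for a query aligned position by position."""
    return sum(log_emission(s, p) for s, p in zip(states, query))

if __name__ == "__main__":
    # Toy data: two superimposed two-residue "structures" (made-up values).
    family = [
        [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)],
        [(0.2, 0.0, 0.0), (3.8, 0.2, 0.0)],
    ]
    states = build_match_states(family)
    print(score_query(states, [(0.1, 0.0, 0.0), (3.8, 0.1, 0.0)]))
    print(score_query(states, [(1.0, 1.0, 1.0), (5.0, 1.0, 1.0)]))
```

In this sketch a query whose alpha carbons lie close to the family means receives a higher log score than one that lies far from them, which is the intuition behind separating fold families by score.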
format Text
id pubmed-344530
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3445302004-02-24 Using 3D Hidden Markov Models that explicitly represent spatial coordinates to model and compare protein structures Alexandrov, Vadim Gerstein, Mark BMC Bioinformatics Research Article
BioMed Central 2004-01-09 /pmc/articles/PMC344530/ /pubmed/14715091 http://dx.doi.org/10.1186/1471-2105-5-2 Text en Copyright © 2004 Alexandrov and Gerstein; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
title Using 3D Hidden Markov Models that explicitly represent spatial coordinates to model and compare protein structures
topic Research Article