
Analysis of an optimal hidden Markov model for secondary structure prediction

BACKGROUND: Secondary structure prediction is a useful first step toward 3D structure prediction. A number of successful secondary structure prediction methods use neural networks, but neural networks are unfortunately not intuitively interpretable. In contrast, hidden Markov models are graphic...


Bibliographic Details
Main authors: Martin, Juliette; Gibrat, Jean-François; Rodolphe, François
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1769381/
https://www.ncbi.nlm.nih.gov/pubmed/17166267
http://dx.doi.org/10.1186/1472-6807-6-25
author Martin, Juliette
Gibrat, Jean-François
Rodolphe, François
collection PubMed
description BACKGROUND: Secondary structure prediction is a useful first step toward 3D structure prediction. A number of successful secondary structure prediction methods use neural networks, but neural networks are unfortunately not intuitively interpretable. In contrast, hidden Markov models are graphical, interpretable models, and they have been used successfully in many bioinformatics applications. Because they offer a strong statistical foundation and allow model interpretation, we propose a method based on hidden Markov models. RESULTS: Our HMM is designed without prior knowledge. It is chosen from a collection of models of increasing size, using statistical and accuracy criteria. The resulting model has 36 hidden states: 15 that model α-helices, 12 that model coil and 9 that model β-strands. Connections between hidden states and state emission probabilities reflect the organization of protein structures into secondary structure segments. We start by analyzing the model features and show how they offer a new view of local structures. We then use the model for secondary structure prediction. It performs well on single sequences, with a Q3 score of 68.8%, more than one point above PSIPRED prediction on single sequences. A straightforward extension of the method allows the use of multiple sequence alignments, raising the Q3 score to 75.5%. CONCLUSION: The hidden Markov model presented here achieves valuable prediction results using only a limited number of parameters. It provides an interpretable framework for protein secondary structure architecture. Furthermore, it can be used as a tool for generating protein sequences with a given secondary structure content.
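The Q3 figures quoted in the abstract are per-residue accuracies over three classes (helix, strand, coil). As a rough illustration only, the sketch below shows how a path through 36 hidden states could be collapsed to those three classes and scored. The state numbering (helix states first, then coil, then strand) and the names `STATE_CLASS`, `states_to_classes` and `q3_score` are assumptions made for this example; they are not taken from the article.

```python
# Hypothetical layout: the abstract gives only the counts (15 helix, 12 coil,
# 9 strand states); the ordering of states used here is an assumption.
STATE_CLASS = "H" * 15 + "C" * 12 + "E" * 9  # index = hidden state, value = class


def states_to_classes(path):
    """Collapse a sequence of hidden-state indices (0..35) to H/C/E labels."""
    return "".join(STATE_CLASS[state] for state in path)


def q3_score(predicted, observed):
    """Return the percentage of residues whose 3-state label is correct."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Toy state path, not real model output.
    predicted = states_to_classes([0, 3, 14, 15, 20, 27, 30, 35])
    observed = "HHHCCCEE"
    print(predicted, q3_score(predicted, observed))
```

In this toy case seven of the eight residues agree, so the score is 87.5. A real evaluation would average over a test set of proteins with reference assignments.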
format Text
id pubmed-1769381
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1769381 2007-01-16 Analysis of an optimal hidden Markov model for secondary structure prediction Martin, Juliette Gibrat, Jean-François Rodolphe, François BMC Struct Biol Research Article
BioMed Central 2006-12-13 /pmc/articles/PMC1769381/ /pubmed/17166267 http://dx.doi.org/10.1186/1472-6807-6-25 Text en Copyright © 2006 Martin et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Analysis of an optimal hidden Markov model for secondary structure prediction
topic Research Article