
Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information

BACKGROUND: Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used...


Bibliographic Details
Main Authors:	Pollastri, Gianluca; Martin, Alberto JM; Mooney, Catherine; Vullo, Alessandro
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1913928/ https://www.ncbi.nlm.nih.gov/pubmed/17570843 http://dx.doi.org/10.1186/1471-2105-8-201

author	Pollastri, Gianluca; Martin, Alberto JM; Mooney, Catherine; Vullo, Alessandro
author_sort	Pollastri, Gianluca
collection	PubMed
description	BACKGROUND: Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used in protein analysis even when homology is available, largely because homology modelling is lower throughput than, say, secondary structure prediction. Nonetheless, predictors of secondary structure and solvent accessibility are virtually always ab initio. RESULTS: Here we develop high-throughput machine learning systems for the prediction of protein secondary structure and solvent accessibility that exploit homology to proteins of known structure, where available, in the form of simple structural frequency profiles extracted from sets of PDB templates. We compare these systems with their state-of-the-art ab initio counterparts, and with a number of baselines in which secondary structures and solvent accessibilities are extracted directly from the templates. We show that structural information from templates greatly improves secondary structure and solvent accessibility prediction quality, and that, on average, the systems significantly enrich the information contained in the templates. For sequence similarity exceeding 30%, secondary structure prediction quality is approximately 90%, close to its theoretical maximum, and 2-class solvent accessibility prediction quality is roughly 85%. Gains are robust with respect to template selection noise, and significant for marginal sequence similarity and for short alignments, supporting the claim that these improved predictions may prove beneficial beyond the case in which clear homology is available. CONCLUSION: The predictive systems are publicly available at the address .
format	Text
id	pubmed-1913928
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1913928 2007-07-11 BMC Bioinformatics Methodology Article BioMed Central 2007-06-14 /pmc/articles/PMC1913928/ /pubmed/17570843 http://dx.doi.org/10.1186/1471-2105-8-201 Text en Copyright © 2007 Pollastri et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1913928/ https://www.ncbi.nlm.nih.gov/pubmed/17570843 http://dx.doi.org/10.1186/1471-2105-8-201


Similar Items