
How Many 3D Structures Do We Need to Train a Predictor?

Bibliographic Details
Main authors:	Bagos, Pantelis G.; Tsaousis, Georgios N.; Hamodrakas, Stavros J.
Format:	Online Article Text
Language:	English
Published:	Elsevier 2009
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5054404/ https://www.ncbi.nlm.nih.gov/pubmed/19944385 http://dx.doi.org/10.1016/S1672-0229(08)60041-8

Journal:	Genomics Proteomics Bioinformatics
Publication dates:	2009-09; 2009-11-25
Collection:	PubMed
Institution:	National Center for Biotechnology Information
Record ID:	pubmed-5054404
Record datestamp:	2016-10-14
Record format:	MEDLINE/PubMed
License:	© 2009 Beijing Institute of Genomics. This is an open access article under the CC BY-NC-SA license (http://creativecommons.org/licenses/by-nc-sa/3.0/).
Abstract:	It has been shown that progress in the determination of membrane protein structures grows exponentially, with approximately the same growth rate as that of water-soluble proteins. To investigate the effect of this growth on the performance of prediction algorithms for both α-helical and β-barrel membrane proteins, we conducted a prospective study based on historical records. We trained separate hidden Markov models with training sets of different sizes and evaluated their performance on topology prediction for the two classes of transmembrane proteins. We show that the existing top-scoring algorithms for predicting the transmembrane segments of α-helical membrane proteins perform slightly better than those for β-barrel outer membrane proteins in all measures of accuracy. With the same rationale, a meta-analysis of the performance of secondary structure prediction algorithms indicates that existing algorithmic techniques cannot be further improved by just adding more non-homologous sequences to the training sets. The upper limit for secondary structure prediction is estimated to be no more than 70% of correctly predicted residues for single-sequence-based methods and 80% for multiple-sequence-based ones. Therefore, we should concentrate our efforts on utilizing new techniques for the development of even better-scoring predictors.
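
The abstract reports accuracy as the percentage of correctly predicted residues. As a minimal sketch of that measure only (not the authors' evaluation code), the Python snippet below compares a predicted label string with an observed one, position by position. The function name `per_residue_accuracy`, the three-letter label alphabet (`i` = inside, `M` = membrane, `o` = outside), and the example strings are all assumptions made for illustration and do not come from the record.

```python
def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of positions where the two label strings agree."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have the same length")
    if not observed:
        raise ValueError("label strings must not be empty")
    # Count positions where the predicted label matches the observed label.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical labels for a 14-residue protein (illustrative data only).
observed = "iiiiMMMMMMoooo"
predicted = "iiiMMMMMMMoooo"

accuracy = per_residue_accuracy(predicted, observed)
print(f"{accuracy:.1%}")  # 13 of 14 positions agree
```

A per-residue score like this says nothing about whether whole segments or the overall topology are correct, which is why the abstract refers to several measures of accuracy.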
