Improving the accuracy of protein secondary structure prediction using structural alignment

BACKGROUND: The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to s...

Bibliographic Details
Main authors: Montgomerie, Scott; Sundararaj, Shan; Gallin, Warren J; Wishart, David S
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1550433/
https://www.ncbi.nlm.nih.gov/pubmed/16774686
http://dx.doi.org/10.1186/1471-2105-7-301
author Montgomerie, Scott
Sundararaj, Shan
Gallin, Warren J
Wishart, David S
collection PubMed
description BACKGROUND: The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to sequence) database comparisons as part of the prediction process. Indeed, given the large size of the Protein Data Bank (>35,000 sequences), the probability of a newly identified sequence having a structural homologue is actually quite high. RESULTS: We have developed a method that performs structure-based sequence alignments as part of the secondary structure prediction process. By mapping the structure of a known homologue (sequence ID >25%) onto the query protein's sequence, it is possible to predict at least a portion of that query protein's secondary structure. By integrating this structural alignment approach with conventional (sequence-based) secondary structure methods and then combining it with a "jury-of-experts" system to generate a consensus result, it is possible to attain very high prediction accuracy. Using a sequence-unique test set of 1644 proteins from EVA, this new method achieves an average Q3 score of 81.3%. Extensive testing indicates this is approximately 4–5% better than any other method currently available. Assessments using non sequence-unique test sets (typical of those used in proteome annotation or structural genomics) indicate that this new method can achieve a Q3 score approaching 88%. CONCLUSION: By using both sequence and structure databases and by exploiting the latest techniques in machine learning it is possible to routinely predict protein secondary structure with an accuracy well above 80%. A program and web server, called PROTEUS, that performs these secondary structure predictions is accessible at . 
For high throughput or batch sequence analyses, the PROTEUS programs, databases (and server) can be downloaded and run locally.
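The abstract's three ideas (mapping a homologue's known structure onto the query through an alignment, falling back to sequence-based predictors elsewhere, and scoring the result with Q3) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from PROTEUS: the function names, the '?' placeholder for unmapped residues, the toy sequences, and in particular the simple majority vote, which only stands in for the "jury-of-experts" step whose actual design is not described in this record.

```python
from collections import Counter


def transfer_structure(aligned_query, aligned_homologue, homologue_ss):
    """Copy a homologue's secondary-structure states onto aligned query residues.

    aligned_query, aligned_homologue: equal-length alignment rows ('-' is a gap).
    homologue_ss: one state (H, E or C) per ungapped homologue residue.
    Returns one state per query residue; '?' marks query residues that are
    aligned to a gap and therefore receive no transferred state.
    """
    if len(aligned_query) != len(aligned_homologue):
        raise ValueError("alignment rows must have equal length")
    mapped = []
    h_pos = 0  # index into the ungapped homologue
    for q_char, h_char in zip(aligned_query, aligned_homologue):
        if q_char != "-":
            mapped.append(homologue_ss[h_pos] if h_char != "-" else "?")
        if h_char != "-":
            h_pos += 1
    return "".join(mapped)


def consensus(transferred, predictions):
    """Keep transferred states; fill '?' positions by majority vote.

    The vote is a hypothetical stand-in for a jury-of-experts combiner.
    """
    result = []
    for i, state in enumerate(transferred):
        if state != "?":
            result.append(state)
        else:
            votes = Counter(p[i] for p in predictions)
            result.append(votes.most_common(1)[0][0])
    return "".join(result)


def q3(predicted, observed):
    """Percentage of residues whose three-state label (H/E/C) is correct."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy data (invented): a 7-residue query aligned to a 6-residue homologue.
transferred = transfer_structure("MKT-AYIA", "MK-LAY-A", "CHHHHC")
combined = consensus(transferred, ["CHHHHHC", "CCHHHCC", "CHHEHHC"])
print(transferred)                           # CH?HH?C
print(combined)                              # CHHHHHC
print(f"{q3(combined, 'CHHHHCC'):.1f}")      # 85.7
```

In this toy case five of the seven query residues receive their state directly from the homologue, and the two residues aligned to gaps are decided by the vote. A real pipeline would also need the alignment itself and the 25% sequence-identity check mentioned in the abstract, both of which are outside this sketch.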
format Text
id pubmed-1550433
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1550433 2006-08-18 Improving the accuracy of protein secondary structure prediction using structural alignment Montgomerie, Scott Sundararaj, Shan Gallin, Warren J Wishart, David S BMC Bioinformatics Software BioMed Central 2006-06-14 /pmc/articles/PMC1550433/ /pubmed/16774686 http://dx.doi.org/10.1186/1471-2105-7-301 Text en Copyright © 2006 Montgomerie et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Improving the accuracy of protein secondary structure prediction using structural alignment
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1550433/
https://www.ncbi.nlm.nih.gov/pubmed/16774686
http://dx.doi.org/10.1186/1471-2105-7-301