Improving the accuracy of protein secondary structure prediction using structural alignment
BACKGROUND: The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to s...
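
The Q3 measure cited above is the percentage of residues whose predicted three-state label (helix, strand, or coil) matches the observed one. A minimal sketch of that calculation follows; the function name `q3_score`, the one-letter labels `H`/`E`/`C`, and the example strings are illustrative assumptions, not part of the PROTEUS software.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the three-state labels agree.

    Both arguments are per-residue label strings (assumed alphabet: H, E, C).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical 16-residue example with a single disagreement.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEEECC"
print(f"Q3 = {q3_score(predicted, observed):.2f}%")
```

An average Q3 over a test set, such as the 81.3% the abstract reports, would be the mean of this per-protein value.
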
Main authors: | Montgomerie, Scott; Sundararaj, Shan; Gallin, Warren J; Wishart, David S |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2006 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1550433/ https://www.ncbi.nlm.nih.gov/pubmed/16774686 http://dx.doi.org/10.1186/1471-2105-7-301 |
_version_ | 1782129230474117120 |
---|---|
author | Montgomerie, Scott Sundararaj, Shan Gallin, Warren J Wishart, David S |
author_facet | Montgomerie, Scott Sundararaj, Shan Gallin, Warren J Wishart, David S |
author_sort | Montgomerie, Scott |
collection | PubMed |
description | BACKGROUND: The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to sequence) database comparisons as part of the prediction process. Indeed, given the large size of the Protein Data Bank (>35,000 sequences), the probability of a newly identified sequence having a structural homologue is actually quite high. RESULTS: We have developed a method that performs structure-based sequence alignments as part of the secondary structure prediction process. By mapping the structure of a known homologue (sequence ID >25%) onto the query protein's sequence, it is possible to predict at least a portion of that query protein's secondary structure. By integrating this structural alignment approach with conventional (sequence-based) secondary structure methods and then combining it with a "jury-of-experts" system to generate a consensus result, it is possible to attain very high prediction accuracy. Using a sequence-unique test set of 1644 proteins from EVA, this new method achieves an average Q3 score of 81.3%. Extensive testing indicates this is approximately 4–5% better than any other method currently available. Assessments using non sequence-unique test sets (typical of those used in proteome annotation or structural genomics) indicate that this new method can achieve a Q3 score approaching 88%. CONCLUSION: By using both sequence and structure databases and by exploiting the latest techniques in machine learning it is possible to routinely predict protein secondary structure with an accuracy well above 80%. A program and web server, called PROTEUS, that performs these secondary structure predictions is accessible at . 
For high throughput or batch sequence analyses, the PROTEUS programs, databases (and server) can be downloaded and run locally. |
format | Text |
id | pubmed-1550433 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-15504332006-08-18 Improving the accuracy of protein secondary structure prediction using structural alignment Montgomerie, Scott Sundararaj, Shan Gallin, Warren J Wishart, David S BMC Bioinformatics Software BACKGROUND: The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to sequence) database comparisons as part of the prediction process. Indeed, given the large size of the Protein Data Bank (>35,000 sequences), the probability of a newly identified sequence having a structural homologue is actually quite high. RESULTS: We have developed a method that performs structure-based sequence alignments as part of the secondary structure prediction process. By mapping the structure of a known homologue (sequence ID >25%) onto the query protein's sequence, it is possible to predict at least a portion of that query protein's secondary structure. By integrating this structural alignment approach with conventional (sequence-based) secondary structure methods and then combining it with a "jury-of-experts" system to generate a consensus result, it is possible to attain very high prediction accuracy. Using a sequence-unique test set of 1644 proteins from EVA, this new method achieves an average Q3 score of 81.3%. Extensive testing indicates this is approximately 4–5% better than any other method currently available. Assessments using non sequence-unique test sets (typical of those used in proteome annotation or structural genomics) indicate that this new method can achieve a Q3 score approaching 88%. CONCLUSION: By using both sequence and structure databases and by exploiting the latest techniques in machine learning it is possible to routinely predict protein secondary structure with an accuracy well above 80%. 
A program and web server, called PROTEUS, that performs these secondary structure predictions is accessible at . For high throughput or batch sequence analyses, the PROTEUS programs, databases (and server) can be downloaded and run locally. BioMed Central 2006-06-14 /pmc/articles/PMC1550433/ /pubmed/16774686 http://dx.doi.org/10.1186/1471-2105-7-301 Text en Copyright © 2006 Montgomerie et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Montgomerie, Scott Sundararaj, Shan Gallin, Warren J Wishart, David S Improving the accuracy of protein secondary structure prediction using structural alignment |
title | Improving the accuracy of protein secondary structure prediction using structural alignment |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1550433/ https://www.ncbi.nlm.nih.gov/pubmed/16774686 http://dx.doi.org/10.1186/1471-2105-7-301 |
work_keys_str_mv | AT montgomeriescott improvingtheaccuracyofproteinsecondarystructurepredictionusingstructuralalignment AT sundararajshan improvingtheaccuracyofproteinsecondarystructurepredictionusingstructuralalignment AT gallinwarrenj improvingtheaccuracyofproteinsecondarystructurepredictionusingstructuralalignment AT wishartdavids improvingtheaccuracyofproteinsecondarystructurepredictionusingstructuralalignment |
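
The abstract in this record describes two steps: mapping the secondary structure of a known homologue (sequence identity above 25%) onto the query through an alignment, then merging that mapping with sequence-based predictions into a consensus. The sketch below illustrates the idea under loud assumptions: the alignment is supplied as two pre-gapped strings, identity is counted over aligned residue pairs only, and a simple majority vote stands in for the "jury-of-experts" system, whose actual design the abstract does not specify. All function names and example sequences are hypothetical.

```python
from collections import Counter

GAP = "-"


def percent_identity(aligned_query: str, aligned_homologue: str) -> float:
    """Identity over columns where both sequences have a residue (one of several conventions)."""
    pairs = [(q, h) for q, h in zip(aligned_query, aligned_homologue)
             if q != GAP and h != GAP]
    if not pairs:
        return 0.0
    return 100.0 * sum(q == h for q, h in pairs) / len(pairs)


def transfer_structure(aligned_query: str, aligned_homologue: str,
                       homologue_ss: str) -> str:
    """Copy the homologue's labels onto query residues; '-' marks uncovered residues."""
    if len(aligned_query) != len(aligned_homologue):
        raise ValueError("aligned sequences must have the same length")
    mapped = []
    h_index = 0  # position in the ungapped homologue
    for q, h in zip(aligned_query, aligned_homologue):
        label = GAP
        if h != GAP:
            label = homologue_ss[h_index]
            h_index += 1
        if q != GAP:  # emit one label per query residue only
            mapped.append(label)
    return "".join(mapped)


def consensus(mapped: str, predictions: list[str]) -> str:
    """Keep mapped labels where available; otherwise take a majority vote (assumed stand-in)."""
    result = []
    for i, label in enumerate(mapped):
        if label == GAP:
            votes = Counter(p[i] for p in predictions)
            label = votes.most_common(1)[0][0]
        result.append(label)
    return "".join(result)


# Hypothetical alignment: the homologue has one extra residue and misses two.
aligned_query = "MKT-AYIAKQR"
aligned_homologue = "MKSLAY--KQR"
homologue_ss = "CCHHHHHCC"  # one label per ungapped homologue residue

# Hypothetical outputs of three sequence-based predictors for the 10-residue query.
predictions = ["CCHHHHHHCC", "CHHHHHCHCC", "CCHHHEHHCC"]

if percent_identity(aligned_query, aligned_homologue) > 25.0:
    mapped = transfer_structure(aligned_query, aligned_homologue, homologue_ss)
else:
    mapped = GAP * len(aligned_query.replace(GAP, ""))

final = consensus(mapped, predictions)
print(mapped)
print(final)
```

In this sketch the mapped labels always override the vote; a real system could instead weight each source by its estimated reliability.
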