Computational Identification of MoRFs in Protein Sequences Using Hierarchical Application of Bayes Rule

MOTIVATION: Intrinsically disordered regions of proteins play an essential role in the regulation of various biological processes. Key to their regulatory function is often the binding to globular protein domains via sequence elements known as molecular recognition features (MoRFs). Development of c...

Bibliographic Details
Main authors:	Malhis, Nawar; Wong, Eric T. C.; Nassar, Roy; Gsponer, Jörg
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2015
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4627796/ https://www.ncbi.nlm.nih.gov/pubmed/26517836 http://dx.doi.org/10.1371/journal.pone.0141603

_version_	1782398334457085952
author	Malhis, Nawar; Wong, Eric T. C.; Nassar, Roy; Gsponer, Jörg
author_sort	Malhis, Nawar
collection	PubMed
description	MOTIVATION: Intrinsically disordered regions of proteins play an essential role in the regulation of various biological processes. Key to their regulatory function is often the binding to globular protein domains via sequence elements known as molecular recognition features (MoRFs). Development of computational tools for the identification of candidate MoRF locations in amino acid sequences is an important task and an area of growing interest. Given the relative sparseness of MoRFs in protein sequences, the accuracy of the available MoRF predictors is often inadequate for practical usage, which leaves a significant need and room for improvement. In this work, we introduce MoRF(CHiBi_Web), which predicts MoRF locations in protein sequences with higher accuracy compared to current MoRF predictors.
METHODS: Three distinct and largely independent property scores are computed with component predictors and then combined to generate the final MoRF propensity scores. The first score reflects the likelihood of sequence windows to harbour MoRFs and is based on amino acid composition and sequence similarity information. It is generated by MoRF(CHiBi) using small windows of up to 40 residues in size. The second score identifies long stretches of protein disorder and is generated by ESpritz with the DisProt option. Lastly, the third score reflects residue conservation and is assembled from PSSM files generated by PSI-BLAST. These propensity scores are processed and then hierarchically combined using Bayes rule to generate the final MoRF(CHiBi_Web) predictions.
RESULTS: MoRF(CHiBi_Web) was tested on three datasets. Results show that MoRF(CHiBi_Web) outperforms previously developed predictors by generating less than half the false positive rate for the same true positive rate at practical threshold values. This level of accuracy paired with its relatively high processing speed makes MoRF(CHiBi_Web) a practical tool for MoRF prediction.
AVAILABILITY: http://morf.chibi.ubc.ca:8080/morf/.
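As an illustration of the METHODS statement that per-residue propensity scores are "hierarchically combined using Bayes rule", the following is a minimal sketch only. It assumes each score is already a calibrated probability in [0, 1], that the scores are conditionally independent, and that the prior is 0.5. The record does not state the authors' processing steps or the order of combination, so the function names (`bayes_combine`, `hierarchical_combine`) and the order used here (MoRF score with disorder score first, then conservation) are hypothetical.

```python
from typing import List, Sequence


def bayes_combine(p: float, q: float) -> float:
    """Combine two probabilities with Bayes rule.

    Assumes conditional independence and a uniform (0.5) prior;
    these are illustrative assumptions, not taken from the record.
    """
    numerator = p * q
    denominator = numerator + (1.0 - p) * (1.0 - q)
    if denominator == 0.0:
        # One score is exactly 0 and the other exactly 1.
        raise ValueError("contradictory scores cannot be combined")
    return numerator / denominator


def hierarchical_combine(
    morf: Sequence[float],
    disorder: Sequence[float],
    conservation: Sequence[float],
) -> List[float]:
    """Combine three per-residue score tracks in two stages.

    Stage 1 merges the MoRF and disorder scores; stage 2 merges that
    result with the conservation score. The order is an assumption.
    """
    if not (len(morf) == len(disorder) == len(conservation)):
        raise ValueError("score tracks must have the same length")
    combined = []
    for m, d, c in zip(morf, disorder, conservation):
        stage_one = bayes_combine(m, d)
        combined.append(bayes_combine(stage_one, c))
    return combined
```

Under these assumptions a score of 0.5 is neutral: combining any value with 0.5 returns that value unchanged, while two agreeing scores above 0.5 reinforce each other.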
format	Online Article Text
id	pubmed-4627796
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4627796 2015-11-06 Computational Identification of MoRFs in Protein Sequences Using Hierarchical Application of Bayes Rule Malhis, Nawar; Wong, Eric T. C.; Nassar, Roy; Gsponer, Jörg PLoS One Research Article Public Library of Science 2015-10-30 /pmc/articles/PMC4627796/ /pubmed/26517836 http://dx.doi.org/10.1371/journal.pone.0141603 Text en © 2015 Malhis et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title	Computational Identification of MoRFs in Protein Sequences Using Hierarchical Application of Bayes Rule
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4627796/ https://www.ncbi.nlm.nih.gov/pubmed/26517836 http://dx.doi.org/10.1371/journal.pone.0141603
