On the Encoding of Proteins for Disordered Regions Prediction
Disordered regions, i.e., regions of proteins that do not adopt a stable three-dimensional structure, have been shown to play various and critical roles in many biological processes. Predicting and understanding their formation is therefore a key sub-problem of protein structure and function inferen...
Main Authors: | Becker, Julien; Maes, Francis; Wehenkel, Louis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3864923/ https://www.ncbi.nlm.nih.gov/pubmed/24358161 http://dx.doi.org/10.1371/journal.pone.0082252 |
_version_ | 1782295966566580224 |
---|---|
author | Becker, Julien; Maes, Francis; Wehenkel, Louis |
author_facet | Becker, Julien; Maes, Francis; Wehenkel, Louis |
author_sort | Becker, Julien |
collection | PubMed |
description | Disordered regions, i.e., regions of proteins that do not adopt a stable three-dimensional structure, have been shown to play various and critical roles in many biological processes. Predicting and understanding their formation is therefore a key sub-problem of protein structure and function inference. A wide range of machine learning approaches have been developed to automatically predict disordered regions of proteins. One key factor of the success of these methods is the way in which protein information is encoded into features. Recently, we have proposed a systematic methodology to study the relevance of various feature encodings in the context of disulfide connectivity pattern prediction. In the present paper, we adapt this methodology to the problem of predicting disordered regions and assess it on proteins from the 10th CASP competition, as well as on a very large subset of proteins extracted from PDB. Our results, obtained with ensembles of extremely randomized trees, highlight a novel feature function encoding the proximity of residues according to their accessibility to the solvent, which is playing the second most important role in the prediction of disordered regions, just after evolutionary information. Furthermore, even though our approach treats each residue independently, our results are very competitive in terms of accuracy with respect to the state-of-the-art. A web-application is available at http://m24.giga.ulg.ac.be:81/x3Disorder. |
format | Online Article Text |
id | pubmed-3864923 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3864923 2013-12-19 On the Encoding of Proteins for Disordered Regions Prediction Becker, Julien Maes, Francis Wehenkel, Louis PLoS One Research Article Disordered regions, i.e., regions of proteins that do not adopt a stable three-dimensional structure, have been shown to play various and critical roles in many biological processes. Predicting and understanding their formation is therefore a key sub-problem of protein structure and function inference. A wide range of machine learning approaches have been developed to automatically predict disordered regions of proteins. One key factor of the success of these methods is the way in which protein information is encoded into features. Recently, we have proposed a systematic methodology to study the relevance of various feature encodings in the context of disulfide connectivity pattern prediction. In the present paper, we adapt this methodology to the problem of predicting disordered regions and assess it on proteins from the 10th CASP competition, as well as on a very large subset of proteins extracted from PDB. Our results, obtained with ensembles of extremely randomized trees, highlight a novel feature function encoding the proximity of residues according to their accessibility to the solvent, which is playing the second most important role in the prediction of disordered regions, just after evolutionary information. Furthermore, even though our approach treats each residue independently, our results are very competitive in terms of accuracy with respect to the state-of-the-art. A web-application is available at http://m24.giga.ulg.ac.be:81/x3Disorder.
Public Library of Science 2013-12-16 /pmc/articles/PMC3864923/ /pubmed/24358161 http://dx.doi.org/10.1371/journal.pone.0082252 Text en © 2013 Becker et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Becker, Julien Maes, Francis Wehenkel, Louis On the Encoding of Proteins for Disordered Regions Prediction |
title | On the Encoding of Proteins for Disordered Regions Prediction |
title_full | On the Encoding of Proteins for Disordered Regions Prediction |
title_fullStr | On the Encoding of Proteins for Disordered Regions Prediction |
title_full_unstemmed | On the Encoding of Proteins for Disordered Regions Prediction |
title_short | On the Encoding of Proteins for Disordered Regions Prediction |
title_sort | on the encoding of proteins for disordered regions prediction |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3864923/ https://www.ncbi.nlm.nih.gov/pubmed/24358161 http://dx.doi.org/10.1371/journal.pone.0082252 |
work_keys_str_mv | AT beckerjulien ontheencodingofproteinsfordisorderedregionsprediction AT maesfrancis ontheencodingofproteinsfordisorderedregionsprediction AT wehenkellouis ontheencodingofproteinsfordisorderedregionsprediction |