Complete fold annotation of the human proteome using a novel structural feature space

Recognition of protein structural fold is the starting point for many structure prediction tools and protein function inference. Fold prediction is computationally demanding and recognizing novel folds is difficult such that the majority of proteins have not been annotated for fold classification. H...

Bibliographic Details
Main authors: Middleton, Sarah A., Illuminati, Joseph, Kim, Junhyong
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5390313/
https://www.ncbi.nlm.nih.gov/pubmed/28406174
http://dx.doi.org/10.1038/srep46321
_version_ 1782521434332987392
author Middleton, Sarah A.
Illuminati, Joseph
Kim, Junhyong
author_facet Middleton, Sarah A.
Illuminati, Joseph
Kim, Junhyong
author_sort Middleton, Sarah A.
collection PubMed
description Recognition of protein structural fold is the starting point for many structure prediction tools and protein function inference. Fold prediction is computationally demanding and recognizing novel folds is difficult such that the majority of proteins have not been annotated for fold classification. Here we describe a new machine learning approach using a novel feature space that can be used for accurate recognition of all 1,221 currently known folds and inference of unknown novel folds. We show that our method achieves better than 94% accuracy even when many folds have only one training example. We demonstrate the utility of this method by predicting the folds of 34,330 human protein domains and showing that these predictions can yield useful insights into potential biological function, such as prediction of RNA-binding ability. Our method can be applied to de novo fold prediction of entire proteomes and identify candidate novel fold families.
format Online
Article
Text
id pubmed-5390313
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-53903132017-04-14 Complete fold annotation of the human proteome using a novel structural feature space Middleton, Sarah A. Illuminati, Joseph Kim, Junhyong Sci Rep Article Recognition of protein structural fold is the starting point for many structure prediction tools and protein function inference. Fold prediction is computationally demanding and recognizing novel folds is difficult such that the majority of proteins have not been annotated for fold classification. Here we describe a new machine learning approach using a novel feature space that can be used for accurate recognition of all 1,221 currently known folds and inference of unknown novel folds. We show that our method achieves better than 94% accuracy even when many folds have only one training example. We demonstrate the utility of this method by predicting the folds of 34,330 human protein domains and showing that these predictions can yield useful insights into potential biological function, such as prediction of RNA-binding ability. Our method can be applied to de novo fold prediction of entire proteomes and identify candidate novel fold families. Nature Publishing Group 2017-04-13 /pmc/articles/PMC5390313/ /pubmed/28406174 http://dx.doi.org/10.1038/srep46321 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
spellingShingle Article
Middleton, Sarah A.
Illuminati, Joseph
Kim, Junhyong
Complete fold annotation of the human proteome using a novel structural feature space
title Complete fold annotation of the human proteome using a novel structural feature space
title_full Complete fold annotation of the human proteome using a novel structural feature space
title_fullStr Complete fold annotation of the human proteome using a novel structural feature space
title_full_unstemmed Complete fold annotation of the human proteome using a novel structural feature space
title_short Complete fold annotation of the human proteome using a novel structural feature space
title_sort complete fold annotation of the human proteome using a novel structural feature space
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5390313/
https://www.ncbi.nlm.nih.gov/pubmed/28406174
http://dx.doi.org/10.1038/srep46321
work_keys_str_mv AT middletonsaraha completefoldannotationofthehumanproteomeusinganovelstructuralfeaturespace
AT illuminatijoseph completefoldannotationofthehumanproteomeusinganovelstructuralfeaturespace
AT kimjunhyong completefoldannotationofthehumanproteomeusinganovelstructuralfeaturespace