
A Unified Multitask Architecture for Predicting Local Protein Properties

A variety of functionally important protein properties, such as secondary structure, transmembrane topology and solvent accessibility, can be encoded as a labeling of amino acids. Indeed, the prediction of such properties from the primary amino acid sequence is one of the core projects of computatio...


Bibliographic Details
Main authors: Qi, Yanjun; Oja, Merja; Weston, Jason; Noble, William Stafford
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3312883/
https://www.ncbi.nlm.nih.gov/pubmed/22461885
http://dx.doi.org/10.1371/journal.pone.0032235
author Qi, Yanjun
Oja, Merja
Weston, Jason
Noble, William Stafford
collection PubMed
description A variety of functionally important protein properties, such as secondary structure, transmembrane topology and solvent accessibility, can be encoded as a labeling of amino acids. Indeed, the prediction of such properties from the primary amino acid sequence is one of the core projects of computational biology. Accordingly, a panoply of approaches have been developed for predicting such properties; however, most such approaches focus on solving a single task at a time. Motivated by recent, successful work in natural language processing, we propose to use multitask learning to train a single, joint model that exploits the dependencies among these various labeling tasks. We describe a deep neural network architecture that, given a protein sequence, outputs a host of predicted local properties, including secondary structure, solvent accessibility, transmembrane topology, signal peptides and DNA-binding residues. The network is trained jointly on all these tasks in a supervised fashion, augmented with a novel form of semi-supervised learning in which the model is trained to distinguish between local patterns from natural and synthetic protein sequences. The task-independent architecture of the network obviates the need for task-specific feature engineering. We demonstrate that, for all of the tasks that we considered, our approach leads to statistically significant improvements in performance, relative to a single task neural network approach, and that the resulting model achieves state-of-the-art performance.
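The architecture summarized in the description above (one shared network, several per-residue labeling tasks, plus a natural-versus-synthetic auxiliary task) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the task names, label counts, window size, layer sizes, and the scheme of corrupting the central residue to create a synthetic window are all assumptions made for illustration, and the weights are random and untrained.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"  # padding symbol for window positions beyond the sequence ends

# Hypothetical task names and label counts, chosen only for illustration.
TASKS = {
    "secondary_structure": 3,
    "solvent_accessibility": 2,
    "natural_vs_synthetic": 2,
}


def rand_matrix(rng, rows, cols, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class MultitaskNet:
    """Shared embedding and hidden layer, plus one linear output head per task."""

    def __init__(self, window=5, embed_dim=4, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        self.window = window
        # One embedding vector per amino acid (and padding), shared by all tasks.
        self.embed = {a: [rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
                      for a in AMINO_ACIDS + PAD}
        # Hidden layer shared by all tasks.
        self.hidden = rand_matrix(rng, hidden_dim, window * embed_dim)
        # Task-specific output layers on top of the shared representation.
        self.heads = {t: rand_matrix(rng, n, hidden_dim) for t, n in TASKS.items()}

    def windows(self, seq):
        """Return one fixed-size window centered on each residue."""
        half = self.window // 2
        padded = PAD * half + seq + PAD * half
        return [padded[i:i + self.window] for i in range(len(seq))]

    def shared(self, win):
        """Map a window to the task-independent hidden representation."""
        x = [v for a in win for v in self.embed[a]]
        return [math.tanh(h) for h in matvec(self.hidden, x)]

    def predict(self, seq, task):
        """Return one predicted label index per residue for the given task."""
        labels = []
        for win in self.windows(seq):
            scores = matvec(self.heads[task], self.shared(win))
            labels.append(max(range(len(scores)), key=scores.__getitem__))
        return labels


def synthetic_window(win, rng):
    """Corrupt the central residue to make a 'synthetic' window (assumed scheme)."""
    mid = len(win) // 2
    choices = [a for a in AMINO_ACIDS if a != win[mid]]
    return win[:mid] + rng.choice(choices) + win[mid + 1:]
```

Because every head reads the same `shared` representation, a gradient step taken for any one task would also update the embedding and hidden layer used by the others, which is the sense in which such a model is trained jointly. Training itself is omitted from this sketch.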
format Online
Article
Text
id pubmed-3312883
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3312883 2012-03-29 A Unified Multitask Architecture for Predicting Local Protein Properties Qi, Yanjun; Oja, Merja; Weston, Jason; Noble, William Stafford PLoS One Research Article Public Library of Science 2012-03-26 /pmc/articles/PMC3312883/ /pubmed/22461885 http://dx.doi.org/10.1371/journal.pone.0032235 Text en Qi et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Unified Multitask Architecture for Predicting Local Protein Properties
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3312883/
https://www.ncbi.nlm.nih.gov/pubmed/22461885
http://dx.doi.org/10.1371/journal.pone.0032235