ASAP: a machine learning framework for local protein properties

Bibliographic Details
Main authors: Brandes, Nadav; Ofer, Dan; Linial, Michal
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5045867/
https://www.ncbi.nlm.nih.gov/pubmed/27694209
http://dx.doi.org/10.1093/database/baw133
author Brandes, Nadav
Ofer, Dan
Linial, Michal
collection PubMed
description Determining residue-level protein properties, such as sites of post-translational modifications (PTMs), is vital to understanding protein function. Experimental methods are costly and time-consuming, while traditional rule-based computational methods fail to annotate sites lacking substantial similarity. Machine learning (ML) methods are becoming fundamental in annotating unknown proteins and their heterogeneous properties. We present ASAP (Amino-acid Sequence Annotation Prediction), a universal ML framework for predicting residue-level properties. ASAP extracts numerous features from raw sequences and supports easy integration of external features such as secondary structure, solvent accessibility, intrinsic disorder or PSSM profiles. These features are then used to train ML classifiers. ASAP can create new classifiers within minutes for a variety of tasks, including PTM prediction (e.g. cleavage sites by convertase, phosphoserine modification). We present a detailed case study for ASAP: CleavePred, an ASAP-based model that predicts protein precursor cleavage sites with state-of-the-art results. Protein cleavage is a PTM shared by a wide variety of proteins that have minimal sequence similarity. Current rule-based methods suffer from high false positive rates, making them suboptimal. The high performance of CleavePred makes it suitable for analyzing new proteomes at a genomic scale. The tool is attractive for protein design, mass spectrometry search engines and the discovery of new bioactive peptides from precursors. ASAP functions as a baseline approach for residue-level protein sequence prediction. CleavePred is freely accessible as a web-based application. Both ASAP and CleavePred are open-source with a flexible Python API. Database URL: The ASAP and CleavePred source code, web tool and tutorials are available at: https://github.com/ddofer/asap; http://protonet.cs.huji.ac.il/cleavepred.
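The description above says that features are extracted from raw sequences for each residue and then passed to classifiers. As a rough illustration of that idea only, the following sketch builds a one-hot encoding of the residues in a fixed window around each position. It is not ASAP's API or its actual feature set: the function names, the padding symbol and the `flank` parameter are all hypothetical, chosen for this example.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
PAD = "_"  # hypothetical padding symbol for positions beyond the sequence ends


def window_features(sequence: str, index: int, flank: int = 2) -> List[int]:
    """One-hot encode the residues in a window centred on `index`.

    Each window position contributes 20 binary values. Padding and
    non-standard residues are encoded as all zeros.
    """
    features: List[int] = []
    for pos in range(index - flank, index + flank + 1):
        residue = sequence[pos] if 0 <= pos < len(sequence) else PAD
        features.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return features


def sequence_to_matrix(sequence: str, flank: int = 2) -> List[List[int]]:
    """Build one feature vector per residue of the sequence."""
    return [window_features(sequence, i, flank) for i in range(len(sequence))]


if __name__ == "__main__":
    # Illustrative sequence only, not taken from the article.
    matrix = sequence_to_matrix("MKRSLL", flank=2)
    print(len(matrix), len(matrix[0]))  # prints "6 100"
```

Each row of the resulting matrix could be paired with a per-residue label (for example, cleavage site or not) and handed to any standard classifier. External features such as predicted secondary structure would simply be appended as extra columns.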
format Online
Article
Text
id pubmed-5045867
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5045867
2016-10-03
ASAP: a machine learning framework for local protein properties
Brandes, Nadav
Ofer, Dan
Linial, Michal
Database (Oxford)
Original Article
Oxford University Press
2016-10-01
/pmc/articles/PMC5045867/
/pubmed/27694209
http://dx.doi.org/10.1093/database/baw133
Text
en
© The Author(s) 2016. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title ASAP: a machine learning framework for local protein properties
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5045867/
https://www.ncbi.nlm.nih.gov/pubmed/27694209
http://dx.doi.org/10.1093/database/baw133