
AMS 4.0: consensus prediction of post-translational modifications in protein sequences

We present the 2011 update of the AutoMotif Service (AMS 4.0), which predicts 88 different types of single amino acid post-translational modifications (PTMs) in protein sequences. The selection of experimentally confirmed modifications is acquired from the latest UniProt...


Bibliographic Details
Main Authors: Plewczynski, Dariusz; Basu, Subhadip; Saha, Indrajit
Format: Online Article Text
Language: English
Published: Springer Vienna 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3397139/
https://www.ncbi.nlm.nih.gov/pubmed/22555647
http://dx.doi.org/10.1007/s00726-012-1290-2
author Plewczynski, Dariusz
Basu, Subhadip
Saha, Indrajit
author_sort Plewczynski, Dariusz
collection PubMed
description We present the 2011 update of the AutoMotif Service (AMS 4.0), which predicts 88 different types of single amino acid post-translational modifications (PTMs) in protein sequences. The experimentally confirmed modifications used for training are taken from the latest UniProt and Phospho.ELM databases. The sequence vicinity of each modified residue is represented by amino acid physico-chemical features encoded with high quality indices (HQI), obtained by automatic clustering of known indices extracted from the AAindex database. For each type of numerical representation, the method builds an ensemble of Multi-Layer Perceptron (MLP) pattern classifiers, each optimising a different objective during training (for example recall, precision or area under the ROC curve (AUC)). The consensus is built using brainstorming technology, which combines multi-objective instances of a machine learning algorithm with data fusion of different representations of the training objects, in order to boost the overall prediction accuracy for conserved short sequence motifs. The performance of AMS 4.0 is compared with that of previous versions, which were built using single machine learning methods (artificial neural networks, support vector machines). Our software improves the average AUC score of the earlier version by close to 7%, as calculated on the test datasets of all 88 PTM types. Moreover, for selected most-difficult sequence motif types it improves prediction performance by almost 32% compared with the previously used single machine learning methods. In summary, the brainstorming consensus meta-learning methodology boosts the AUC score to around 89%, averaged over all 88 PTM types.
Detailed results for single machine learning methods and the consensus methodology are also provided, together with a comparison to previously published methods and state-of-the-art software tools. The source code and precompiled binaries of the brainstorming tool are available at http://code.google.com/p/automotifserver/ under the Apache 2.0 license. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00726-012-1290-2) contains supplementary material, which is available to authorized users.
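Illustrative sketch (not the AMS 4.0 code): the abstract describes three steps that a short Python example may make more concrete: encoding the sequence window around a candidate residue with a physico-chemical index, combining the scores of several classifiers into a consensus, and evaluating the result by AUC. Everything named here is an assumption made for illustration. The Kyte-Doolittle hydropathy scale stands in for one HQI index, plain averaging stands in for the brainstorming consensus (whose exact rule the abstract does not state), and the function names `encode_window`, `consensus_score` and `auc` are hypothetical.

```python
from statistics import mean

# Kyte-Doolittle hydropathy scale, used here only as a stand-in for one
# high quality index (HQI); the actual HQI values are not given in the abstract.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_window(sequence, position, flank=4, index=KYTE_DOOLITTLE):
    """Encode the residues from position-flank to position+flank as index values.

    Positions outside the sequence and unknown residue letters are encoded
    as 0.0 (an assumption of this sketch).
    """
    values = []
    for i in range(position - flank, position + flank + 1):
        if 0 <= i < len(sequence):
            values.append(index.get(sequence[i], 0.0))
        else:
            values.append(0.0)
    return values


def consensus_score(member_scores):
    """Combine the scores of several classifiers by simple averaging.

    Averaging is a placeholder for the consensus rule, not the published one.
    """
    return mean(member_scores)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Each (positive, negative) pair counts 1 if the positive scores higher,
    0.5 on a tie, and 0 otherwise.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Feature vector for the serine at index 2 of a toy sequence.
    features = encode_window("MKSTA", 2, flank=1)
    # Three hypothetical classifier outputs for the same site.
    combined = consensus_score([0.2, 0.4, 0.6])
    print(features, combined)
```

In this sketch each ensemble member would be trained separately (for example toward recall, precision or AUC) and only its output score enters `consensus_score`, so the combination step stays independent of how the members were trained.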
format Online
Article
Text
id pubmed-3397139
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Springer Vienna
record_format MEDLINE/PubMed
spelling pubmed-3397139 2012-07-19 AMS 4.0: consensus prediction of post-translational modifications in protein sequences Plewczynski, Dariusz Basu, Subhadip Saha, Indrajit Amino Acids Original Article Springer Vienna 2012-05-04 2012 /pmc/articles/PMC3397139/ /pubmed/22555647 http://dx.doi.org/10.1007/s00726-012-1290-2 Text en © The Author(s) 2012 https://creativecommons.org/licenses/by/4.0/ This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
title AMS 4.0: consensus prediction of post-translational modifications in protein sequences
title_sort ams 4.0: consensus prediction of post-translational modifications in protein sequences
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3397139/
https://www.ncbi.nlm.nih.gov/pubmed/22555647
http://dx.doi.org/10.1007/s00726-012-1290-2