STALLION: a stacking-based ensemble learning framework for prokaryotic lysine acetylation site prediction

Bibliographic Details
Main authors: Basith, Shaherin, Lee, Gwang, Manavalan, Balachandran
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8769686/
https://www.ncbi.nlm.nih.gov/pubmed/34532736
http://dx.doi.org/10.1093/bib/bbab376
author Basith, Shaherin
Lee, Gwang
Manavalan, Balachandran
collection PubMed
description Protein post-translational modification (PTM) is an important regulatory mechanism that plays a key role in both normal and disease states. Acetylation on lysine residues is one of the most potent PTMs owing to its critical role in cellular metabolism and regulatory processes. Identifying protein lysine acetylation (Kace) sites is a challenging task in bioinformatics. To date, several machine learning-based methods for the in silico identification of Kace sites have been developed. Of those, a few are prokaryotic species-specific. Despite their advantages and performance, these methods have certain limitations. Therefore, this study proposes a novel predictor, STALLION (STacking-based Predictor for ProkAryotic Lysine AcetyLatION), containing six prokaryotic species-specific models to identify Kace sites accurately. To extract crucial patterns around Kace sites, we employed 11 different encodings representing three different characteristics. Subsequently, a systematic and rigorous feature selection approach was employed to identify the optimal feature set independently for each of five tree-based ensemble algorithms, and the respective baseline models were built for each species. Finally, the predicted values from the baseline models were used to train an appropriate classifier with the stacking strategy to develop STALLION. Comparative benchmarking experiments showed that STALLION significantly outperformed existing predictors on independent tests. To expedite direct accessibility to the STALLION models, a user-friendly online predictor was implemented, which is available at: http://thegleelab.org/STALLION.
format Online
Article
Text
id pubmed-8769686
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8769686 2022-01-20 STALLION: a stacking-based ensemble learning framework for prokaryotic lysine acetylation site prediction Basith, Shaherin; Lee, Gwang; Manavalan, Balachandran Brief Bioinform Problem Solving Protocol
Oxford University Press 2021-09-17 /pmc/articles/PMC8769686/ /pubmed/34532736 http://dx.doi.org/10.1093/bib/bbab376 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title STALLION: a stacking-based ensemble learning framework for prokaryotic lysine acetylation site prediction
topic Problem Solving Protocol