A hybrid feature extraction scheme for efficient malonylation site prediction
Lysine malonylation is one of the most important post-translational modifications (PTMs). It affects the functionality of cells. Malonylation site prediction in proteins can unfold the mechanisms of cellular functionalities. Experimental methods are one of the due prediction approaches. But they are...
Main authors: | Sorkhi, Ali Ghanbari; Pirgazi, Jamshid; Ghasemi, Vahid |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8987080/ https://www.ncbi.nlm.nih.gov/pubmed/35388017 http://dx.doi.org/10.1038/s41598-022-08555-9 |
_version_ | 1784682658994847744 |
---|---|
author | Sorkhi, Ali Ghanbari Pirgazi, Jamshid Ghasemi, Vahid |
author_facet | Sorkhi, Ali Ghanbari Pirgazi, Jamshid Ghasemi, Vahid |
author_sort | Sorkhi, Ali Ghanbari |
collection | PubMed |
description | Lysine malonylation is one of the most important post-translational modifications (PTMs). It affects the functionality of cells. Malonylation site prediction in proteins can unfold the mechanisms of cellular functionalities. Experimental methods are one of the due prediction approaches. But they are typically costly and time-consuming to implement. Recently, methods based on machine-learning solutions have been proposed to tackle this problem. Such practices have been shown to reduce costs and time complexities and increase accuracy. However, these approaches also have specific shortcomings, including inappropriate feature extraction out of protein sequences, high-dimensional features, and inefficient underlying classifiers. A machine learning-based method is proposed in this paper to cope with these problems. In the proposed approach, seven different features are extracted. Then, the extracted features are combined, ranked based on the Fisher’s score (F-score), and the most efficient ones are selected. Afterward, malonylation sites are predicted using various classifiers. Simulation results show that the proposed method has acceptable performance compared with some state-of-the-art approaches. In addition, the XGBOOST classifier, founded on extracted features such as TFCRF, has a higher prediction rate than the other methods. The codes are publicly available at: https://github.com/jimy2020/Malonylation-site-prediction |
format | Online Article Text |
id | pubmed-8987080 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-89870802022-04-08 A hybrid feature extraction scheme for efficient malonylation site prediction Sorkhi, Ali Ghanbari Pirgazi, Jamshid Ghasemi, Vahid Sci Rep Article Lysine malonylation is one of the most important post-translational modifications (PTMs). It affects the functionality of cells. Malonylation site prediction in proteins can unfold the mechanisms of cellular functionalities. Experimental methods are one of the due prediction approaches. But they are typically costly and time-consuming to implement. Recently, methods based on machine-learning solutions have been proposed to tackle this problem. Such practices have been shown to reduce costs and time complexities and increase accuracy. However, these approaches also have specific shortcomings, including inappropriate feature extraction out of protein sequences, high-dimensional features, and inefficient underlying classifiers. A machine learning-based method is proposed in this paper to cope with these problems. In the proposed approach, seven different features are extracted. Then, the extracted features are combined, ranked based on the Fisher’s score (F-score), and the most efficient ones are selected. Afterward, malonylation sites are predicted using various classifiers. Simulation results show that the proposed method has acceptable performance compared with some state-of-the-art approaches. In addition, the XGBOOST classifier, founded on extracted features such as TFCRF, has a higher prediction rate than the other methods. 
The codes are publicly available at: https://github.com/jimy2020/Malonylation-site-prediction Nature Publishing Group UK 2022-04-06 /pmc/articles/PMC8987080/ /pubmed/35388017 http://dx.doi.org/10.1038/s41598-022-08555-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Sorkhi, Ali Ghanbari Pirgazi, Jamshid Ghasemi, Vahid A hybrid feature extraction scheme for efficient malonylation site prediction |
title | A hybrid feature extraction scheme for efficient malonylation site prediction |
title_full | A hybrid feature extraction scheme for efficient malonylation site prediction |
title_fullStr | A hybrid feature extraction scheme for efficient malonylation site prediction |
title_full_unstemmed | A hybrid feature extraction scheme for efficient malonylation site prediction |
title_short | A hybrid feature extraction scheme for efficient malonylation site prediction |
title_sort | hybrid feature extraction scheme for efficient malonylation site prediction |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8987080/ https://www.ncbi.nlm.nih.gov/pubmed/35388017 http://dx.doi.org/10.1038/s41598-022-08555-9 |
work_keys_str_mv | AT sorkhialighanbari ahybridfeatureextractionschemeforefficientmalonylationsiteprediction AT pirgazijamshid ahybridfeatureextractionschemeforefficientmalonylationsiteprediction AT ghasemivahid ahybridfeatureextractionschemeforefficientmalonylationsiteprediction AT sorkhialighanbari hybridfeatureextractionschemeforefficientmalonylationsiteprediction AT pirgazijamshid hybridfeatureextractionschemeforefficientmalonylationsiteprediction AT ghasemivahid hybridfeatureextractionschemeforefficientmalonylationsiteprediction |
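
The description field above says the extracted features are combined, ranked by Fisher's score (F-score), and the most efficient ones are selected before classification. The sketch below illustrates that ranking step only. It is not the authors' code (which is at the GitHub URL in the record), and it assumes the commonly used two-class F-score definition: the squared deviations of the class means from the overall mean, divided by the sum of the within-class sample variances. The function names `fisher_scores` and `select_top_k` and the toy data are hypothetical.

```python
import random
from statistics import mean, variance


def fisher_scores(rows, labels):
    """Return one F-score per feature column for binary labels (0 or 1).

    Assumed definition (not confirmed by the record):
    ((mean_pos - mean_all)^2 + (mean_neg - mean_all)^2) / (var_pos + var_neg)
    """
    scores = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        pos = [v for v, lab in zip(col, labels) if lab == 1]
        neg = [v for v, lab in zip(col, labels) if lab == 0]
        m_all, m_pos, m_neg = mean(col), mean(pos), mean(neg)
        between = (m_pos - m_all) ** 2 + (m_neg - m_all) ** 2
        within = variance(pos) + variance(neg)  # sample variances
        if within == 0:
            # Zero within-class spread: perfectly separating or fully constant.
            scores.append(float("inf") if between > 0 else 0.0)
        else:
            scores.append(between / within)
    return scores


def select_top_k(rows, labels, k):
    """Return the indices of the k highest-scoring features, best first."""
    scores = fisher_scores(rows, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data: column 1 tracks the label closely, columns 0 and 2 are noise.
rng = random.Random(0)
labels = [0] * 50 + [1] * 50
rows = [[rng.gauss(0, 1), lab + rng.gauss(0, 0.1), rng.gauss(0, 1)] for lab in labels]

top = select_top_k(rows, labels, k=2)
print("selected feature indices:", top)
```

In a fuller pipeline the selected columns would then be passed to a classifier; the record names XGBOOST as the best-performing one, but that step is left out here to keep the sketch dependency-free.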