Detecting Succinylation sites from protein sequences using ensemble support vector machine
BACKGROUND: Lysine succinylation is a new kind of post-translational modification which plays a key role in protein conformation regulation and cellular function control. To understand the mechanism of succinylation profoundly, it is necessary to identify succinylation sites in proteins accurately....
Main authors: | Ning, Qiao; Zhao, Xiaosa; Bao, Lingling; Ma, Zhiqiang; Zhao, Xiaowei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6016146/ https://www.ncbi.nlm.nih.gov/pubmed/29940836 http://dx.doi.org/10.1186/s12859-018-2249-4 |
_version_ | 1783334516170948608 |
---|---|
author | Ning, Qiao Zhao, Xiaosa Bao, Lingling Ma, Zhiqiang Zhao, Xiaowei |
author_facet | Ning, Qiao Zhao, Xiaosa Bao, Lingling Ma, Zhiqiang Zhao, Xiaowei |
author_sort | Ning, Qiao |
collection | PubMed |
description | BACKGROUND: Lysine succinylation is a new kind of post-translational modification that plays a key role in protein conformation regulation and cellular function control. To understand the mechanism of succinylation in depth, it is necessary to identify succinylation sites in proteins accurately. However, traditional experimental approaches are labor-intensive and time-consuming. Computational prediction methods have been proposed in recent years, and they are popular because of their convenience and high speed. In this study, we developed a new method to predict succinylation sites in proteins by combining multiple features, including amino acid composition, binary encoding, physicochemical property and grey pseudo amino acid composition, with a feature selection scheme (information gain). The model was then trained using SVM (Support Vector Machine) and an ensemble learning algorithm. RESULTS: The performance of this method was measured with an accuracy of 89.14% and an MCC (Matthews correlation coefficient) of 0.79 using 10-fold cross validation on the training dataset, and an accuracy of 84.5% and an MCC of 0.2 on an independent dataset. CONCLUSIONS: The conclusions made from this study can help to understand more of the succinylation mechanism. These results suggest that our method is promising for predicting succinylation sites. The source code and data of this paper are freely available at https://github.com/ningq669/PSuccE. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2249-4) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6016146 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6016146 2018-07-06 Detecting Succinylation sites from protein sequences using ensemble support vector machine Ning, Qiao Zhao, Xiaosa Bao, Lingling Ma, Zhiqiang Zhao, Xiaowei BMC Bioinformatics Research Article BACKGROUND: Lysine succinylation is a new kind of post-translational modification that plays a key role in protein conformation regulation and cellular function control. To understand the mechanism of succinylation in depth, it is necessary to identify succinylation sites in proteins accurately. However, traditional experimental approaches are labor-intensive and time-consuming. Computational prediction methods have been proposed in recent years, and they are popular because of their convenience and high speed. In this study, we developed a new method to predict succinylation sites in proteins by combining multiple features, including amino acid composition, binary encoding, physicochemical property and grey pseudo amino acid composition, with a feature selection scheme (information gain). The model was then trained using SVM (Support Vector Machine) and an ensemble learning algorithm. RESULTS: The performance of this method was measured with an accuracy of 89.14% and an MCC (Matthews correlation coefficient) of 0.79 using 10-fold cross validation on the training dataset, and an accuracy of 84.5% and an MCC of 0.2 on an independent dataset. CONCLUSIONS: The conclusions made from this study can help to understand more of the succinylation mechanism. These results suggest that our method is promising for predicting succinylation sites. The source code and data of this paper are freely available at https://github.com/ningq669/PSuccE. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2249-4) contains supplementary material, which is available to authorized users. BioMed Central 2018-06-25 /pmc/articles/PMC6016146/ /pubmed/29940836 http://dx.doi.org/10.1186/s12859-018-2249-4 Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Ning, Qiao Zhao, Xiaosa Bao, Lingling Ma, Zhiqiang Zhao, Xiaowei Detecting Succinylation sites from protein sequences using ensemble support vector machine |
title | Detecting Succinylation sites from protein sequences using ensemble support vector machine |
title_full | Detecting Succinylation sites from protein sequences using ensemble support vector machine |
title_fullStr | Detecting Succinylation sites from protein sequences using ensemble support vector machine |
title_full_unstemmed | Detecting Succinylation sites from protein sequences using ensemble support vector machine |
title_short | Detecting Succinylation sites from protein sequences using ensemble support vector machine |
title_sort | detecting succinylation sites from protein sequences using ensemble support vector machine |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6016146/ https://www.ncbi.nlm.nih.gov/pubmed/29940836 http://dx.doi.org/10.1186/s12859-018-2249-4 |
work_keys_str_mv | AT ningqiao detectingsuccinylationsitesfromproteinsequencesusingensemblesupportvectormachine AT zhaoxiaosa detectingsuccinylationsitesfromproteinsequencesusingensemblesupportvectormachine AT baolingling detectingsuccinylationsitesfromproteinsequencesusingensemblesupportvectormachine AT mazhiqiang detectingsuccinylationsitesfromproteinsequencesusingensemblesupportvectormachine AT zhaoxiaowei detectingsuccinylationsitesfromproteinsequencesusingensemblesupportvectormachine |
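
Note on the reported metrics: the description field gives both an accuracy and an MCC (Matthews correlation coefficient) for each evaluation. The sketch below shows how the two are computed from a binary confusion matrix, and why a high accuracy can coexist with a low MCC when negatives far outnumber positives. The function names and the confusion-matrix counts are hypothetical illustrations; they are not taken from the article or from its PSuccE code.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical imbalanced test set (NOT the article's data):
# 50 true sites, 950 non-sites.
tp, fn = 10, 40    # most true sites are missed
fp, tn = 20, 930   # most non-sites are rejected correctly

print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

With these made-up counts the accuracy is 0.94 while the MCC stays below 0.25, because accuracy is dominated by the large negative class and MCC is not. Whether the same effect explains the gap reported in the record cannot be determined from the abstract alone.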