A deep learning method to more accurately recall known lysine acetylation sites
BACKGROUND: Lysine acetylation in protein is one of the most important post-translational modifications (PTMs). It plays an important role in essential biological processes and is related to various diseases. To obtain a comprehensive understanding of regulatory mechanism of lysine acetylation, the...
Main authors: | Wu, Meiqi; Yang, Yingxi; Wang, Hui; Xu, Yan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6343287/ https://www.ncbi.nlm.nih.gov/pubmed/30674277 http://dx.doi.org/10.1186/s12859-019-2632-9 |
_version_ | 1783389255043645440 |
---|---|
author | Wu, Meiqi Yang, Yingxi Wang, Hui Xu, Yan |
author_sort | Wu, Meiqi |
collection | PubMed |
description | BACKGROUND: Lysine acetylation in protein is one of the most important post-translational modifications (PTMs). It plays an important role in essential biological processes and is related to various diseases. To obtain a comprehensive understanding of regulatory mechanism of lysine acetylation, the key is to identify lysine acetylation sites. Previously, several shallow machine learning algorithms had been applied to predict lysine modification sites in proteins. However, shallow machine learning has some disadvantages. For instance, it is not as effective as deep learning for processing big data. RESULTS: In this work, a novel predictor named DeepAcet was developed to predict acetylation sites. Six encoding schemes were adopted, including a one-hot, BLOSUM62 matrix, a composition of K-space amino acid pairs, information gain, physicochemical properties, and a position specific scoring matrix to represent the modified residues. A multilayer perceptron (MLP) was utilized to construct a model to predict lysine acetylation sites in proteins with many different features. We also integrated all features and implemented the feature selection method to select a feature set that contained 2199 features. As a result, the best prediction achieved 84.95% accuracy, 83.45% specificity, 86.44% sensitivity, 0.8540 AUC, and 0.6993 MCC in a 10-fold cross-validation. For an independent test set, the prediction achieved 84.87% accuracy, 83.46% specificity, 86.28% sensitivity, 0.8407 AUC, and 0.6977 MCC. CONCLUSION: The predictive performance of our DeepAcet is better than that of other existing methods. DeepAcet can be freely downloaded from https://github.com/Sunmile/DeepAcet. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2632-9) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6343287 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6343287 2019-01-24 A deep learning method to more accurately recall known lysine acetylation sites Wu, Meiqi Yang, Yingxi Wang, Hui Xu, Yan BMC Bioinformatics Research Article BioMed Central 2019-01-23 /pmc/articles/PMC6343287/ /pubmed/30674277 http://dx.doi.org/10.1186/s12859-019-2632-9 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | A deep learning method to more accurately recall known lysine acetylation sites |
title_sort | deep learning method to more accurately recall known lysine acetylation sites |
topic | Research Article |