
A deep learning method to more accurately recall known lysine acetylation sites

BACKGROUND: Lysine acetylation of proteins is one of the most important post-translational modifications (PTMs). It plays an important role in essential biological processes and is related to various diseases. To obtain a comprehensive understanding of the regulatory mechanism of lysine acetylation, the...


Bibliographic Details
Main authors: Wu, Meiqi; Yang, Yingxi; Wang, Hui; Xu, Yan
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6343287/
https://www.ncbi.nlm.nih.gov/pubmed/30674277
http://dx.doi.org/10.1186/s12859-019-2632-9
author Wu, Meiqi
Yang, Yingxi
Wang, Hui
Xu, Yan
collection PubMed
description BACKGROUND: Lysine acetylation of proteins is one of the most important post-translational modifications (PTMs). It plays an important role in essential biological processes and is related to various diseases. Identifying lysine acetylation sites is the key to a comprehensive understanding of the regulatory mechanism of lysine acetylation. Several shallow machine learning algorithms have previously been applied to predict lysine modification sites in proteins, but shallow methods have disadvantages; for instance, they are less effective than deep learning at processing large datasets. RESULTS: In this work, a novel predictor named DeepAcet was developed to predict acetylation sites. Six encoding schemes were adopted to represent the modified residues: one-hot encoding, the BLOSUM62 matrix, the composition of K-space amino acid pairs, information gain, physicochemical properties, and a position-specific scoring matrix. A multilayer perceptron (MLP) was used to build a model that predicts lysine acetylation sites in proteins using many different features. We also integrated all features and applied a feature selection method to select a set of 2199 features. The best prediction achieved 84.95% accuracy, 83.45% specificity, 86.44% sensitivity, 0.8540 AUC, and 0.6993 MCC in 10-fold cross-validation. On an independent test set, the prediction achieved 84.87% accuracy, 83.46% specificity, 86.28% sensitivity, 0.8407 AUC, and 0.6977 MCC. CONCLUSION: The predictive performance of DeepAcet is better than that of other existing methods. DeepAcet can be freely downloaded from https://github.com/Sunmile/DeepAcet. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2632-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6343287
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6343287 2019-01-24 A deep learning method to more accurately recall known lysine acetylation sites Wu, Meiqi; Yang, Yingxi; Wang, Hui; Xu, Yan BMC Bioinformatics Research Article BioMed Central 2019-01-23 /pmc/articles/PMC6343287/ /pubmed/30674277 http://dx.doi.org/10.1186/s12859-019-2632-9 Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Research Article