
Integration of A Deep Learning Classifier with A Random Forest Approach for Predicting Malonylation Sites

As a newly-identified protein post-translational modification, malonylation is involved in a variety of biological functions. Recognizing malonylation sites in substrates represents an initial but crucial step in elucidating the molecular mechanisms underlying protein malonylation. In this study, we...

Bibliographic Details
Main authors: Chen, Zhen; He, Ningning; Huang, Yu; Qin, Wen Tao; Liu, Xuhan; Li, Lei
Format: Online Article Text
Language: English
Published: Elsevier 2018
Subjects: Method
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6411950/
https://www.ncbi.nlm.nih.gov/pubmed/30639696
http://dx.doi.org/10.1016/j.gpb.2018.08.004
author Chen, Zhen
He, Ningning
Huang, Yu
Qin, Wen Tao
Liu, Xuhan
Li, Lei
collection PubMed
description As a newly-identified protein post-translational modification, malonylation is involved in a variety of biological functions. Recognizing malonylation sites in substrates represents an initial but crucial step in elucidating the molecular mechanisms underlying protein malonylation. In this study, we constructed a deep learning (DL) network classifier based on long short-term memory (LSTM) with word embedding (LSTM(WE)) for the prediction of mammalian malonylation sites. LSTM(WE) performs better than traditional classifiers developed with common pre-defined feature encodings or a DL classifier based on LSTM with a one-hot vector. The performance of LSTM(WE) is sensitive to the size of the training set, but this limitation can be overcome by integration with a traditional machine learning (ML) classifier. Accordingly, an integrated approach called LEMP was developed, which includes LSTM(WE) and the random forest classifier with a novel encoding of enhanced amino acid content. LEMP outperforms not only the individual classifiers but also the currently-available malonylation predictors. Additionally, it demonstrates promising performance with a low false positive rate, which is highly useful in prediction applications. Overall, LEMP is a useful tool for easily identifying malonylation sites with high confidence. LEMP is available at http://www.bioinfogo.org/lemp.
format Online
Article
Text
id pubmed-6411950
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-6411950 2019-03-22 Integration of A Deep Learning Classifier with A Random Forest Approach for Predicting Malonylation Sites Chen, Zhen; He, Ningning; Huang, Yu; Qin, Wen Tao; Liu, Xuhan; Li, Lei Genomics Proteomics Bioinformatics Method Elsevier 2018-12 2019-01-11 /pmc/articles/PMC6411950/ /pubmed/30639696 http://dx.doi.org/10.1016/j.gpb.2018.08.004 Text en © 2019 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Integration of A Deep Learning Classifier with A Random Forest Approach for Predicting Malonylation Sites
topic Method
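The abstract contrasts three ways of representing a peptide for a classifier: a one-hot vector, token indices fed to a word-embedding layer, and an amino acid content encoding. The sketch below illustrates those ideas in plain Python and is not the authors' implementation. It assumes candidate sites are lysine (K) residues, and the window size, the padding symbol `X`, and all function names are hypothetical. The `composition` function is a plain residue-frequency vector, standing in for the "enhanced amino acid content" encoding, whose details the abstract does not give.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # hypothetical padding symbol for windows that run past the sequence ends
INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}  # index 0 is reserved for PAD


def lysine_windows(seq, flank=3):
    """Yield (position, window) for every lysine, with the K at the window center."""
    padded = PAD * flank + seq + PAD * flank
    for i, aa in enumerate(seq):
        if aa == "K":
            yield i, padded[i:i + 2 * flank + 1]


def one_hot(window):
    """One 20-dimensional 0/1 row per residue; padding becomes an all-zero row."""
    return [[1 if aa == a else 0 for a in AMINO_ACIDS] for aa in window]


def token_ids(window):
    """Integer tokens of the kind an embedding layer would look up (0 = padding)."""
    return [INDEX.get(aa, 0) for aa in window]


def composition(window):
    """Fraction of each standard residue in the window, ignoring padding."""
    counts = Counter(aa for aa in window if aa in INDEX)
    total = sum(counts.values()) or 1  # avoid division by zero on all-padding windows
    return [counts[a] / total for a in AMINO_ACIDS]


if __name__ == "__main__":
    for pos, win in lysine_windows("MKAK", flank=2):
        print(pos, win, token_ids(win))
```

In a setup like the one the abstract describes, the token indices would feed a sequence model and the composition vector a tree-based model. How LEMP actually combines the two classifiers is not stated in this record, so that step is left out here.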