SSMFN: a fused spatial and sequential deep learning model for methylation site prediction
BACKGROUND: Conventional in vivo methods for post-translational modification site prediction, such as spectrophotometry, Western blotting, and chromatin immunoprecipitation, can be very expensive and time-consuming. Neural networks (NN) are one of the computational approaches that can predict effecti...
Main Authors: | Lumbanraja, Favorisen Rosyking; Mahesworo, Bharuno; Cenggoro, Tjeng Wawan; Sudigyo, Digdo; Pardamean, Bens |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2021 |
Subjects: | Bioinformatics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8409337/ https://www.ncbi.nlm.nih.gov/pubmed/34541311 http://dx.doi.org/10.7717/peerj-cs.683 |
_version_ | 1783746977779941376 |
---|---|
author | Lumbanraja, Favorisen Rosyking; Mahesworo, Bharuno; Cenggoro, Tjeng Wawan; Sudigyo, Digdo; Pardamean, Bens |
author_facet | Lumbanraja, Favorisen Rosyking; Mahesworo, Bharuno; Cenggoro, Tjeng Wawan; Sudigyo, Digdo; Pardamean, Bens |
author_sort | Lumbanraja, Favorisen Rosyking |
collection | PubMed |
description | BACKGROUND: Conventional in vivo methods for post-translational modification site prediction, such as spectrophotometry, Western blotting, and chromatin immunoprecipitation, can be very expensive and time-consuming. Neural networks (NN) are one of the computational approaches that can effectively predict post-translational modification sites. We developed a neural network model, the Sequential and Spatial Methylation Fusion Network (SSMFN), to predict possible methylation sites on protein sequences. METHOD: We designed our model to extract spatial and sequential information from amino acid sequences. A convolutional neural network (CNN) is applied to harness spatial information, while long short-term memory (LSTM) is applied for sequential data. The latent representations of the CNN and LSTM branches are then fused. Afterwards, we compared the performance of our proposed model to state-of-the-art methylation site prediction models on the balanced and imbalanced datasets. RESULTS: Our model appeared to be better in almost all measurements when trained on the balanced training dataset. On the imbalanced training dataset, all of the models gave better performance since they were trained on more data. In several metrics, our model also surpasses the PRMePred model, which requires laborious feature extraction and selection. CONCLUSION: Our models achieved the best performance across different environments in almost all measurements. Our results also suggest that an NN model trained on a balanced training dataset and tested on an imbalanced dataset will offer high specificity and low sensitivity. Thus, an NN model for methylation site prediction should be trained on an imbalanced dataset, since in actual applications there are far more negative samples than positive samples. |
format | Online Article Text |
id | pubmed-8409337 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8409337 2021-09-17 SSMFN: a fused spatial and sequential deep learning model for methylation site prediction Lumbanraja, Favorisen Rosyking Mahesworo, Bharuno Cenggoro, Tjeng Wawan Sudigyo, Digdo Pardamean, Bens PeerJ Comput Sci Bioinformatics BACKGROUND: Conventional in vivo methods for post-translational modification site prediction, such as spectrophotometry, Western blotting, and chromatin immunoprecipitation, can be very expensive and time-consuming. Neural networks (NN) are one of the computational approaches that can effectively predict post-translational modification sites. We developed a neural network model, the Sequential and Spatial Methylation Fusion Network (SSMFN), to predict possible methylation sites on protein sequences. METHOD: We designed our model to extract spatial and sequential information from amino acid sequences. A convolutional neural network (CNN) is applied to harness spatial information, while long short-term memory (LSTM) is applied for sequential data. The latent representations of the CNN and LSTM branches are then fused. Afterwards, we compared the performance of our proposed model to state-of-the-art methylation site prediction models on the balanced and imbalanced datasets. RESULTS: Our model appeared to be better in almost all measurements when trained on the balanced training dataset. On the imbalanced training dataset, all of the models gave better performance since they were trained on more data. In several metrics, our model also surpasses the PRMePred model, which requires laborious feature extraction and selection. CONCLUSION: Our models achieved the best performance across different environments in almost all measurements. Our results also suggest that an NN model trained on a balanced training dataset and tested on an imbalanced dataset will offer high specificity and low sensitivity. Thus, an NN model for methylation site prediction should be trained on an imbalanced dataset, since in actual applications there are far more negative samples than positive samples. PeerJ Inc. 2021-08-26 /pmc/articles/PMC8409337/ /pubmed/34541311 http://dx.doi.org/10.7717/peerj-cs.683 Text en ©2021 Lumbanraja et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Bioinformatics Lumbanraja, Favorisen Rosyking Mahesworo, Bharuno Cenggoro, Tjeng Wawan Sudigyo, Digdo Pardamean, Bens SSMFN: a fused spatial and sequential deep learning model for methylation site prediction |
title | SSMFN: a fused spatial and sequential deep learning model for methylation site prediction |
title_full | SSMFN: a fused spatial and sequential deep learning model for methylation site prediction |
title_fullStr | SSMFN: a fused spatial and sequential deep learning model for methylation site prediction |
title_full_unstemmed | SSMFN: a fused spatial and sequential deep learning model for methylation site prediction |
title_short | SSMFN: a fused spatial and sequential deep learning model for methylation site prediction |
title_sort | ssmfn: a fused spatial and sequential deep learning model for methylation site prediction |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8409337/ https://www.ncbi.nlm.nih.gov/pubmed/34541311 http://dx.doi.org/10.7717/peerj-cs.683 |
work_keys_str_mv | AT lumbanrajafavorisenrosyking ssmfnafusedspatialandsequentialdeeplearningmodelformethylationsiteprediction AT mahesworobharuno ssmfnafusedspatialandsequentialdeeplearningmodelformethylationsiteprediction AT cenggorotjengwawan ssmfnafusedspatialandsequentialdeeplearningmodelformethylationsiteprediction AT sudigyodigdo ssmfnafusedspatialandsequentialdeeplearningmodelformethylationsiteprediction AT pardameanbens ssmfnafusedspatialandsequentialdeeplearningmodelformethylationsiteprediction |
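
The abstract's conclusion turns on sensitivity and specificity measured on an imbalanced test set. As a minimal sketch of how those two metrics are computed, the Python below uses a hypothetical helper `sensitivity_specificity` and made-up labels; none of the names or numbers come from the article, and this is not the authors' evaluation code.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = methylated site)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Guard against division by zero when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical imbalanced test set: 2 positive sites, 8 negative sites.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical predictions from a classifier that rarely predicts "positive".
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy case the classifier misses one of the two positive sites while rejecting every negative, which is the high-specificity, low-sensitivity pattern the conclusion describes.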