
iProDNA-CapsNet: identifying protein-DNA binding residues using capsule neural networks

BACKGROUND: Since protein-DNA interactions are essential to diverse biological events, accurately locating DNA-binding residues is necessary. This biological issue, however, is currently a challenging task in the post-genomic age, where data on protein sequences have...


Bibliographic Details
Main authors: Nguyen, Binh P., Nguyen, Quang H., Doan-Ngoc, Giang-Nam, Nguyen-Vo, Thanh-Hoang, Rahardja, Susanto
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933727/
https://www.ncbi.nlm.nih.gov/pubmed/31881828
http://dx.doi.org/10.1186/s12859-019-3295-2
_version_ 1783483267799842816
author Nguyen, Binh P.
Nguyen, Quang H.
Doan-Ngoc, Giang-Nam
Nguyen-Vo, Thanh-Hoang
Rahardja, Susanto
collection PubMed
description BACKGROUND: Since protein-DNA interactions are essential to diverse biological events, accurately locating DNA-binding residues is necessary. This biological issue, however, is currently a challenging task in the post-genomic age, where data on protein sequences have grown rapidly. In this study, we propose iProDNA-CapsNet, a new prediction model that identifies protein-DNA binding residues using an ensemble of capsule neural networks (CapsNets) on position-specific scoring matrix (PSSM) profiles. The use of CapsNets promises an innovative approach to determining the location of DNA-binding residues. The benchmark datasets introduced by Hu et al. (2017), i.e., PDNA-543 and PDNA-TEST, were used to train and evaluate the model, respectively. To fairly assess the model performance, a comparative analysis between iProDNA-CapsNet and existing state-of-the-art methods was performed. RESULTS: Under the decision threshold corresponding to a false positive rate (FPR) ≈ 5%, the accuracy, sensitivity, precision, and Matthews correlation coefficient (MCC) of our model are increased by about 2.0%, 2.0%, 14.0%, and 5.0% with respect to TargetDNA (Hu et al., 2017) and 1.0%, 75.0%, 45.0%, and 77.0% with respect to BindN+ (Wang et al., 2010), respectively. With regard to other methods that do not report their threshold settings, iProDNA-CapsNet also shows a significant improvement in performance on most of the evaluation metrics. Even with different patterns of change among the models, iProDNA-CapsNet remains the best model, with top performance in most of the metrics, especially MCC, which is boosted by about 8.0% to 220.0%. CONCLUSIONS: According to all evaluation metrics under various decision thresholds, iProDNA-CapsNet shows better performance than the two current best models (BindN and TargetDNA). Our proposed approach also shows that CapsNet can potentially be used and adopted in other biological applications.
format Online
Article
Text
id pubmed-6933727
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6933727 2019-12-30 iProDNA-CapsNet: identifying protein-DNA binding residues using capsule neural networks Nguyen, Binh P. Nguyen, Quang H. Doan-Ngoc, Giang-Nam Nguyen-Vo, Thanh-Hoang Rahardja, Susanto BMC Bioinformatics Research BACKGROUND: Since protein-DNA interactions are essential to diverse biological events, accurately locating DNA-binding residues is necessary. This biological issue, however, is currently a challenging task in the post-genomic age, where data on protein sequences have grown rapidly. In this study, we propose iProDNA-CapsNet, a new prediction model that identifies protein-DNA binding residues using an ensemble of capsule neural networks (CapsNets) on position-specific scoring matrix (PSSM) profiles. The use of CapsNets promises an innovative approach to determining the location of DNA-binding residues. The benchmark datasets introduced by Hu et al. (2017), i.e., PDNA-543 and PDNA-TEST, were used to train and evaluate the model, respectively. To fairly assess the model performance, a comparative analysis between iProDNA-CapsNet and existing state-of-the-art methods was performed. RESULTS: Under the decision threshold corresponding to a false positive rate (FPR) ≈ 5%, the accuracy, sensitivity, precision, and Matthews correlation coefficient (MCC) of our model are increased by about 2.0%, 2.0%, 14.0%, and 5.0% with respect to TargetDNA (Hu et al., 2017) and 1.0%, 75.0%, 45.0%, and 77.0% with respect to BindN+ (Wang et al., 2010), respectively. With regard to other methods that do not report their threshold settings, iProDNA-CapsNet also shows a significant improvement in performance on most of the evaluation metrics. Even with different patterns of change among the models, iProDNA-CapsNet remains the best model, with top performance in most of the metrics, especially MCC, which is boosted by about 8.0% to 220.0%. CONCLUSIONS: According to all evaluation metrics under various decision thresholds, iProDNA-CapsNet shows better performance than the two current best models (BindN and TargetDNA). Our proposed approach also shows that CapsNet can potentially be used and adopted in other biological applications. BioMed Central 2019-12-27 /pmc/articles/PMC6933727/ /pubmed/31881828 http://dx.doi.org/10.1186/s12859-019-3295-2 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title iProDNA-CapsNet: identifying protein-DNA binding residues using capsule neural networks
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933727/
https://www.ncbi.nlm.nih.gov/pubmed/31881828
http://dx.doi.org/10.1186/s12859-019-3295-2