KK-DBP: A Multi-Feature Fusion Method for DNA-Binding Protein Identification Based on Random Forest

DNA-binding protein (DBP) is a protein with a special DNA binding domain that is associated with many important molecular biological mechanisms. Rapid development of computational methods has made it possible to predict DBP on a large scale; however, existing methods do not fully integrate DBP-relat...

Bibliographic Details
Main Authors: Jia, Yuran, Huang, Shan, Zhang, Tianjiao
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8667860/
https://www.ncbi.nlm.nih.gov/pubmed/34912382
http://dx.doi.org/10.3389/fgene.2021.811158
_version_ 1784614446413381632
author Jia, Yuran
Huang, Shan
Zhang, Tianjiao
author_facet Jia, Yuran
Huang, Shan
Zhang, Tianjiao
author_sort Jia, Yuran
collection PubMed
description DNA-binding protein (DBP) is a protein with a special DNA binding domain that is associated with many important molecular biological mechanisms. Rapid development of computational methods has made it possible to predict DBP on a large scale; however, existing methods do not fully integrate DBP-related features, resulting in rough prediction results. In this article, we develop a DNA-binding protein identification method called KK-DBP. To improve prediction accuracy, we propose a feature extraction method that fuses multiple PSSM features. The experimental results show a prediction accuracy on the independent test dataset PDB186 of 81.22%, which is the highest of all existing methods.
format Online
Article
Text
id pubmed-8667860
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8667860 2021-12-14 KK-DBP: A Multi-Feature Fusion Method for DNA-Binding Protein Identification Based on Random Forest Jia, Yuran Huang, Shan Zhang, Tianjiao Front Genet Genetics DNA-binding protein (DBP) is a protein with a special DNA binding domain that is associated with many important molecular biological mechanisms. Rapid development of computational methods has made it possible to predict DBP on a large scale; however, existing methods do not fully integrate DBP-related features, resulting in rough prediction results. In this article, we develop a DNA-binding protein identification method called KK-DBP. To improve prediction accuracy, we propose a feature extraction method that fuses multiple PSSM features. The experimental results show a prediction accuracy on the independent test dataset PDB186 of 81.22%, which is the highest of all existing methods. Frontiers Media S.A. 2021-11-29 /pmc/articles/PMC8667860/ /pubmed/34912382 http://dx.doi.org/10.3389/fgene.2021.811158 Text en Copyright © 2021 Jia, Huang and Zhang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Jia, Yuran
Huang, Shan
Zhang, Tianjiao
KK-DBP: A Multi-Feature Fusion Method for DNA-Binding Protein Identification Based on Random Forest
title KK-DBP: A Multi-Feature Fusion Method for DNA-Binding Protein Identification Based on Random Forest
title_full KK-DBP: A Multi-Feature Fusion Method for DNA-Binding Protein Identification Based on Random Forest
title_fullStr KK-DBP: A Multi-Feature Fusion Method for DNA-Binding Protein Identification Based on Random Forest
title_full_unstemmed KK-DBP: A Multi-Feature Fusion Method for DNA-Binding Protein Identification Based on Random Forest
title_short KK-DBP: A Multi-Feature Fusion Method for DNA-Binding Protein Identification Based on Random Forest
title_sort kk-dbp: a multi-feature fusion method for dna-binding protein identification based on random forest
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8667860/
https://www.ncbi.nlm.nih.gov/pubmed/34912382
http://dx.doi.org/10.3389/fgene.2021.811158
work_keys_str_mv AT jiayuran kkdbpamultifeaturefusionmethodfordnabindingproteinidentificationbasedonrandomforest
AT huangshan kkdbpamultifeaturefusionmethodfordnabindingproteinidentificationbasedonrandomforest
AT zhangtianjiao kkdbpamultifeaturefusionmethodfordnabindingproteinidentificationbasedonrandomforest