DNAPred_Prot: Identification of DNA-Binding Proteins Using Composition- and Position-Based Features

In the domain of genome annotation, the identification of DNA-binding proteins is one of the crucial challenges. DNA is considered a blueprint for the cell. It contains all the information necessary for building and maintaining the traits of an organism. It is DNA, which makes a living thing, a living th...

Bibliographic Details
Main authors: Barukab, Omar, Khan, Yaser Daanial, Khan, Sher Afzal, Chou, Kuo-Chen
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9020926/
https://www.ncbi.nlm.nih.gov/pubmed/35465187
http://dx.doi.org/10.1155/2022/5483115
_version_ 1784689676579241984
author Barukab, Omar
Khan, Yaser Daanial
Khan, Sher Afzal
Chou, Kuo-Chen
author_sort Barukab, Omar
collection PubMed
description In genome annotation, identifying DNA-binding proteins is one of the crucial challenges. DNA is considered a blueprint for the cell: it contains all the information necessary for building and maintaining the traits of an organism. It is DNA that makes a living thing a living thing. Protein interaction with DNA plays an essential role in regulating DNA functions such as DNA repair, transcription, and regulation. Identifying these proteins is therefore crucial for understanding the regulation of genes. Several methods have been developed to identify DNA-protein binding sites based on structures and sequences, but they are costly and time-consuming. Therefore, we propose a methodology named “DNAPred_Prot”, which uses various position- and frequency-dependent features from protein sequences for efficient and effective prediction of DNA-binding proteins. Under 10-fold cross-validation and jackknife testing, it yielded accuracies of 94.95% and 95.11%, respectively. The results of SVM and ANN were also compared with those of a random forest classifier. The robustness of the proposed model was evaluated on the independent dataset PDB186, on which it achieved an accuracy of 91.47%. These results suggest that the proposed methodology performs better than other existing methods for the identification of DNA-binding proteins.
format Online
Article
Text
id pubmed-9020926
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9020926 2022-04-21 DNAPred_Prot: Identification of DNA-Binding Proteins Using Composition- and Position-Based Features Barukab, Omar Khan, Yaser Daanial Khan, Sher Afzal Chou, Kuo-Chen Appl Bionics Biomech Research Article Hindawi 2022-04-13 /pmc/articles/PMC9020926/ /pubmed/35465187 http://dx.doi.org/10.1155/2022/5483115 Text en Copyright © 2022 Omar Barukab et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title DNAPred_Prot: Identification of DNA-Binding Proteins Using Composition- and Position-Based Features
title_sort dnapred_prot: identification of dna-binding proteins using composition- and position-based features
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9020926/
https://www.ncbi.nlm.nih.gov/pubmed/35465187
http://dx.doi.org/10.1155/2022/5483115