DNAPred_Prot: Identification of DNA-Binding Proteins Using Composition- and Position-Based Features
In genome annotation, identifying DNA-binding proteins is one of the crucial challenges. DNA is considered the blueprint of the cell. It contains all the information necessary for building and maintaining the traits of an organism. It is DNA that makes a living thing a living th...
Main Authors: | Barukab, Omar; Khan, Yaser Daanial; Khan, Sher Afzal; Chou, Kuo-Chen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9020926/ https://www.ncbi.nlm.nih.gov/pubmed/35465187 http://dx.doi.org/10.1155/2022/5483115 |
_version_ | 1784689676579241984 |
---|---|
author | Barukab, Omar Khan, Yaser Daanial Khan, Sher Afzal Chou, Kuo-Chen |
author_facet | Barukab, Omar Khan, Yaser Daanial Khan, Sher Afzal Chou, Kuo-Chen |
author_sort | Barukab, Omar |
collection | PubMed |
description | In genome annotation, identifying DNA-binding proteins is one of the crucial challenges. DNA is considered the blueprint of the cell. It contains all the information necessary for building and maintaining the traits of an organism. It is DNA that makes a living thing a living thing. Protein interaction with DNA plays an essential role in regulating DNA functions such as DNA repair, transcription, and regulation. Identifying these proteins is therefore crucial for understanding gene regulation. Several structure-based and sequence-based methods have been developed to identify DNA-protein binding sites, but they are costly and time-consuming. Therefore, we propose a methodology named “DNAPred_Prot”, which uses various position- and frequency-dependent features from protein sequences for efficient and effective prediction of DNA-binding proteins. Under 10-fold cross-validation and jackknife testing, it yielded accuracies of 94.95% and 95.11%, respectively. The results of SVM and ANN were also compared with those of a random forest classifier. The robustness of the proposed model was evaluated on the independent dataset PDB186, where it achieved an accuracy of 91.47%. From these results, it can be concluded that the suggested methodology performs better than other existing methods for the identification of DNA-binding proteins. |
format | Online Article Text |
id | pubmed-9020926 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-90209262022-04-21 DNAPred_Prot: Identification of DNA-Binding Proteins Using Composition- and Position-Based Features Barukab, Omar Khan, Yaser Daanial Khan, Sher Afzal Chou, Kuo-Chen Appl Bionics Biomech Research Article In the domain of genome annotation, the identification of DNA-binding protein is one of the crucial challenges. DNA is considered a blueprint for the cell. It contained all necessary information for building and maintaining the trait of an organism. It is DNA, which makes a living thing, a living thing. Protein interaction with DNA performs an essential role in regulating DNA functions such as DNA repair, transcription, and regulation. Identification of these proteins is a crucial task for understanding the regulation of genes. Several methods have been developed to identify the binding sites of DNA and protein depending upon the structures and sequences, but they were costly and time-consuming. Therefore, we propose a methodology named “DNAPred_Prot”, which uses various position and frequency-dependent features from protein sequences for efficient and effective prediction of DNA-binding proteins. Using testing techniques like 10-fold cross-validation and jackknife testing an accuracy of 94.95% and 95.11% was yielded, respectively. The results of SVM and ANN were also compared with those of a random forest classifier. The robustness of the proposed model was evaluated by using the independent dataset PDB186, and an accuracy of 91.47% was achieved by it. From these results, it can be predicted that the suggested methodology performs better than other extant methods for the identification of DNA-binding proteins. Hindawi 2022-04-13 /pmc/articles/PMC9020926/ /pubmed/35465187 http://dx.doi.org/10.1155/2022/5483115 Text en Copyright © 2022 Omar Barukab et al. 
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Barukab, Omar Khan, Yaser Daanial Khan, Sher Afzal Chou, Kuo-Chen DNAPred_Prot: Identification of DNA-Binding Proteins Using Composition- and Position-Based Features |
title | DNAPred_Prot: Identification of DNA-Binding Proteins Using Composition- and Position-Based Features |
title_full | DNAPred_Prot: Identification of DNA-Binding Proteins Using Composition- and Position-Based Features |
title_fullStr | DNAPred_Prot: Identification of DNA-Binding Proteins Using Composition- and Position-Based Features |
title_full_unstemmed | DNAPred_Prot: Identification of DNA-Binding Proteins Using Composition- and Position-Based Features |
title_short | DNAPred_Prot: Identification of DNA-Binding Proteins Using Composition- and Position-Based Features |
title_sort | dnapred_prot: identification of dna-binding proteins using composition- and position-based features |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9020926/ https://www.ncbi.nlm.nih.gov/pubmed/35465187 http://dx.doi.org/10.1155/2022/5483115 |
work_keys_str_mv | AT barukabomar dnapredprotidentificationofdnabindingproteinsusingcompositionandpositionbasedfeatures AT khanyaserdaanial dnapredprotidentificationofdnabindingproteinsusingcompositionandpositionbasedfeatures AT khansherafzal dnapredprotidentificationofdnabindingproteinsusingcompositionandpositionbasedfeatures AT choukuochen dnapredprotidentificationofdnabindingproteinsusingcompositionandpositionbasedfeatures |