
Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm

The exploration of DNA-binding proteins (DBPs) is an important aspect of studying biological life activities. Research on life activities requires the support of scientific research results on DBPs. The decline in many life activities is closely related to DBPs. Generally, the detection method for i...


Bibliographic Details
Main authors: Zhao, Ziye; Yang, Wen; Zhai, Yixiao; Liang, Yingjian; Zhao, Yuming
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8837382/
https://www.ncbi.nlm.nih.gov/pubmed/35154264
http://dx.doi.org/10.3389/fgene.2021.821996
_version_ 1784649897129017344
author Zhao, Ziye
Yang, Wen
Zhai, Yixiao
Liang, Yingjian
Zhao, Yuming
author_facet Zhao, Ziye
Yang, Wen
Zhai, Yixiao
Liang, Yingjian
Zhao, Yuming
author_sort Zhao, Ziye
collection PubMed
description The exploration of DNA-binding proteins (DBPs) is an important aspect of studying biological life activities. Research on life activities requires the support of scientific research results on DBPs. The decline in many life activities is closely related to DBPs. Generally, the detection method for identifying DBPs is achieved through biochemical experiments. This method is inefficient and requires considerable manpower, material resources and time. At present, several computational approaches have been developed to detect DBPs, among which machine learning (ML) algorithm-based computational techniques have shown excellent performance. In our experiments, our method uses fewer features and simpler recognition methods than other methods and simultaneously obtains satisfactory results. First, we use six feature extraction methods to extract sequence features from the same group of DBPs. Then, this feature information is spliced together, and the data are standardized. Finally, the extreme gradient boosting (XGBoost) model is used to construct an effective predictive model. Compared with other excellent methods, our proposed method has achieved better results. The accuracy achieved by our method is 78.26% for PDB2272 and 85.48% for PDB186. The accuracy of the experimental results achieved by our strategy is similar to that of previous detection methods.
format Online
Article
Text
id pubmed-8837382
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8837382 2022-02-12 Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm Zhao, Ziye Yang, Wen Zhai, Yixiao Liang, Yingjian Zhao, Yuming Front Genet Genetics The exploration of DNA-binding proteins (DBPs) is an important aspect of studying biological life activities. Research on life activities requires the support of scientific research results on DBPs. The decline in many life activities is closely related to DBPs. Generally, the detection method for identifying DBPs is achieved through biochemical experiments. This method is inefficient and requires considerable manpower, material resources and time. At present, several computational approaches have been developed to detect DBPs, among which machine learning (ML) algorithm-based computational techniques have shown excellent performance. In our experiments, our method uses fewer features and simpler recognition methods than other methods and simultaneously obtains satisfactory results. First, we use six feature extraction methods to extract sequence features from the same group of DBPs. Then, this feature information is spliced together, and the data are standardized. Finally, the extreme gradient boosting (XGBoost) model is used to construct an effective predictive model. Compared with other excellent methods, our proposed method has achieved better results. The accuracy achieved by our method is 78.26% for PDB2272 and 85.48% for PDB186. The accuracy of the experimental results achieved by our strategy is similar to that of previous detection methods. Frontiers Media S.A. 2022-01-28 /pmc/articles/PMC8837382/ /pubmed/35154264 http://dx.doi.org/10.3389/fgene.2021.821996 Text en Copyright © 2022 Zhao, Yang, Zhai, Liang and Zhao. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Zhao, Ziye
Yang, Wen
Zhai, Yixiao
Liang, Yingjian
Zhao, Yuming
Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm
title Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm
title_full Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm
title_fullStr Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm
title_full_unstemmed Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm
title_short Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm
title_sort identify dna-binding proteins through the extreme gradient boosting algorithm
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8837382/
https://www.ncbi.nlm.nih.gov/pubmed/35154264
http://dx.doi.org/10.3389/fgene.2021.821996
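The description field in this record outlines a three-step pipeline: extract several sequence feature sets, splice them together and standardize them, then train an XGBoost model. The following is a minimal sketch of the first two steps only, using the Python standard library. The record does not name the six feature extraction methods, so the two extractors here (`amino_acid_composition`, `charge_features`), the toy sequences, and all function names are hypothetical stand-ins and not the authors' implementation.

```python
from collections import Counter
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Hypothetical extractor 1: fraction of each standard residue (20 values)."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def charge_features(seq):
    """Hypothetical extractor 2: fractions of K/R and of D/E residues (2 values)."""
    basic = sum(seq.count(a) for a in "KR") / len(seq)
    acidic = sum(seq.count(a) for a in "DE") / len(seq)
    return [basic, acidic]


# Assumption: the paper uses six extractors; two are enough to show the idea.
EXTRACTORS = [amino_acid_composition, charge_features]


def extract(seq, extractors=EXTRACTORS):
    """Splice the outputs of all extractors into one feature vector."""
    features = []
    for extractor in extractors:
        features.extend(extractor(seq))
    return features


def standardize(rows):
    """Column-wise z-score; constant columns are mapped to 0.0."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, sds)]
        for row in rows
    ]


# Toy sequences, invented for illustration only.
sequences = ["MKRKAAD", "ACDEFGH", "KKKRRRA"]
X = standardize([extract(s) for s in sequences])
```

The standardized matrix `X`, paired with binding and non-binding labels, could then be passed to a gradient-boosting classifier such as `xgboost.XGBClassifier`. That step is left out here so the sketch runs without third-party packages.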