iDNA-Prot|dis: Identifying DNA-Binding Proteins by Incorporating Amino Acid Distance-Pairs and Reduced Alphabet Profile into the General Pseudo Amino Acid Composition

Playing crucial roles in various cellular processes, such as recognition of specific nucleotide sequences, regulation of transcription, and regulation of gene expression, DNA-binding proteins are essential ingredients for both eukaryotic and prokaryotic proteomes. With the avalanche of protein seque...

Bibliographic Details
Main Authors: Liu, Bin, Xu, Jinghao, Lan, Xun, Xu, Ruifeng, Zhou, Jiyun, Wang, Xiaolong, Chou, Kuo-Chen
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4153653/
https://www.ncbi.nlm.nih.gov/pubmed/25184541
http://dx.doi.org/10.1371/journal.pone.0106691
_version_ 1782333319782858752
author Liu, Bin
Xu, Jinghao
Lan, Xun
Xu, Ruifeng
Zhou, Jiyun
Wang, Xiaolong
Chou, Kuo-Chen
author_facet Liu, Bin
Xu, Jinghao
Lan, Xun
Xu, Ruifeng
Zhou, Jiyun
Wang, Xiaolong
Chou, Kuo-Chen
author_sort Liu, Bin
collection PubMed
description Playing crucial roles in various cellular processes, such as recognition of specific nucleotide sequences, regulation of transcription, and regulation of gene expression, DNA-binding proteins are essential ingredients for both eukaryotic and prokaryotic proteomes. With the avalanche of protein sequences generated in the postgenomic age, it is a critical challenge to develop automated methods for accurately and rapidly identifying DNA-binding proteins based on their sequence information alone. Here, a novel predictor, called “iDNA-Prot|dis”, was established by incorporating the amino acid distance-pair coupling information and the amino acid reduced alphabet profile into the general pseudo amino acid composition (PseAAC) vector. The former can capture the characteristics of DNA-binding proteins so as to enhance the prediction quality, while the latter can reduce the dimension of the PseAAC vector so as to speed up the prediction process. Rigorous jackknife and independent dataset tests showed that the new predictor outperformed the existing predictors for the same purpose. As a user-friendly web-server, iDNA-Prot|dis is accessible to the public at http://bioinformatics.hitsz.edu.cn/iDNA-Prot_dis/. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step protocol guide is provided on how to use the web-server to get their desired results without the need to follow the complicated mathematical equations that are presented in this paper just for the integrity of its development process. It is anticipated that the iDNA-Prot|dis predictor may become a useful high-throughput tool for large-scale analysis of DNA-binding proteins, or at the very least, play a complementary role to the existing predictors in this regard.
format Online
Article
Text
id pubmed-4153653
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4153653 2014-09-05 iDNA-Prot|dis: Identifying DNA-Binding Proteins by Incorporating Amino Acid Distance-Pairs and Reduced Alphabet Profile into the General Pseudo Amino Acid Composition Liu, Bin Xu, Jinghao Lan, Xun Xu, Ruifeng Zhou, Jiyun Wang, Xiaolong Chou, Kuo-Chen PLoS One Research Article Playing crucial roles in various cellular processes, such as recognition of specific nucleotide sequences, regulation of transcription, and regulation of gene expression, DNA-binding proteins are essential ingredients for both eukaryotic and prokaryotic proteomes. With the avalanche of protein sequences generated in the postgenomic age, it is a critical challenge to develop automated methods for accurately and rapidly identifying DNA-binding proteins based on their sequence information alone. Here, a novel predictor, called “iDNA-Prot|dis”, was established by incorporating the amino acid distance-pair coupling information and the amino acid reduced alphabet profile into the general pseudo amino acid composition (PseAAC) vector. The former can capture the characteristics of DNA-binding proteins so as to enhance the prediction quality, while the latter can reduce the dimension of the PseAAC vector so as to speed up the prediction process. Rigorous jackknife and independent dataset tests showed that the new predictor outperformed the existing predictors for the same purpose. As a user-friendly web-server, iDNA-Prot|dis is accessible to the public at http://bioinformatics.hitsz.edu.cn/iDNA-Prot_dis/. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step protocol guide is provided on how to use the web-server to get their desired results without the need to follow the complicated mathematical equations that are presented in this paper just for the integrity of its development process. It is anticipated that the iDNA-Prot|dis predictor may become a useful high-throughput tool for large-scale analysis of DNA-binding proteins, or at the very least, play a complementary role to the existing predictors in this regard. Public Library of Science 2014-09-03 /pmc/articles/PMC4153653/ /pubmed/25184541 http://dx.doi.org/10.1371/journal.pone.0106691 Text en © 2014 Liu et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Liu, Bin
Xu, Jinghao
Lan, Xun
Xu, Ruifeng
Zhou, Jiyun
Wang, Xiaolong
Chou, Kuo-Chen
iDNA-Prot|dis: Identifying DNA-Binding Proteins by Incorporating Amino Acid Distance-Pairs and Reduced Alphabet Profile into the General Pseudo Amino Acid Composition
title iDNA-Prot|dis: Identifying DNA-Binding Proteins by Incorporating Amino Acid Distance-Pairs and Reduced Alphabet Profile into the General Pseudo Amino Acid Composition
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4153653/
https://www.ncbi.nlm.nih.gov/pubmed/25184541
http://dx.doi.org/10.1371/journal.pone.0106691
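
The description above says the predictor combines amino acid distance-pair information with a reduced alphabet to keep the feature vector small. The following is a minimal Python sketch of that general idea, not the authors' implementation. The six-group alphabet in `GROUPS`, the function names, and the `max_distance` parameter are all assumptions made for illustration; the record does not state the grouping or the distance range actually used.

```python
from itertools import product

# Hypothetical reduced alphabet: an illustrative grouping only,
# NOT the grouping used in the paper.
GROUPS = {
    "AVLIMC": "a",  # aliphatic / hydrophobic
    "FWYH": "r",    # aromatic
    "STNQ": "p",    # polar, uncharged
    "KR": "k",      # positively charged
    "DE": "d",      # negatively charged
    "GP": "g",      # glycine and proline
}
REDUCE = {aa: sym for members, sym in GROUPS.items() for aa in members}
ALPHABET = sorted(set(REDUCE.values()))


def reduce_sequence(seq):
    """Map each standard residue to its group symbol; skip unknown letters."""
    return "".join(REDUCE[aa] for aa in seq.upper() if aa in REDUCE)


def distance_pair_features(seq, max_distance=3):
    """Return group composition followed by pair frequencies at each distance."""
    reduced = reduce_sequence(seq)
    n = len(reduced)

    # Block 1: frequency of each reduced-alphabet symbol.
    features = [reduced.count(sym) / n if n else 0.0 for sym in ALPHABET]

    # Blocks 2..: for each distance d, frequency of every ordered symbol pair
    # (x, y) where y occurs d positions after x.
    pairs = [a + b for a, b in product(ALPHABET, repeat=2)]
    for d in range(1, max_distance + 1):
        total = max(n - d, 0)
        counts = dict.fromkeys(pairs, 0)
        for i in range(total):
            counts[reduced[i] + reduced[i + d]] += 1
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features


if __name__ == "__main__":
    vec = distance_pair_features("MKRAVLDE", max_distance=3)
    print(len(vec))
```

With an alphabet of size n and a maximum distance D, this vector has n + n²·D entries, so shrinking the alphabet from 20 letters to a handful of groups cuts the dimension sharply, which is the speed-up the description attributes to the reduced alphabet profile. A vector like this could then be passed to any standard classifier.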