
A Threading-Based Method for the Prediction of DNA-Binding Proteins with Application to the Human Genome

Diverse mechanisms for DNA-protein recognition have been elucidated in numerous atomic complex structures from various protein families. These structural data provide an invaluable knowledge base not only for understanding DNA-protein interactions, but also for developing specialized methods that pr...


Bibliographic Details
Main authors: Gao, Mu; Skolnick, Jeffrey
Format: Text
Language: English
Published: Public Library of Science, 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2770119/
https://www.ncbi.nlm.nih.gov/pubmed/19911048
http://dx.doi.org/10.1371/journal.pcbi.1000567
author Gao, Mu
Skolnick, Jeffrey
collection PubMed
description Diverse mechanisms for DNA-protein recognition have been elucidated in numerous atomic complex structures from various protein families. These structural data provide an invaluable knowledge base not only for understanding DNA-protein interactions, but also for developing specialized methods that predict the DNA-binding function from protein structure. While such methods are useful, a major limitation is that they require an experimental structure of the target as input. To overcome this obstacle, we develop a threading-based method, DNA-Binding-Domain-Threader (DBD-Threader), for the prediction of DNA-binding domains and associated DNA-binding protein residues. Our method, which uses a template library composed of DNA-protein complex structures, requires only the target protein's sequence. In our approach, fold similarity and DNA-binding propensity are employed as two functional discriminating properties. In benchmark tests on 179 DNA-binding and 3,797 non-DNA-binding proteins, using templates whose sequence identity is less than 30% to the target, DBD-Threader achieves a sensitivity/precision of 56%/86%. This performance is considerably better than the standard sequence comparison method PSI-BLAST and is comparable to DBD-Hunter, which requires an experimental structure as input. Moreover, for over 70% of predicted DNA-binding domains, the backbone Root Mean Square Deviations (RMSDs) of the top-ranked structural models are within 6.5 Å of their experimental structures, with their associated DNA-binding sites identified at satisfactory accuracy. Additionally, DBD-Threader correctly assigned the SCOP superfamily for most predicted domains. To demonstrate that DBD-Threader is useful for automatic function annotation on a large scale, DBD-Threader was applied to 18,631 protein sequences from the human genome; 1,654 proteins are predicted to have DNA-binding function. Comparison with existing Gene Ontology (GO) annotations suggests that ∼30% of our predictions are new. Finally, we present some interesting predictions in detail. In particular, it is estimated that ∼20% of classic zinc finger domains play a functional role not related to direct DNA-binding.
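The benchmark figures quoted in the description (sensitivity/precision of 56%/86%) follow the usual definitions: sensitivity = TP / (TP + FN) and precision = TP / (TP + FP). The short Python sketch below only illustrates that arithmetic. The counts TP, FN, and FP are hypothetical values chosen to be roughly consistent with the reported percentages for 179 DNA-binding proteins; they are not taken from the article. The human-genome totals (18,631 sequences, 1,654 predictions) are the ones stated in the description.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true DNA-binding proteins that were predicted as DNA-binding."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Fraction of DNA-binding predictions that are true DNA-binding proteins."""
    return tp / (tp + fp)


# Hypothetical confusion counts (an assumption for illustration only):
# 100 + 79 = 179 DNA-binding proteins in the benchmark set.
TP, FN, FP = 100, 79, 16

# Totals stated in the description of the human-genome application.
HUMAN_SEQUENCES = 18_631
PREDICTED_BINDERS = 1_654

if __name__ == "__main__":
    print(f"sensitivity: {sensitivity(TP, FN):.1%}")
    print(f"precision:   {precision(TP, FP):.1%}")
    print(f"predicted DNA-binding share: {PREDICTED_BINDERS / HUMAN_SEQUENCES:.1%}")
```

Under these assumed counts the two ratios round to the reported 56% and 86%, and the stated totals imply that roughly 9% of the screened human sequences were predicted to bind DNA.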
format Text
id pubmed-2770119
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2770119 2009-11-13 Gao, Mu; Skolnick, Jeffrey. PLoS Comput Biol. Research Article. Public Library of Science 2009-11-13 /pmc/articles/PMC2770119/ /pubmed/19911048 http://dx.doi.org/10.1371/journal.pcbi.1000567 Text en Gao, Skolnick. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Threading-Based Method for the Prediction of DNA-Binding Proteins with Application to the Human Genome
topic Research Article