The PD-(D/E)XK superfamily revisited: identification of new members among proteins involved in DNA metabolism and functional predictions for domains of (hitherto) unknown function
BACKGROUND: The PD-(D/E)XK nuclease superfamily, initially identified in type II restriction endonucleases and later in many enzymes involved in DNA recombination and repair, is one of the most challenging targets for protein sequence analysis and structure prediction. Typically, the sequence simila...
Main authors: | Kosinski, Jan; Feder, Marcin; Bujnicki, Janusz M |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2005 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1189080/ https://www.ncbi.nlm.nih.gov/pubmed/16011798 http://dx.doi.org/10.1186/1471-2105-6-172 |
_version_ | 1782124790329376768 |
---|---|
author | Kosinski, Jan; Feder, Marcin; Bujnicki, Janusz M |
author_facet | Kosinski, Jan; Feder, Marcin; Bujnicki, Janusz M |
author_sort | Kosinski, Jan |
collection | PubMed |
description | BACKGROUND: The PD-(D/E)XK nuclease superfamily, initially identified in type II restriction endonucleases and later in many enzymes involved in DNA recombination and repair, is one of the most challenging targets for protein sequence analysis and structure prediction. Typically, the sequence similarity between these proteins is so low that most of the relationships between known members of the PD-(D/E)XK superfamily were identified only after the corresponding structures were determined experimentally. Thus, it is tempting to speculate that among the uncharacterized protein families there are potential nucleases that remain to be discovered, but their identification requires more sensitive tools than traditional PSI-BLAST searches. RESULTS: The low degree of amino acid conservation hampers the identification of new members of the PD-(D/E)XK superfamily based solely on sequence comparisons to known members. Therefore, we used a recently developed method, HHsearch, for sensitive detection of remote similarities between protein families represented as profile Hidden Markov Models enhanced by secondary structure. To detect significant similarities, we compared known families of PD-(D/E)XK nucleases to a database comprising the COG and PFAM profiles corresponding to both functionally characterized and uncharacterized protein families. The initial candidates for new nucleases were subsequently verified by sequence-structure threading, comparative modeling, and identification of potential active site residues.
CONCLUSION: In this article, we report identification of the PD-(D/E)XK nuclease domain in numerous proteins implicated in interactions with DNA but with unknown structure and mechanism of action (such as putative recombinase RmuC, DNA competence factor CoiA, a DNA-binding protein SfsA, a large human protein predicted to be a DNA repair enzyme, predicted archaeal transcription regulators, and the head completion protein of phage T4) and in proteins for which no function was assigned to date (such as YhcG, various phage proteins, and novel candidates for restriction enzymes). Our results contribute to the reduction of "white spaces" on the sequence-structure-function map of the protein universe and will help to jump-start the experimental characterization of new nucleases, many of which may be of importance for the complete understanding of mechanisms that govern the evolution and stability of the genome. |
format | Text |
id | pubmed-1189080 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2005 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1189080 2005-08-24 The PD-(D/E)XK superfamily revisited: identification of new members among proteins involved in DNA metabolism and functional predictions for domains of (hitherto) unknown function Kosinski, Jan; Feder, Marcin; Bujnicki, Janusz M BMC Bioinformatics Research Article BACKGROUND: The PD-(D/E)XK nuclease superfamily, initially identified in type II restriction endonucleases and later in many enzymes involved in DNA recombination and repair, is one of the most challenging targets for protein sequence analysis and structure prediction. Typically, the sequence similarity between these proteins is so low that most of the relationships between known members of the PD-(D/E)XK superfamily were identified only after the corresponding structures were determined experimentally. Thus, it is tempting to speculate that among the uncharacterized protein families there are potential nucleases that remain to be discovered, but their identification requires more sensitive tools than traditional PSI-BLAST searches. RESULTS: The low degree of amino acid conservation hampers the identification of new members of the PD-(D/E)XK superfamily based solely on sequence comparisons to known members. Therefore, we used a recently developed method, HHsearch, for sensitive detection of remote similarities between protein families represented as profile Hidden Markov Models enhanced by secondary structure. To detect significant similarities, we compared known families of PD-(D/E)XK nucleases to a database comprising the COG and PFAM profiles corresponding to both functionally characterized and uncharacterized protein families. The initial candidates for new nucleases were subsequently verified by sequence-structure threading, comparative modeling, and identification of potential active site residues.
CONCLUSION: In this article, we report identification of the PD-(D/E)XK nuclease domain in numerous proteins implicated in interactions with DNA but with unknown structure and mechanism of action (such as putative recombinase RmuC, DNA competence factor CoiA, a DNA-binding protein SfsA, a large human protein predicted to be a DNA repair enzyme, predicted archaeal transcription regulators, and the head completion protein of phage T4) and in proteins for which no function was assigned to date (such as YhcG, various phage proteins, and novel candidates for restriction enzymes). Our results contribute to the reduction of "white spaces" on the sequence-structure-function map of the protein universe and will help to jump-start the experimental characterization of new nucleases, many of which may be of importance for the complete understanding of mechanisms that govern the evolution and stability of the genome. BioMed Central 2005-07-12 /pmc/articles/PMC1189080/ /pubmed/16011798 http://dx.doi.org/10.1186/1471-2105-6-172 Text en Copyright © 2005 Kosinski et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Kosinski, Jan Feder, Marcin Bujnicki, Janusz M The PD-(D/E)XK superfamily revisited: identification of new members among proteins involved in DNA metabolism and functional predictions for domains of (hitherto) unknown function |
title | The PD-(D/E)XK superfamily revisited: identification of new members among proteins involved in DNA metabolism and functional predictions for domains of (hitherto) unknown function |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1189080/ https://www.ncbi.nlm.nih.gov/pubmed/16011798 http://dx.doi.org/10.1186/1471-2105-6-172 |