
Sequence variation in ligand binding sites in proteins

BACKGROUND: The recent explosion in the availability of complete genome sequences has led to the cataloging of tens of thousands of new proteins and putative proteins. Many of these proteins can be structurally or functionally categorized from sequence conservation alone. In contrast, little attenti...


Bibliographic Details
Main authors: Magliery, Thomas J, Regan, Lynne
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1261162/
https://www.ncbi.nlm.nih.gov/pubmed/16194281
http://dx.doi.org/10.1186/1471-2105-6-240
_version_ 1782125867010359296
author Magliery, Thomas J
Regan, Lynne
author_facet Magliery, Thomas J
Regan, Lynne
author_sort Magliery, Thomas J
collection PubMed
description BACKGROUND: The recent explosion in the availability of complete genome sequences has led to the cataloging of tens of thousands of new proteins and putative proteins. Many of these proteins can be structurally or functionally categorized from sequence conservation alone. In contrast, little attention has been given to the meaning of poorly-conserved sites in families of proteins, which are typically assumed to be of little structural or functional importance. RESULTS: Recently, using statistical free energy analysis of tetratricopeptide repeat (TPR) domains, we observed that positions in contact with peptide ligands are more variable than surface positions in general. Here we show that statistical analysis of TPRs, ankyrin repeats, Cys(2)His(2) zinc fingers and PDZ domains accurately identifies specificity-determining positions by their sequence variation. Sequence variation is measured as deviation from a neutral reference state, and we present probabilistic and information theory formalisms that improve upon recently suggested methods such as statistical free energies and sequence entropies. CONCLUSION: Sequence variation has been used to identify functionally-important residues in four selected protein families. With TPRs and ankyrin repeats, protein families that bind highly diverse ligands, the effect is so pronounced that sequence "hypervariation" alone can be used to predict ligand binding sites.
format Text
id pubmed-1261162
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1261162 2005-10-27 Sequence variation in ligand binding sites in proteins Magliery, Thomas J Regan, Lynne BMC Bioinformatics Research Article BACKGROUND: The recent explosion in the availability of complete genome sequences has led to the cataloging of tens of thousands of new proteins and putative proteins. Many of these proteins can be structurally or functionally categorized from sequence conservation alone. In contrast, little attention has been given to the meaning of poorly-conserved sites in families of proteins, which are typically assumed to be of little structural or functional importance. RESULTS: Recently, using statistical free energy analysis of tetratricopeptide repeat (TPR) domains, we observed that positions in contact with peptide ligands are more variable than surface positions in general. Here we show that statistical analysis of TPRs, ankyrin repeats, Cys(2)His(2) zinc fingers and PDZ domains accurately identifies specificity-determining positions by their sequence variation. Sequence variation is measured as deviation from a neutral reference state, and we present probabilistic and information theory formalisms that improve upon recently suggested methods such as statistical free energies and sequence entropies. CONCLUSION: Sequence variation has been used to identify functionally-important residues in four selected protein families. With TPRs and ankyrin repeats, protein families that bind highly diverse ligands, the effect is so pronounced that sequence "hypervariation" alone can be used to predict ligand binding sites. BioMed Central 2005-09-30 /pmc/articles/PMC1261162/ /pubmed/16194281 http://dx.doi.org/10.1186/1471-2105-6-240 Text en Copyright © 2005 Magliery and Regan; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Magliery, Thomas J
Regan, Lynne
Sequence variation in ligand binding sites in proteins
title Sequence variation in ligand binding sites in proteins
topic Research Article