
Bayesian refinement of protein functional site matching



Bibliographic Details
Main authors: Mardia, Kanti V; Nyirongo, Vysaul B; Green, Peter J; Gold, Nicola D; Westhead, David R
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1940029/
https://www.ncbi.nlm.nih.gov/pubmed/17640336
http://dx.doi.org/10.1186/1471-2105-8-257
author Mardia, Kanti V
Nyirongo, Vysaul B
Green, Peter J
Gold, Nicola D
Westhead, David R
collection PubMed
description BACKGROUND: Matching functional sites is a key problem for the understanding of protein function and evolution. The commonly used graph-theoretic approach, and other related approaches, require a matching distance threshold to be set a priori according to the noise in atomic positions. This threshold is difficult to pre-determine when matching sites related by varying evolutionary distances and crystallographic precision. Furthermore, the graph method is sometimes unable to identify alternative but important solutions in the neighbourhood of the distance-based solution because of its strict distance constraints. We consider the Bayesian approach to improve graph-based solutions. In principle this approach applies to other methods with strict distance matching constraints. In contrast to combinatorial formulations, the Bayesian method can flexibly incorporate all types of prior information on specific binding sites (e.g. amino acid types). RESULTS: We present a new meta-algorithm for matching protein functional sites (active sites and ligand binding sites) based on an initial graph matching followed by refinement using a Markov chain Monte Carlo (MCMC) procedure. This procedure is an innovative extension to our recent work. The method accounts for the 3-dimensional structure of the site as well as the physico-chemical properties of the constituent amino acids. The MCMC procedure can lead to a significant increase in the number of significant matches compared to the graph method, as measured independently by rigorously derived p-values. CONCLUSION: The MCMC refinement step is able to significantly improve graph-based matches. We apply the method to matching NAD(P)(H) binding sites within single Rossmann fold families, between different families in the same superfamily, and in different folds. Within families, sites are often well conserved, but there are examples where significant shape-based matches do not retain similar amino acid chemistry, indicating that even within families the same ligand may be bound using substantially different physico-chemistry. We also show that the procedure finds significant matches between binding sites for the same co-factor in different families and different folds.
format Text
id pubmed-1940029
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1940029 2007-08-07 Bayesian refinement of protein functional site matching Mardia, Kanti V Nyirongo, Vysaul B Green, Peter J Gold, Nicola D Westhead, David R BMC Bioinformatics Research Article BioMed Central 2007-07-17 /pmc/articles/PMC1940029/ /pubmed/17640336 http://dx.doi.org/10.1186/1471-2105-8-257 Text en Copyright © 2007 Mardia et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Bayesian refinement of protein functional site matching
topic Research Article