
A Mathematical Framework for Protein Structure Comparison

Comparison of protein structures is important for revealing the evolutionary relationship among proteins, predicting protein functions and predicting protein structures. Many methods have been developed in the past to align two or multiple protein structures. Despite the importance of this problem,...


Bibliographic Details
Main authors: Liu, Wei, Srivastava, Anuj, Zhang, Jinfeng
Format: Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3033361/
https://www.ncbi.nlm.nih.gov/pubmed/21304929
http://dx.doi.org/10.1371/journal.pcbi.1001075
author Liu, Wei
Srivastava, Anuj
Zhang, Jinfeng
collection PubMed
description Comparison of protein structures is important for revealing the evolutionary relationship among proteins, predicting protein functions and predicting protein structures. Many methods have been developed in the past to align two or multiple protein structures. Despite the importance of this problem, rigorous mathematical or statistical frameworks have seldom been pursued for general protein structure comparison. One notable issue in this field is that with many different distances used to measure the similarity between protein structures, none of them are proper distances when protein structures of different sequences are compared. Statistical approaches based on those non-proper distances or similarity scores as random variables are thus not mathematically rigorous. In this work, we develop a mathematical framework for protein structure comparison by treating protein structures as three-dimensional curves. Using an elastic Riemannian metric on spaces of curves, geodesic distance, a proper distance on spaces of curves, can be computed for any two protein structures. In this framework, protein structures can be treated as random variables on the shape manifold, and means and covariance can be computed for populations of protein structures. Furthermore, these moments can be used to build Gaussian-type probability distributions of protein structures for use in hypothesis testing. The covariance of a population of protein structures can reveal the population-specific variations and be helpful in improving structure classification. With curves representing protein structures, the matching is performed using elastic shape analysis of curves, which can effectively model conformational changes and insertions/deletions. We show that our method performs comparably with commonly used methods in protein structure classification on a large manually annotated data set.
format Text
id pubmed-3033361
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3033361 2011-02-08 A Mathematical Framework for Protein Structure Comparison Liu, Wei Srivastava, Anuj Zhang, Jinfeng PLoS Comput Biol Research Article Public Library of Science 2011-02-03 /pmc/articles/PMC3033361/ /pubmed/21304929 http://dx.doi.org/10.1371/journal.pcbi.1001075 Text en Liu et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Mathematical Framework for Protein Structure Comparison
title_sort mathematical framework for protein structure comparison
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3033361/
https://www.ncbi.nlm.nih.gov/pubmed/21304929
http://dx.doi.org/10.1371/journal.pcbi.1001075