
Piecewise linear approximation of protein structures using the principle of minimum message length

Simple and concise representations of protein-folding patterns provide powerful abstractions for visualizations, comparisons, classifications, searching and aligning structural data. Structures are often abstracted by replacing standard secondary structural features—that is, helices and strands of s...


Bibliographic Details
Main authors:	Konagurthu, Arun S., Allison, Lloyd, Stuckey, Peter J., Lesk, Arthur M.
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2011
Subjects:	ISMB/ECCB 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117365/ https://www.ncbi.nlm.nih.gov/pubmed/21685100 http://dx.doi.org/10.1093/bioinformatics/btr240

author	Konagurthu, Arun S. Allison, Lloyd Stuckey, Peter J. Lesk, Arthur M.
collection	PubMed
description	Simple and concise representations of protein-folding patterns provide powerful abstractions for visualizations, comparisons, classifications, searching and aligning structural data. Structures are often abstracted by replacing standard secondary structural features (that is, helices and strands of sheet) by vectors or linear segments. Relying solely on standard secondary structure may result in a significant loss of structural information. Further, traditional methods of simplification crucially depend on the consistency and accuracy of external methods to assign secondary structures to protein coordinate data. Although many methods exist to identify secondary structure automatically, the imprecision of definitions, along with errors and inconsistencies in experimental structure data, drastically limits their applicability for generating reliable simplified representations, especially for structural comparison. This article introduces a mathematically rigorous algorithm to delineate protein structure using the elegant statistical and inductive inference framework of minimum message length (MML). Our method generates consistent and statistically robust piecewise linear explanations of protein coordinate data, resulting in a powerful and concise representation of the structure. The delineation is completely independent of the approaches of using hydrogen-bonding patterns or inspecting local substructural geometry that the current methods use. Indeed, as is common with applications of the MML criterion, this method is free of parameters and thresholds, in striking contrast to the existing programs, which are often beset by them. The analysis of results over a large number of proteins suggests that the method produces consistent delineation of structures that encompasses, among others, the segments corresponding to standard secondary structure. Availability: http://www.csse.monash.edu.au/~karun/pmml. Contact: arun.konagurthu@monash.edu; lloyd.allison@monash.edu
format	Online Article Text
id	pubmed-3117365
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-3117365 2011-06-17 Bioinformatics Oxford University Press 2011-07-01 2011-06-14 /pmc/articles/PMC3117365/ /pubmed/21685100 http://dx.doi.org/10.1093/bioinformatics/btr240 Text en © The Author(s) 2011. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Piecewise linear approximation of protein structures using the principle of minimum message length
