
Optimal simultaneous superpositioning of multiple structures with missing data

Motivation: Superpositioning is an essential technique in structural biology that facilitates the comparison and analysis of conformational differences among topologically similar structures. Performing a superposition requires a one-to-one correspondence, or alignment, of the point sets in the diff...


Bibliographic Details
Main Authors: Theobald, Douglas L., Steindel, Phillip A.
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3400950/
https://www.ncbi.nlm.nih.gov/pubmed/22543369
http://dx.doi.org/10.1093/bioinformatics/bts243
author Theobald, Douglas L.
Steindel, Phillip A.
collection PubMed
description Motivation: Superpositioning is an essential technique in structural biology that facilitates the comparison and analysis of conformational differences among topologically similar structures. Performing a superposition requires a one-to-one correspondence, or alignment, of the point sets in the different structures. However, in practice, some points are usually ‘missing’ from several structures, for example, when the alignment contains gaps. Current superposition methods deal with missing data simply by superpositioning a subset of points that are shared among all the structures. This practice is inefficient, as it ignores important data, and it fails to satisfy the common least-squares criterion. In the extreme, disregarding missing positions prohibits the calculation of a superposition altogether. Results: Here, we present a general solution for determining an optimal superposition when some of the data are missing. We use the expectation–maximization algorithm, a classic statistical technique for dealing with incomplete data, to find both maximum-likelihood solutions and the optimal least-squares solution as a special case. Availability and implementation: The methods presented here are implemented in THESEUS 2.0, a program for superpositioning macromolecular structures. ANSI C source code and selected compiled binaries for various computing platforms are freely available under the GNU open source license from http://www.theseus3d.org. Contact: dtheobald@brandeis.edu Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-3400950
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3400950 2012-07-20 Optimal simultaneous superpositioning of multiple structures with missing data Theobald, Douglas L. Steindel, Phillip A. Bioinformatics Original Papers Oxford University Press 2012-08-01 2012-04-27 /pmc/articles/PMC3400950/ /pubmed/22543369 http://dx.doi.org/10.1093/bioinformatics/bts243 Text en © The Author(s) 2012. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Optimal simultaneous superpositioning of multiple structures with missing data
topic Original Papers