Optimal simultaneous superpositioning of multiple structures with missing data
Motivation: Superpositioning is an essential technique in structural biology that facilitates the comparison and analysis of conformational differences among topologically similar structures. Performing a superposition requires a one-to-one correspondence, or alignment, of the point sets in the diff...
Main Authors: | Theobald, Douglas L.; Steindel, Phillip A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2012 |
Subjects: | Original Papers |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3400950/ https://www.ncbi.nlm.nih.gov/pubmed/22543369 http://dx.doi.org/10.1093/bioinformatics/bts243 |
_version_ | 1782238551233003520 |
---|---|
author | Theobald, Douglas L.; Steindel, Phillip A. |
author_sort | Theobald, Douglas L. |
collection | PubMed |
description | Motivation: Superpositioning is an essential technique in structural biology that facilitates the comparison and analysis of conformational differences among topologically similar structures. Performing a superposition requires a one-to-one correspondence, or alignment, of the point sets in the different structures. However, in practice, some points are usually ‘missing’ from several structures, for example, when the alignment contains gaps. Current superposition methods deal with missing data simply by superpositioning a subset of points that are shared among all the structures. This practice is inefficient, as it ignores important data, and it fails to satisfy the common least-squares criterion. In the extreme, disregarding missing positions prohibits the calculation of a superposition altogether. Results: Here, we present a general solution for determining an optimal superposition when some of the data are missing. We use the expectation–maximization algorithm, a classic statistical technique for dealing with incomplete data, to find both maximum-likelihood solutions and the optimal least-squares solution as a special case. Availability and implementation: The methods presented here are implemented in THESEUS 2.0, a program for superpositioning macromolecular structures. ANSI C source code and selected compiled binaries for various computing platforms are freely available under the GNU open source license from http://www.theseus3d.org. Contact: dtheobald@brandeis.edu Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-3400950 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3400950 2012-07-20 Optimal simultaneous superpositioning of multiple structures with missing data; Theobald, Douglas L.; Steindel, Phillip A.; Bioinformatics; Original Papers; Oxford University Press; 2012-08-01; 2012-04-27; /pmc/articles/PMC3400950/ /pubmed/22543369 http://dx.doi.org/10.1093/bioinformatics/bts243; Text en; © The Author(s) 2012. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Theobald, Douglas L. Steindel, Phillip A. Optimal simultaneous superpositioning of multiple structures with missing data |
title | Optimal simultaneous superpositioning of multiple structures with missing data |
title_sort | optimal simultaneous superpositioning of multiple structures with missing data |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3400950/ https://www.ncbi.nlm.nih.gov/pubmed/22543369 http://dx.doi.org/10.1093/bioinformatics/bts243 |
work_keys_str_mv | AT theobalddouglasl optimalsimultaneoussuperpositioningofmultiplestructureswithmissingdata AT steindelphillipa optimalsimultaneoussuperpositioningofmultiplestructureswithmissingdata |