Progressive alignment of crystals: reproducible and efficient assessment of crystal structure similarity

During in silico crystal structure prediction of organic molecules, millions of candidate structures are often generated. These candidates must be compared to remove duplicates prior to further analysis (e.g. optimization with electronic structure methods) and ultimately compared with structures det...

Bibliographic Details
Main authors: Nessler, Aaron J., Okada, Okimasa, Hermon, Mitchell J., Nagata, Hiroomi, Schnieders, Michael J.
Format: Online Article Text
Language: English
Published: International Union of Crystallography 2022
Subjects: Research Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9721330/
https://www.ncbi.nlm.nih.gov/pubmed/36570662
http://dx.doi.org/10.1107/S1600576722009670
_version_ 1784843750211584000
author Nessler, Aaron J.
Okada, Okimasa
Hermon, Mitchell J.
Nagata, Hiroomi
Schnieders, Michael J.
collection PubMed
description During in silico crystal structure prediction of organic molecules, millions of candidate structures are often generated. These candidates must be compared to remove duplicates prior to further analysis (e.g. optimization with electronic structure methods) and ultimately compared with structures determined experimentally. The agreement of predicted and experimental structures forms the basis of evaluating the results from the Cambridge Crystallographic Data Centre (CCDC) blind assessment of crystal structure prediction, which further motivates the pursuit of rigorous alignments. Evaluating crystal structure packings using coordinate root-mean-square deviation (RMSD) for N molecules (or N asymmetric units) in a reproducible manner requires metrics to describe the shape of the compared molecular clusters to account for alternative approaches used to prioritize selection of molecules. Described here is a flexible algorithm called Progressive Alignment of Crystals (PAC) to evaluate crystal packing similarity using coordinate RMSD and introducing the radius of gyration (R_g) as a metric to quantify the shape of the superimposed clusters. It is shown that the absence of metrics to describe cluster shape adds ambiguity to the results of the CCDC blind assessments because it is not possible to determine whether the superposition algorithm has prioritized tightly packed molecular clusters (i.e. to minimize R_g) or prioritized reduced RMSD (i.e. via possibly elongated clusters with relatively larger R_g). For example, it is shown that when the PAC algorithm described here uses single linkage to prioritize molecules for inclusion in the superimposed clusters, the results are nearly identical to those calculated by the widely used program COMPACK. However, the lower R_g values obtained by the use of average linkage are favored for molecule prioritization because the resulting RMSDs more equally reflect the importance of packing along each dimension. It is shown that the PAC algorithm is faster than COMPACK when using a single process and its utility for biomolecular crystals is demonstrated. Finally, parallel scaling up to 64 processes in the open-source code Force Field X is presented.
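The quantities named in the description can be illustrated with a small sketch. This is not the PAC or Force Field X implementation. It assumes an unweighted radius of gyration over molecule centers, an RMSD over point sets that are already superimposed and listed in matching order (the superposition step itself is omitted), and a simple greedy cluster growth in which the next molecule is the one with the smallest single or average linkage distance to the molecules already selected. All function names (`radius_of_gyration`, `rmsd`, `linkage`, `grow_cluster`) and the toy coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Arithmetic mean of a set of 3D points (no mass weighting, an assumption)."""
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def radius_of_gyration(points: Sequence[Point]) -> float:
    """Unweighted R_g: root-mean-square distance of the points from their centroid."""
    c = centroid(points)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in points) / len(points))


def rmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Coordinate RMSD between two equally ordered, already superimposed point sets."""
    if len(a) != len(b) or not a:
        raise ValueError("point sets must be non-empty and of equal length")
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def linkage(candidate: Point, cluster: Sequence[Point], mode: str) -> float:
    """Distance from a candidate to a cluster: nearest member or mean over members."""
    distances = [math.dist(candidate, member) for member in cluster]
    if mode == "single":
        return min(distances)
    if mode == "average":
        return sum(distances) / len(distances)
    raise ValueError(f"unknown linkage mode: {mode}")


def grow_cluster(centers: Sequence[Point], seed: int, n: int, mode: str) -> List[Point]:
    """Greedily add the candidate with the smallest linkage distance until n are chosen."""
    cluster = [centers[seed]]
    remaining = [c for i, c in enumerate(centers) if i != seed]
    while len(cluster) < n and remaining:
        best = min(remaining, key=lambda c: linkage(c, cluster, mode))
        cluster.append(best)
        remaining.remove(best)
    return cluster


# Toy molecule centers (invented): a tightly spaced row along x plus
# slightly more distant neighbors along y and z.
centers: List[Point] = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0),
    (0.0, 1.2, 0.0), (0.0, 0.0, 1.3), (0.0, -1.4, 0.0),
]

single_cluster = grow_cluster(centers, seed=0, n=4, mode="single")
average_cluster = grow_cluster(centers, seed=0, n=4, mode="average")
rg_single = radius_of_gyration(single_cluster)
rg_average = radius_of_gyration(average_cluster)
print("single linkage:", single_cluster, "R_g =", round(rg_single, 3))
print("average linkage:", average_cluster, "R_g =", round(rg_average, 3))
```

In this toy data, single linkage follows the closely spaced row and yields an elongated cluster, while average linkage selects neighbors in all three directions and gives a smaller R_g. That mirrors the contrast drawn in the description, but it is only an illustration on invented coordinates.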
format Online
Article
Text
id pubmed-9721330
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher International Union of Crystallography
record_format MEDLINE/PubMed
spelling pubmed-9721330 2022-12-22 Progressive alignment of crystals: reproducible and efficient assessment of crystal structure similarity Nessler, Aaron J. Okada, Okimasa Hermon, Mitchell J. Nagata, Hiroomi Schnieders, Michael J. J Appl Crystallogr Research Papers International Union of Crystallography 2022-11-21 /pmc/articles/PMC9721330/ /pubmed/36570662 http://dx.doi.org/10.1107/S1600576722009670 Text en © Aaron J. Nessler et al. 2022 https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
title Progressive alignment of crystals: reproducible and efficient assessment of crystal structure similarity
topic Research Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9721330/
https://www.ncbi.nlm.nih.gov/pubmed/36570662
http://dx.doi.org/10.1107/S1600576722009670