The predictive power of data-processing statistics
This study describes a method to estimate the likelihood of success in determining a macromolecular structure by X-ray crystallography and experimental single-wavelength anomalous dispersion (SAD) or multiple-wavelength anomalous dispersion (MAD) phasing based on initial data-processing statistics and sample crystal properties.
Main authors: | Vollmar, Melanie; Parkhurst, James M.; Jaques, Dominic; Baslé, Arnaud; Murshudov, Garib N.; Waterman, David G.; Evans, Gwyndaf |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | International Union of Crystallography 2020 |
Subjects: | Research Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7055369/ https://www.ncbi.nlm.nih.gov/pubmed/32148861 http://dx.doi.org/10.1107/S2052252520000895 |
_version_ | 1783503362754347008 |
---|---|
author | Vollmar, Melanie; Parkhurst, James M.; Jaques, Dominic; Baslé, Arnaud; Murshudov, Garib N.; Waterman, David G.; Evans, Gwyndaf |
author_facet | Vollmar, Melanie; Parkhurst, James M.; Jaques, Dominic; Baslé, Arnaud; Murshudov, Garib N.; Waterman, David G.; Evans, Gwyndaf |
author_sort | Vollmar, Melanie |
collection | PubMed |
description | This study describes a method to estimate the likelihood of success in determining a macromolecular structure by X-ray crystallography and experimental single-wavelength anomalous dispersion (SAD) or multiple-wavelength anomalous dispersion (MAD) phasing based on initial data-processing statistics and sample crystal properties. Such a predictive tool can rapidly assess the usefulness of data and guide the collection of an optimal data set. The increase in data rates from modern macromolecular crystallography beamlines, together with a demand from users for real-time feedback, has led to pressure on computational resources and a need for smarter data handling. Statistical and machine-learning methods have been applied to construct a classifier that displays 95% accuracy for training and testing data sets compiled from 440 solved structures. Applying this classifier to new data achieved 79% accuracy. These scores already provide clear guidance as to the effective use of computing resources and offer a starting point for a personalized data-collection assistant. |
format | Online Article Text |
id | pubmed-7055369 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | International Union of Crystallography |
record_format | MEDLINE/PubMed |
spelling | pubmed-7055369 2020-03-06 The predictive power of data-processing statistics Vollmar, Melanie Parkhurst, James M. Jaques, Dominic Baslé, Arnaud Murshudov, Garib N. Waterman, David G. Evans, Gwyndaf IUCrJ Research Papers This study describes a method to estimate the likelihood of success in determining a macromolecular structure by X-ray crystallography and experimental single-wavelength anomalous dispersion (SAD) or multiple-wavelength anomalous dispersion (MAD) phasing based on initial data-processing statistics and sample crystal properties. Such a predictive tool can rapidly assess the usefulness of data and guide the collection of an optimal data set. The increase in data rates from modern macromolecular crystallography beamlines, together with a demand from users for real-time feedback, has led to pressure on computational resources and a need for smarter data handling. Statistical and machine-learning methods have been applied to construct a classifier that displays 95% accuracy for training and testing data sets compiled from 440 solved structures. Applying this classifier to new data achieved 79% accuracy. These scores already provide clear guidance as to the effective use of computing resources and offer a starting point for a personalized data-collection assistant. International Union of Crystallography 2020-02-27 /pmc/articles/PMC7055369/ /pubmed/32148861 http://dx.doi.org/10.1107/S2052252520000895 Text en © Melanie Vollmar et al. 2020 http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited. |
spellingShingle | Research Papers Vollmar, Melanie Parkhurst, James M. Jaques, Dominic Baslé, Arnaud Murshudov, Garib N. Waterman, David G. Evans, Gwyndaf The predictive power of data-processing statistics |
title | The predictive power of data-processing statistics |
title_sort | predictive power of data-processing statistics |
topic | Research Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7055369/ https://www.ncbi.nlm.nih.gov/pubmed/32148861 http://dx.doi.org/10.1107/S2052252520000895 |