Better models by discarding data?

In macromolecular X-ray crystallography, typical data sets have substantial multiplicity. This can be used to calculate the consistency of repeated measurements and thereby assess data quality. Recently, the properties of a correlation coefficient, CC(1/2), that can be used for this purpose were cha...

Bibliographic Details
Main authors: Diederichs, K., Karplus, P. A.
Format: Online Article Text
Language: English
Published: International Union of Crystallography 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3689524/
https://www.ncbi.nlm.nih.gov/pubmed/23793147
http://dx.doi.org/10.1107/S0907444913001121
_version_ 1782274260071350272
author Diederichs, K.
Karplus, P. A.
author_facet Diederichs, K.
Karplus, P. A.
author_sort Diederichs, K.
collection PubMed
description In macromolecular X-ray crystallography, typical data sets have substantial multiplicity. This can be used to calculate the consistency of repeated measurements and thereby assess data quality. Recently, the properties of a correlation coefficient, CC(1/2), that can be used for this purpose were characterized and it was shown that CC(1/2) has superior properties compared with ‘merging’ R values. A derived quantity, CC*, links data and model quality. Using experimental data sets, the behaviour of CC(1/2) and the more conventional indicators were compared in two situations of practical importance: merging data sets from different crystals and selectively rejecting weak observations or (merged) unique reflections from a data set. In these situations controlled ‘paired-refinement’ tests show that even though discarding the weaker data leads to improvements in the merging R values, the refined models based on these data are of lower quality. These results show the folly of such data-filtering practices aimed at improving the merging R values. Interestingly, in all of these tests CC(1/2) is the one data-quality indicator for which the behaviour accurately reflects which of the alternative data-handling strategies results in the best-quality refined model. Its properties in the presence of systematic error are documented and discussed.
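The two quantities named in the abstract can be illustrated with a minimal sketch. It assumes the definitions published by the same authors in 2012, which this record does not spell out: CC(1/2) is taken as the Pearson correlation between the mean intensities of two random halves of the repeated measurements of each unique reflection, and CC* = sqrt(2·CC(1/2) / (1 + CC(1/2))). The function names, the dictionary input layout, and the `seed` parameter are hypothetical choices made for illustration; this is not the authors' implementation.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either sequence has zero variance.
    return sxy / math.sqrt(sxx * syy)


def cc_half(observations, seed=0):
    """Estimate CC(1/2) from repeated intensity measurements.

    `observations` (hypothetical layout) maps each unique reflection to a
    list of its repeated intensity measurements. Each list is split at
    random into two halves, and the half-set means are correlated.
    """
    rng = random.Random(seed)
    half_a, half_b = [], []
    for measurements in observations.values():
        if len(measurements) < 2:
            continue  # a single measurement cannot be split into halves
        shuffled = list(measurements)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        first, second = shuffled[:mid], shuffled[mid:]
        half_a.append(sum(first) / len(first))
        half_b.append(sum(second) / len(second))
    return pearson(half_a, half_b)


def cc_star(cc12):
    """CC* = sqrt(2*CC(1/2) / (1 + CC(1/2))); only defined here for cc12 >= 0."""
    return math.sqrt(2.0 * cc12 / (1.0 + cc12))
```

With noise-free repeated measurements the two half-set means are identical, so `cc_half` returns 1 (up to rounding) and `cc_star(1.0)` is 1. In real data CC(1/2) falls toward 0 as the signal weakens, and CC* follows it.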
format Online
Article
Text
id pubmed-3689524
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher International Union of Crystallography
record_format MEDLINE/PubMed
spelling pubmed-3689524 2013-06-28 Better models by discarding data? Diederichs, K. Karplus, P. A. Acta Crystallogr D Biol Crystallogr Research Papers International Union of Crystallography 2013-07-01 2013-06-15 /pmc/articles/PMC3689524/ /pubmed/23793147 http://dx.doi.org/10.1107/S0907444913001121 Text en © Diederichs & Karplus 2013 http://creativecommons.org/licenses/by/2.0/uk/ This is an open-access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
spellingShingle Research Papers
Diederichs, K.
Karplus, P. A.
Better models by discarding data?
title Better models by discarding data?
title_full Better models by discarding data?
title_fullStr Better models by discarding data?
title_full_unstemmed Better models by discarding data?
title_short Better models by discarding data?
title_sort better models by discarding data?
topic Research Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3689524/
https://www.ncbi.nlm.nih.gov/pubmed/23793147
http://dx.doi.org/10.1107/S0907444913001121
work_keys_str_mv AT diederichsk bettermodelsbydiscardingdata
AT karpluspa bettermodelsbydiscardingdata