Better models by discarding data?
In macromolecular X-ray crystallography, typical data sets have substantial multiplicity. This can be used to calculate the consistency of repeated measurements and thereby assess data quality. Recently, the properties of a correlation coefficient, CC(1/2), that can be used for this purpose were cha...
Main authors: | Diederichs, K., Karplus, P. A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | International Union of Crystallography, 2013 |
Subjects: | Research Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3689524/ https://www.ncbi.nlm.nih.gov/pubmed/23793147 http://dx.doi.org/10.1107/S0907444913001121 |
_version_ | 1782274260071350272 |
---|---|
author | Diederichs, K. Karplus, P. A. |
author_facet | Diederichs, K. Karplus, P. A. |
author_sort | Diederichs, K. |
collection | PubMed |
description | In macromolecular X-ray crystallography, typical data sets have substantial multiplicity. This can be used to calculate the consistency of repeated measurements and thereby assess data quality. Recently, the properties of a correlation coefficient, CC(1/2), that can be used for this purpose were characterized and it was shown that CC(1/2) has superior properties compared with ‘merging’ R values. A derived quantity, CC*, links data and model quality. Using experimental data sets, the behaviour of CC(1/2) and the more conventional indicators were compared in two situations of practical importance: merging data sets from different crystals and selectively rejecting weak observations or (merged) unique reflections from a data set. In these situations controlled ‘paired-refinement’ tests show that even though discarding the weaker data leads to improvements in the merging R values, the refined models based on these data are of lower quality. These results show the folly of such data-filtering practices aimed at improving the merging R values. Interestingly, in all of these tests CC(1/2) is the one data-quality indicator for which the behaviour accurately reflects which of the alternative data-handling strategies results in the best-quality refined model. Its properties in the presence of systematic error are documented and discussed. |
format | Online Article Text |
id | pubmed-3689524 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | International Union of Crystallography |
record_format | MEDLINE/PubMed |
spelling | pubmed-36895242013-06-28 Better models by discarding data? Diederichs, K. Karplus, P. A. Acta Crystallogr D Biol Crystallogr Research Papers In macromolecular X-ray crystallography, typical data sets have substantial multiplicity. This can be used to calculate the consistency of repeated measurements and thereby assess data quality. Recently, the properties of a correlation coefficient, CC(1/2), that can be used for this purpose were characterized and it was shown that CC(1/2) has superior properties compared with ‘merging’ R values. A derived quantity, CC*, links data and model quality. Using experimental data sets, the behaviour of CC(1/2) and the more conventional indicators were compared in two situations of practical importance: merging data sets from different crystals and selectively rejecting weak observations or (merged) unique reflections from a data set. In these situations controlled ‘paired-refinement’ tests show that even though discarding the weaker data leads to improvements in the merging R values, the refined models based on these data are of lower quality. These results show the folly of such data-filtering practices aimed at improving the merging R values. Interestingly, in all of these tests CC(1/2) is the one data-quality indicator for which the behaviour accurately reflects which of the alternative data-handling strategies results in the best-quality refined model. Its properties in the presence of systematic error are documented and discussed. International Union of Crystallography 2013-07-01 2013-06-15 /pmc/articles/PMC3689524/ /pubmed/23793147 http://dx.doi.org/10.1107/S0907444913001121 Text en © Diederichs & Karplus 2013 http://creativecommons.org/licenses/by/2.0/uk/ This is an open-access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited. |
spellingShingle | Research Papers Diederichs, K. Karplus, P. A. Better models by discarding data? |
title | Better models by discarding data? |
title_full | Better models by discarding data? |
title_fullStr | Better models by discarding data? |
title_full_unstemmed | Better models by discarding data? |
title_short | Better models by discarding data? |
title_sort | better models by discarding data? |
topic | Research Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3689524/ https://www.ncbi.nlm.nih.gov/pubmed/23793147 http://dx.doi.org/10.1107/S0907444913001121 |
work_keys_str_mv | AT diederichsk bettermodelsbydiscardingdata AT karpluspa bettermodelsbydiscardingdata |
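
The description above names two indicators, CC(1/2) and CC*, without defining them. As a rough illustration only, the sketch below assumes the usual definitions from the related literature, which are not stated in this record: CC(1/2) is the Pearson correlation between mean intensities of two randomly assigned half-data-sets, and CC* = sqrt(2 CC(1/2) / (1 + CC(1/2))). The function names (`pearson`, `cc_half`, `cc_star`) and the dictionary layout of the observations are hypothetical and are not taken from any crystallographic package.

```python
import math
import random
from statistics import mean


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cc_half(observations, rng):
    """CC(1/2) under the assumed definition.

    observations: dict mapping a unique reflection to its list of repeated
    intensity measurements (hypothetical layout). Reflections measured only
    once cannot be split and are skipped.
    """
    half_a, half_b = [], []
    for measurements in observations.values():
        if len(measurements) < 2:
            continue
        shuffled = list(measurements)
        rng.shuffle(shuffled)          # random assignment to the two halves
        mid = len(shuffled) // 2
        half_a.append(mean(shuffled[:mid]))
        half_b.append(mean(shuffled[mid:]))
    return pearson(half_a, half_b)


def cc_star(cc12):
    """CC* from CC(1/2), assuming CC* = sqrt(2*CC12 / (1 + CC12))."""
    if cc12 < 0:
        raise ValueError("CC* is undefined for negative CC(1/2)")
    return math.sqrt(2 * cc12 / (1 + cc12))


if __name__ == "__main__":
    # Synthetic data: 200 reflections, 4 noisy measurements each.
    rng = random.Random(0)
    data = {}
    for h in range(200):
        true_intensity = rng.expovariate(1.0)
        data[h] = [true_intensity + rng.gauss(0.0, 0.5) for _ in range(4)]
    cc12 = cc_half(data, rng)
    print("CC(1/2):", round(cc12, 3))
    if cc12 >= 0:
        print("CC*:", round(cc_star(cc12), 3))
```

Because the split into halves is random, repeated calls with different seeds give slightly different CC(1/2) values. Real processing software typically reports the statistic per resolution shell, which this sketch does not attempt.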