Predicting the performance of automated crystallographic model-building pipelines
Proteins are macromolecules that perform essential biological functions which depend on their three-dimensional structure. Determining this structure involves complex laboratory and computational work. For the computational work, multiple software pipelines have been developed to build models of the...
Main authors: | Alharbi, Emad; Bond, Paul; Calinescu, Radu; Cowtan, Kevin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | International Union of Crystallography, 2021 |
Subjects: | Research Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8647178/ https://www.ncbi.nlm.nih.gov/pubmed/34866614 http://dx.doi.org/10.1107/S2059798321010500 |
_version_ | 1784610562632581120 |
---|---|
author | Alharbi, Emad; Bond, Paul; Calinescu, Radu; Cowtan, Kevin |
author_facet | Alharbi, Emad; Bond, Paul; Calinescu, Radu; Cowtan, Kevin |
author_sort | Alharbi, Emad |
collection | PubMed |
description | Proteins are macromolecules that perform essential biological functions which depend on their three-dimensional structure. Determining this structure involves complex laboratory and computational work. For the computational work, multiple software pipelines have been developed to build models of the protein structure from crystallographic data. Each of these pipelines performs differently depending on the characteristics of the electron-density map received as input. Identifying the best pipeline to use for a protein structure is difficult, as the pipeline performance differs significantly from one protein structure to another. As such, researchers often select pipelines that do not produce the best possible protein models from the available data. Here, a software tool is introduced which predicts key quality measures of the protein structures that a range of pipelines would generate if supplied with a given crystallographic data set. These measures are crystallographic quality-of-fit indicators based on included and withheld observations, and structure completeness. Extensive experiments carried out using over 2500 data sets show that the tool yields accurate predictions for both experimental phasing data sets (at resolutions between 1.2 and 4.0 Å) and molecular-replacement data sets (at resolutions between 1.0 and 3.5 Å). The tool can therefore provide a recommendation to the user concerning the pipelines that should be run in order to proceed most efficiently to a depositable model. |
format | Online Article Text |
id | pubmed-8647178 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | International Union of Crystallography |
record_format | MEDLINE/PubMed |
spelling | pubmed-86471782021-12-16 Predicting the performance of automated crystallographic model-building pipelines Alharbi, Emad Bond, Paul Calinescu, Radu Cowtan, Kevin Acta Crystallogr D Struct Biol Research Papers Proteins are macromolecules that perform essential biological functions which depend on their three-dimensional structure. Determining this structure involves complex laboratory and computational work. For the computational work, multiple software pipelines have been developed to build models of the protein structure from crystallographic data. Each of these pipelines performs differently depending on the characteristics of the electron-density map received as input. Identifying the best pipeline to use for a protein structure is difficult, as the pipeline performance differs significantly from one protein structure to another. As such, researchers often select pipelines that do not produce the best possible protein models from the available data. Here, a software tool is introduced which predicts key quality measures of the protein structures that a range of pipelines would generate if supplied with a given crystallographic data set. These measures are crystallographic quality-of-fit indicators based on included and withheld observations, and structure completeness. Extensive experiments carried out using over 2500 data sets show that the tool yields accurate predictions for both experimental phasing data sets (at resolutions between 1.2 and 4.0 Å) and molecular-replacement data sets (at resolutions between 1.0 and 3.5 Å). The tool can therefore provide a recommendation to the user concerning the pipelines that should be run in order to proceed most efficiently to a depositable model. International Union of Crystallography 2021-11-29 /pmc/articles/PMC8647178/ /pubmed/34866614 http://dx.doi.org/10.1107/S2059798321010500 Text en © Emad Alharbi et al. 
2021 https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited. |
title | Predicting the performance of automated crystallographic model-building pipelines |
topic | Research Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8647178/ https://www.ncbi.nlm.nih.gov/pubmed/34866614 http://dx.doi.org/10.1107/S2059798321010500 |