Predicting the performance of automated crystallographic model-building pipelines

Bibliographic Details
Main authors: Alharbi, Emad; Bond, Paul; Calinescu, Radu; Cowtan, Kevin
Format: Online Article Text
Language: English
Published: International Union of Crystallography 2021
Subjects: Research Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8647178/
https://www.ncbi.nlm.nih.gov/pubmed/34866614
http://dx.doi.org/10.1107/S2059798321010500
author Alharbi, Emad
Bond, Paul
Calinescu, Radu
Cowtan, Kevin
collection PubMed
description Proteins are macromolecules that perform essential biological functions which depend on their three-dimensional structure. Determining this structure involves complex laboratory and computational work. For the computational work, multiple software pipelines have been developed to build models of the protein structure from crystallographic data. Each of these pipelines performs differently depending on the characteristics of the electron-density map received as input. Identifying the best pipeline to use for a protein structure is difficult, as the pipeline performance differs significantly from one protein structure to another. As such, researchers often select pipelines that do not produce the best possible protein models from the available data. Here, a software tool is introduced which predicts key quality measures of the protein structures that a range of pipelines would generate if supplied with a given crystallographic data set. These measures are crystallographic quality-of-fit indicators based on included and withheld observations, and structure completeness. Extensive experiments carried out using over 2500 data sets show that the tool yields accurate predictions for both experimental phasing data sets (at resolutions between 1.2 and 4.0 Å) and molecular-replacement data sets (at resolutions between 1.0 and 3.5 Å). The tool can therefore provide a recommendation to the user concerning the pipelines that should be run in order to proceed most efficiently to a depositable model.
format Online
Article
Text
id pubmed-8647178
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher International Union of Crystallography
record_format MEDLINE/PubMed
spelling pubmed-8647178 2021-12-16
Acta Crystallogr D Struct Biol
Research Papers
International Union of Crystallography 2021-11-29
Text en
© Emad Alharbi et al. 2021
https://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
title Predicting the performance of automated crystallographic model-building pipelines
topic Research Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8647178/
https://www.ncbi.nlm.nih.gov/pubmed/34866614
http://dx.doi.org/10.1107/S2059798321010500