
DoubleHelix: nucleic acid sequence identification, assignment and validation tool for cryo-EM and crystal structure models

Sequence assignment is a key step of the model building process in both cryogenic electron microscopy (cryo-EM) and macromolecular crystallography (MX). If the assignment fails, it can result in difficult-to-identify errors affecting the interpretation of a model. There are many model validation str...


Bibliographic Details
Main author: Chojnowski, Grzegorz
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: Structural Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10450167/
https://www.ncbi.nlm.nih.gov/pubmed/37395405
http://dx.doi.org/10.1093/nar/gkad553
_version_ 1785095138114011136
author Chojnowski, Grzegorz
collection PubMed
description Sequence assignment is a key step of the model building process in both cryogenic electron microscopy (cryo-EM) and macromolecular crystallography (MX). If the assignment fails, it can result in difficult-to-identify errors affecting the interpretation of a model. There are many model validation strategies that help experimentalists in this step of protein model building, but they are virtually non-existent for nucleic acids. Here, I present doubleHelix, a comprehensive method for assignment, identification, and validation of nucleic acid sequences in structures determined using cryo-EM and MX. The method combines a neural network classifier of nucleobase identities and a sequence-independent secondary structure assignment approach. I show that the presented method can successfully assist the sequence-assignment step in nucleic-acid model building at lower resolutions, where visual map interpretation is very difficult. Moreover, I present examples of sequence assignment errors detected using doubleHelix in cryo-EM and MX structures of ribosomes deposited in the Protein Data Bank, which escaped the scrutiny of available model-validation approaches. The doubleHelix program source code is available under the BSD-3 license at https://gitlab.com/gchojnowski/doublehelix.
format Online
Article
Text
id pubmed-10450167
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10450167 2023-08-26 DoubleHelix: nucleic acid sequence identification, assignment and validation tool for cryo-EM and crystal structure models Chojnowski, Grzegorz Nucleic Acids Res Structural Biology Oxford University Press 2023-07-03 /pmc/articles/PMC10450167/ /pubmed/37395405 http://dx.doi.org/10.1093/nar/gkad553 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Structural Biology
Chojnowski, Grzegorz
DoubleHelix: nucleic acid sequence identification, assignment and validation tool for cryo-EM and crystal structure models
title DoubleHelix: nucleic acid sequence identification, assignment and validation tool for cryo-EM and crystal structure models
topic Structural Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10450167/
https://www.ncbi.nlm.nih.gov/pubmed/37395405
http://dx.doi.org/10.1093/nar/gkad553