findMySequence: a neural-network-based approach for identification of unknown proteins in X-ray crystallography and cryo-EM

Bibliographic Details
Main authors: Chojnowski, Grzegorz; Simpkin, Adam J.; Leonardo, Diego A.; Seifert-Davila, Wolfram; Vivas-Ruiz, Dan E.; Keegan, Ronan M.; Rigden, Daniel J.
Format: Online Article Text
Language: English
Published: International Union of Crystallography 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8733886/
https://www.ncbi.nlm.nih.gov/pubmed/35059213
http://dx.doi.org/10.1107/S2052252521011088
Description
Summary: Although experimental protein-structure determination usually targets known proteins, chains of unknown sequence are often encountered. They can be purified from natural sources, appear as an unexpected fragment of a well characterized protein or appear as a contaminant. Regardless of the source of the problem, the unknown protein always requires characterization. Here, an automated pipeline is presented for the identification of protein sequences from cryo-EM reconstructions and crystallographic data. The method’s application to characterize the crystal structure of an unknown protein purified from a snake venom is presented. It is also shown that the approach can be successfully applied to the identification of protein sequences and validation of sequence assignments in cryo-EM protein structures.