
Selecting XFEL single-particle snapshots by geometric machine learning

A promising new route for structural biology is single-particle imaging with an X-ray Free-Electron Laser (XFEL). This method has the advantage that the samples do not require crystallization and can be examined at room temperature. However, high-resolution structures can only be obtained from a suf...


Bibliographic Details
Main authors: Cruz-Chú, Eduardo R., Hosseinizadeh, Ahmad, Mashayekhi, Ghoncheh, Fung, Russell, Ourmazd, Abbas, Schwander, Peter
Format: Online Article Text
Language: English
Published: American Crystallographic Association 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7902084/
https://www.ncbi.nlm.nih.gov/pubmed/33644252
http://dx.doi.org/10.1063/4.0000060
author Cruz-Chú, Eduardo R.
Hosseinizadeh, Ahmad
Mashayekhi, Ghoncheh
Fung, Russell
Ourmazd, Abbas
Schwander, Peter
collection PubMed
description A promising new route for structural biology is single-particle imaging with an X-ray Free-Electron Laser (XFEL). This method has the advantage that the samples do not require crystallization and can be examined at room temperature. However, high-resolution structures can only be obtained from a sufficiently large number of diffraction patterns of individual molecules, so-called single particles. Here, we present a method that allows for efficient identification of single particles in very large XFEL datasets, operates at low signal levels, and is tolerant to background. This method uses supervised Geometric Machine Learning (GML) to extract low-dimensional feature vectors from a training dataset, fuse test datasets into the feature space of training datasets, and separate the data into binary distributions of “single particles” and “non-single particles.” As a proof of principle, we tested simulated and experimental datasets of the Coliphage PR772 virus. We created a training dataset and classified three types of test datasets: First, a noise-free simulated test dataset, which gave near perfect separation. Second, simulated test datasets that were modified to reflect different levels of photon counts and background noise. These modified datasets were used to quantify the predictive limits of our approach. Third, an experimental dataset collected at the Stanford Linear Accelerator Center. The single-particle identification for this experimental dataset was compared with previously published results and it was found that GML covers a wide photon-count range, outperforming other single-particle identification methods. Moreover, a major advantage of GML is its ability to retrieve single particles in the presence of structural variability.
format Online
Article
Text
id pubmed-7902084
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Crystallographic Association
record_format MEDLINE/PubMed
spelling pubmed-7902084 2021-02-25 Selecting XFEL single-particle snapshots by geometric machine learning Cruz-Chú, Eduardo R. Hosseinizadeh, Ahmad Mashayekhi, Ghoncheh Fung, Russell Ourmazd, Abbas Schwander, Peter Struct Dyn ARTICLES
American Crystallographic Association 2021-02-18 /pmc/articles/PMC7902084/ /pubmed/33644252 http://dx.doi.org/10.1063/4.0000060 Text en © 2021 Author(s). 2329-7778/2021/8(1)/014701/12 All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Selecting XFEL single-particle snapshots by geometric machine learning
topic ARTICLES
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7902084/
https://www.ncbi.nlm.nih.gov/pubmed/33644252
http://dx.doi.org/10.1063/4.0000060
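
The description field summarizes a three-step pipeline: extract low-dimensional feature vectors from a training dataset, fuse test data into that feature space, and split the result into "single particles" and "non-single particles." The record does not give the algorithm, so the following is only a minimal sketch under stated assumptions: it substitutes a generic diffusion-map embedding, a Nystrom out-of-sample extension, and a nearest-centroid rule for the authors' GML method, and uses toy Gaussian blobs in place of diffraction patterns. All function names, the kernel width `eps`, and the data are hypothetical.

```python
import numpy as np


def sq_dists(a, b):
    """Squared Euclidean distances between the rows of a and the rows of b."""
    d = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.maximum(d, 0.0)  # clip tiny negatives caused by round-off


def fit_embedding(train, eps, n_dims=1):
    """Diffusion-map style embedding of the training snapshots (assumed stand-in)."""
    kernel = np.exp(-sq_dists(train, train) / eps)
    degree = kernel.sum(axis=1)
    # The symmetric normalization shares eigenvalues with the Markov matrix D^-1 K.
    sym = kernel / np.sqrt(np.outer(degree, degree))
    vals, vecs = np.linalg.eigh(sym)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(degree)[:, None]  # right eigenvectors of D^-1 K
    # Drop the trivial constant eigenvector (eigenvalue 1).
    return {"train": train, "eps": eps,
            "vals": vals[1:n_dims + 1], "coords": psi[:, 1:n_dims + 1]}


def embed_new(model, points):
    """Nystrom extension: place unseen snapshots in the training feature space."""
    kernel = np.exp(-sq_dists(points, model["train"]) / model["eps"])
    markov = kernel / kernel.sum(axis=1, keepdims=True)
    return markov @ model["coords"] / model["vals"]


def classify(model, labels, points):
    """Nearest-centroid split into non-single (0) and single (1) in feature space."""
    centroids = np.stack([model["coords"][labels == c].mean(axis=0) for c in (0, 1)])
    emb = embed_new(model, points)
    return sq_dists(emb, centroids).argmin(axis=1)


def run_demo(seed=0):
    """Toy data only: two Gaussian blobs stand in for the two snapshot classes."""
    rng = np.random.default_rng(seed)

    def blobs(n):
        x = np.vstack([rng.normal(0.0, 0.5, (n, 5)), rng.normal(1.5, 0.5, (n, 5))])
        y = np.repeat([0, 1], n)
        return x, y

    train, train_y = blobs(30)
    test, test_y = blobs(20)
    model = fit_embedding(train, eps=5.0)
    accuracy = (classify(model, train_y, test) == test_y).mean()
    return model, accuracy
```

Calling `run_demo()` returns the fitted model and the fraction of toy test points assigned to the correct class. Only the leading nontrivial coordinate is kept by default because the Nystrom step divides by the eigenvalue, which makes it unstable for small eigenvalues.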