Selecting XFEL single-particle snapshots by geometric machine learning
A promising new route for structural biology is single-particle imaging with an X-ray Free-Electron Laser (XFEL). This method has the advantage that the samples do not require crystallization and can be examined at room temperature. However, high-resolution structures can only be obtained from a suf...
Main authors: | Cruz-Chú, Eduardo R.; Hosseinizadeh, Ahmad; Mashayekhi, Ghoncheh; Fung, Russell; Ourmazd, Abbas; Schwander, Peter |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Crystallographic Association, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7902084/ https://www.ncbi.nlm.nih.gov/pubmed/33644252 http://dx.doi.org/10.1063/4.0000060 |
_version_ | 1783654491224014848 |
---|---|
author | Cruz-Chú, Eduardo R. Hosseinizadeh, Ahmad Mashayekhi, Ghoncheh Fung, Russell Ourmazd, Abbas Schwander, Peter |
author_facet | Cruz-Chú, Eduardo R. Hosseinizadeh, Ahmad Mashayekhi, Ghoncheh Fung, Russell Ourmazd, Abbas Schwander, Peter |
author_sort | Cruz-Chú, Eduardo R. |
collection | PubMed |
description | A promising new route for structural biology is single-particle imaging with an X-ray Free-Electron Laser (XFEL). This method has the advantage that the samples do not require crystallization and can be examined at room temperature. However, high-resolution structures can only be obtained from a sufficiently large number of diffraction patterns of individual molecules, so-called single particles. Here, we present a method that allows for efficient identification of single particles in very large XFEL datasets, operates at low signal levels, and is tolerant to background. This method uses supervised Geometric Machine Learning (GML) to extract low-dimensional feature vectors from a training dataset, fuse test datasets into the feature space of training datasets, and separate the data into binary distributions of “single particles” and “non-single particles.” As a proof of principle, we tested simulated and experimental datasets of the Coliphage PR772 virus. We created a training dataset and classified three types of test datasets: First, a noise-free simulated test dataset, which gave near perfect separation. Second, simulated test datasets that were modified to reflect different levels of photon counts and background noise. These modified datasets were used to quantify the predictive limits of our approach. Third, an experimental dataset collected at the Stanford Linear Accelerator Center. The single-particle identification for this experimental dataset was compared with previously published results and it was found that GML covers a wide photon-count range, outperforming other single-particle identification methods. Moreover, a major advantage of GML is its ability to retrieve single particles in the presence of structural variability. |
format | Online Article Text |
id | pubmed-7902084 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | American Crystallographic Association |
record_format | MEDLINE/PubMed |
spelling | pubmed-79020842021-02-25 Selecting XFEL single-particle snapshots by geometric machine learning Cruz-Chú, Eduardo R. Hosseinizadeh, Ahmad Mashayekhi, Ghoncheh Fung, Russell Ourmazd, Abbas Schwander, Peter Struct Dyn ARTICLES A promising new route for structural biology is single-particle imaging with an X-ray Free-Electron Laser (XFEL). This method has the advantage that the samples do not require crystallization and can be examined at room temperature. However, high-resolution structures can only be obtained from a sufficiently large number of diffraction patterns of individual molecules, so-called single particles. Here, we present a method that allows for efficient identification of single particles in very large XFEL datasets, operates at low signal levels, and is tolerant to background. This method uses supervised Geometric Machine Learning (GML) to extract low-dimensional feature vectors from a training dataset, fuse test datasets into the feature space of training datasets, and separate the data into binary distributions of “single particles” and “non-single particles.” As a proof of principle, we tested simulated and experimental datasets of the Coliphage PR772 virus. We created a training dataset and classified three types of test datasets: First, a noise-free simulated test dataset, which gave near perfect separation. Second, simulated test datasets that were modified to reflect different levels of photon counts and background noise. These modified datasets were used to quantify the predictive limits of our approach. Third, an experimental dataset collected at the Stanford Linear Accelerator Center. The single-particle identification for this experimental dataset was compared with previously published results and it was found that GML covers a wide photon-count range, outperforming other single-particle identification methods. Moreover, a major advantage of GML is its ability to retrieve single particles in the presence of structural variability. American Crystallographic Association 2021-02-18 /pmc/articles/PMC7902084/ /pubmed/33644252 http://dx.doi.org/10.1063/4.0000060 Text en © 2021 Author(s). 2329-7778/2021/8(1)/014701/12 All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | ARTICLES Cruz-Chú, Eduardo R. Hosseinizadeh, Ahmad Mashayekhi, Ghoncheh Fung, Russell Ourmazd, Abbas Schwander, Peter Selecting XFEL single-particle snapshots by geometric machine learning |
title | Selecting XFEL single-particle snapshots by geometric machine learning |
title_full | Selecting XFEL single-particle snapshots by geometric machine learning |
title_fullStr | Selecting XFEL single-particle snapshots by geometric machine learning |
title_full_unstemmed | Selecting XFEL single-particle snapshots by geometric machine learning |
title_short | Selecting XFEL single-particle snapshots by geometric machine learning |
title_sort | selecting xfel single-particle snapshots by geometric machine learning |
topic | ARTICLES |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7902084/ https://www.ncbi.nlm.nih.gov/pubmed/33644252 http://dx.doi.org/10.1063/4.0000060 |
work_keys_str_mv | AT cruzchueduardor selectingxfelsingleparticlesnapshotsbygeometricmachinelearning AT hosseinizadehahmad selectingxfelsingleparticlesnapshotsbygeometricmachinelearning AT mashayekhighoncheh selectingxfelsingleparticlesnapshotsbygeometricmachinelearning AT fungrussell selectingxfelsingleparticlesnapshotsbygeometricmachinelearning AT ourmazdabbas selectingxfelsingleparticlesnapshotsbygeometricmachinelearning AT schwanderpeter selectingxfelsingleparticlesnapshotsbygeometricmachinelearning |