Machine learning-based protein crystal detection for monitoring of crystallization processes enabled with large-scale synthetic data sets of photorealistic images
Main authors: | Bischoff, Daniel; Walla, Brigitte; Weuster-Botz, Dirk |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Berlin Heidelberg, 2022 |
Subjects: | Research Paper |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9372129/ https://www.ncbi.nlm.nih.gov/pubmed/35661232 http://dx.doi.org/10.1007/s00216-022-04101-8 |
_version_ | 1784767313038278656 |
---|---|
author | Bischoff, Daniel Walla, Brigitte Weuster-Botz, Dirk |
author_facet | Bischoff, Daniel Walla, Brigitte Weuster-Botz, Dirk |
author_sort | Bischoff, Daniel |
collection | PubMed |
description | Since preparative chromatography is a sustainability challenge due to large amounts of consumables used in downstream processing of biomolecules, protein crystallization offers a promising alternative as a purification method. While the limited crystallizability of proteins often restricts a broad application of crystallization as a purification method, advances in molecular biology, as well as computational methods are pushing the applicability towards integration in biotechnological downstream processes. However, in industrial and academic settings, monitoring protein crystallization processes non-invasively by microscopic photography and automated image evaluation remains a challenging problem. Recently, the identification of single crystal objects using deep learning has been the subject of increased attention for various model systems. However, the advancement of crystal detection using deep learning for biotechnological applications is limited: robust models obtained through supervised machine learning tasks require large-scale and high-quality data sets usually obtained in large projects through extensive manual labeling, an approach that is highly error-prone for dense systems of transparent crystals. For the first time, recent trends involving the use of synthetic data sets for supervised learning are transferred, thus generating photorealistic images of virtual protein crystals in suspension (PCS) through the use of ray tracing algorithms, accompanied by specialized data augmentations modelling experimental noise. Further, it is demonstrated that state-of-the-art models trained with the large-scale synthetic PCS data set outperform similar fine-tuned models based on the average precision metric on a validation data set, followed by experimental validation using high-resolution photomicrographs from stirred tank protein crystallization processes. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00216-022-04101-8. |
format | Online Article Text |
id | pubmed-9372129 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer Berlin Heidelberg |
record_format | MEDLINE/PubMed |
spelling | pubmed-9372129 2022-08-13 Bischoff, Daniel Walla, Brigitte Weuster-Botz, Dirk Anal Bioanal Chem Research Paper Springer Berlin Heidelberg 2022-06-04 2022 /pmc/articles/PMC9372129/ /pubmed/35661232 http://dx.doi.org/10.1007/s00216-022-04101-8 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Research Paper Bischoff, Daniel Walla, Brigitte Weuster-Botz, Dirk Machine learning-based protein crystal detection for monitoring of crystallization processes enabled with large-scale synthetic data sets of photorealistic images |
title | Machine learning-based protein crystal detection for monitoring of crystallization processes enabled with large-scale synthetic data sets of photorealistic images |
title_sort | machine learning-based protein crystal detection for monitoring of crystallization processes enabled with large-scale synthetic data sets of photorealistic images |
topic | Research Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9372129/ https://www.ncbi.nlm.nih.gov/pubmed/35661232 http://dx.doi.org/10.1007/s00216-022-04101-8 |
work_keys_str_mv | AT bischoffdaniel machinelearningbasedproteincrystaldetectionformonitoringofcrystallizationprocessesenabledwithlargescalesyntheticdatasetsofphotorealisticimages AT wallabrigitte machinelearningbasedproteincrystaldetectionformonitoringofcrystallizationprocessesenabledwithlargescalesyntheticdatasetsofphotorealisticimages AT weusterbotzdirk machinelearningbasedproteincrystaldetectionformonitoringofcrystallizationprocessesenabledwithlargescalesyntheticdatasetsofphotorealisticimages |