Generative and interpretable machine learning for aptamer design and analysis of in vitro sequence selection

Selection protocols such as SELEX, where molecules are selected over multiple rounds for their ability to bind to a target of interest, are popular methods for obtaining binders for diagnostic and therapeutic purposes. We show that Restricted Boltzmann Machines (RBMs), an unsupervised two-layer neur...

Bibliographic Details
Main authors:	Di Gioacchino, Andrea; Procyk, Jonah; Molari, Marco; Schreck, John S.; Zhou, Yu; Liu, Yan; Monasson, Rémi; Cocco, Simona; Šulc, Petr
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2022
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9553063/ https://www.ncbi.nlm.nih.gov/pubmed/36174101 http://dx.doi.org/10.1371/journal.pcbi.1010561

author	Di Gioacchino, Andrea; Procyk, Jonah; Molari, Marco; Schreck, John S.; Zhou, Yu; Liu, Yan; Monasson, Rémi; Cocco, Simona; Šulc, Petr
author_sort	Di Gioacchino, Andrea
collection	PubMed
description	Selection protocols such as SELEX, where molecules are selected over multiple rounds for their ability to bind to a target of interest, are popular methods for obtaining binders for diagnostic and therapeutic purposes. We show that Restricted Boltzmann Machines (RBMs), an unsupervised two-layer neural network architecture, can successfully be trained on sequence ensembles from single rounds of SELEX experiments for thrombin aptamers. RBMs assign scores to sequences that can be directly related to their fitnesses estimated through experimental enrichment ratios. Hence, RBMs trained from sequence data at a given round can be used to predict the effects of selection at later rounds. Moreover, the parameters of the trained RBMs are interpretable and identify functional features contributing most to sequence fitness. To exploit the generative capabilities of RBMs, we introduce two different training protocols: one taking into account sequence counts, capable of identifying the few best binders, and another based on unique sequences only, generating more diverse binders. We then use RBM models to generate novel aptamers with putative disruptive mutations or good binding properties, and validate the generated sequences with gel shift assay experiments. Finally, we compare the RBM’s performance with different supervised learning approaches that include random forests and several deep neural network architectures.
format	Online Article Text
id	pubmed-9553063
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-9553063 2022-10-12 PLoS Comput Biol Research Article Public Library of Science 2022-09-29 /pmc/articles/PMC9553063/ /pubmed/36174101 http://dx.doi.org/10.1371/journal.pcbi.1010561 Text en © 2022 Di Gioacchino et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Generative and interpretable machine learning for aptamer design and analysis of in vitro sequence selection
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9553063/ https://www.ncbi.nlm.nih.gov/pubmed/36174101 http://dx.doi.org/10.1371/journal.pcbi.1010561
