Large-scale DNA-based phenotypic recording and deep learning enable highly accurate sequence-function mapping
Predicting effects of gene regulatory elements (GREs) is a longstanding challenge in biology. Machine learning may address this, but requires large datasets linking GREs to their quantitative function. However, experimental methods to generate such datasets are either application-specific or technic...
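Main authors: | Höllerer, Simon; Papaxanthos, Laetitia; Gumpinger, Anja Cathrin; Fischer, Katrin; Beisel, Christian; Borgwardt, Karsten; Benenson, Yaakov; Jeschek, Markus |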
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7363850/ https://www.ncbi.nlm.nih.gov/pubmed/32669542 http://dx.doi.org/10.1038/s41467-020-17222-4 |
_version_ | 1783559722129948672 |
---|---|
author | Höllerer, Simon Papaxanthos, Laetitia Gumpinger, Anja Cathrin Fischer, Katrin Beisel, Christian Borgwardt, Karsten Benenson, Yaakov Jeschek, Markus |
author_sort | Höllerer, Simon |
collection | PubMed |
description | Predicting effects of gene regulatory elements (GREs) is a longstanding challenge in biology. Machine learning may address this, but requires large datasets linking GREs to their quantitative function. However, experimental methods to generate such datasets are either application-specific or technically complex and error-prone. Here, we introduce DNA-based phenotypic recording as a widely applicable, practicable approach to generate large-scale sequence-function datasets. We use a site-specific recombinase to directly record a GRE’s effect in DNA, enabling readout of both sequence and quantitative function for extremely large GRE-sets via next-generation sequencing. We record translation kinetics of over 300,000 bacterial ribosome binding sites (RBSs) in >2.7 million sequence-function pairs in a single experiment. Further, we introduce a deep learning approach employing ensembling and uncertainty modelling that predicts RBS function with high accuracy, outperforming state-of-the-art methods. DNA-based phenotypic recording combined with deep learning represents a major advance in our ability to predict function from genetic sequence. |
format | Online Article Text |
id | pubmed-7363850 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7363850 2020-07-20 Large-scale DNA-based phenotypic recording and deep learning enable highly accurate sequence-function mapping. Nat Commun. Article. Nature Publishing Group UK 2020-07-15 /pmc/articles/PMC7363850/ /pubmed/32669542 http://dx.doi.org/10.1038/s41467-020-17222-4 Text en © The Author(s) 2020. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | Large-scale DNA-based phenotypic recording and deep learning enable highly accurate sequence-function mapping |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7363850/ https://www.ncbi.nlm.nih.gov/pubmed/32669542 http://dx.doi.org/10.1038/s41467-020-17222-4 |