Deep learning of a bacterial and archaeal universal language of life enables transfer learning and illuminates microbial dark matter
The majority of microbial genomes have yet to be cultured, and most proteins identified in microbial genomes or environmental sequences cannot be functionally annotated. As a result, current computational approaches to describe microbial systems rely on incomplete reference databases that cannot ade...
Main authors: | Hoarfrost, A., Aptekmann, A., Farfañuk, G., Bromberg, Y. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9095714/ https://www.ncbi.nlm.nih.gov/pubmed/35545619 http://dx.doi.org/10.1038/s41467-022-30070-8 |
_version_ | 1784705817850675200 |
---|---|
author | Hoarfrost, A. Aptekmann, A. Farfañuk, G. Bromberg, Y. |
author_facet | Hoarfrost, A. Aptekmann, A. Farfañuk, G. Bromberg, Y. |
author_sort | Hoarfrost, A. |
collection | PubMed |
description | The majority of microbial genomes have yet to be cultured, and most proteins identified in microbial genomes or environmental sequences cannot be functionally annotated. As a result, current computational approaches to describe microbial systems rely on incomplete reference databases that cannot adequately capture the functional diversity of the microbial tree of life, limiting our ability to model high-level features of biological sequences. Here we present LookingGlass, a deep learning model encoding contextually-aware, functionally and evolutionarily relevant representations of short DNA reads, that distinguishes reads of disparate function, homology, and environmental origin. We demonstrate the ability of LookingGlass to be fine-tuned via transfer learning to perform a range of diverse tasks: to identify novel oxidoreductases, to predict enzyme optimal temperature, and to recognize the reading frames of DNA sequence fragments. LookingGlass enables functionally relevant representations of otherwise unknown and unannotated sequences, shedding light on the microbial dark matter that dominates life on Earth. |
format | Online Article Text |
id | pubmed-9095714 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-90957142022-05-13 Deep learning of a bacterial and archaeal universal language of life enables transfer learning and illuminates microbial dark matter Hoarfrost, A. Aptekmann, A. Farfañuk, G. Bromberg, Y. Nat Commun Article The majority of microbial genomes have yet to be cultured, and most proteins identified in microbial genomes or environmental sequences cannot be functionally annotated. As a result, current computational approaches to describe microbial systems rely on incomplete reference databases that cannot adequately capture the functional diversity of the microbial tree of life, limiting our ability to model high-level features of biological sequences. Here we present LookingGlass, a deep learning model encoding contextually-aware, functionally and evolutionarily relevant representations of short DNA reads, that distinguishes reads of disparate function, homology, and environmental origin. We demonstrate the ability of LookingGlass to be fine-tuned via transfer learning to perform a range of diverse tasks: to identify novel oxidoreductases, to predict enzyme optimal temperature, and to recognize the reading frames of DNA sequence fragments. LookingGlass enables functionally relevant representations of otherwise unknown and unannotated sequences, shedding light on the microbial dark matter that dominates life on Earth. Nature Publishing Group UK 2022-05-11 /pmc/articles/PMC9095714/ /pubmed/35545619 http://dx.doi.org/10.1038/s41467-022-30070-8 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . |
spellingShingle | Article Hoarfrost, A. Aptekmann, A. Farfañuk, G. Bromberg, Y. Deep learning of a bacterial and archaeal universal language of life enables transfer learning and illuminates microbial dark matter |
title | Deep learning of a bacterial and archaeal universal language of life enables transfer learning and illuminates microbial dark matter |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9095714/ https://www.ncbi.nlm.nih.gov/pubmed/35545619 http://dx.doi.org/10.1038/s41467-022-30070-8 |