RNA contact prediction by data efficient deep learning

On the path to full understanding of the structure-function relationship or even design of RNA, structure prediction would offer an intriguing complement to experimental efforts. Any deep learning on RNA structure, however, is hampered by the sparsity of labeled training data. Utilizing the limited...

Bibliographic Details
Main authors: Taubert, Oskar; von der Lehr, Fabrice; Bazarova, Alina; Faber, Christian; Knechtges, Philipp; Weiel, Marie; Debus, Charlotte; Coquelin, Daniel; Basermann, Achim; Streit, Achim; Kesselheim, Stefan; Götz, Markus; Schug, Alexander
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10482910/
https://www.ncbi.nlm.nih.gov/pubmed/37674020
http://dx.doi.org/10.1038/s42003-023-05244-9
author Taubert, Oskar
von der Lehr, Fabrice
Bazarova, Alina
Faber, Christian
Knechtges, Philipp
Weiel, Marie
Debus, Charlotte
Coquelin, Daniel
Basermann, Achim
Streit, Achim
Kesselheim, Stefan
Götz, Markus
Schug, Alexander
collection PubMed
description On the path to full understanding of the structure-function relationship or even design of RNA, structure prediction would offer an intriguing complement to experimental efforts. Any deep learning on RNA structure, however, is hampered by the sparsity of labeled training data. Utilizing the limited data available, we here focus on predicting spatial adjacencies ("contact maps") as a proxy for 3D structure. Our model, BARNACLE, combines the utilization of unlabeled data through self-supervised pre-training and efficient use of the sparse labeled data through an XGBoost classifier. BARNACLE shows a considerable improvement over both the established classical baseline and a deep neural network. In order to demonstrate that our approach can be applied to tasks with similar data constraints, we show that our findings generalize to the related setting of accessible surface area prediction.
format Online
Article
Text
id pubmed-10482910
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10482910 2023-09-08
RNA contact prediction by data efficient deep learning
Taubert, Oskar; von der Lehr, Fabrice; Bazarova, Alina; Faber, Christian; Knechtges, Philipp; Weiel, Marie; Debus, Charlotte; Coquelin, Daniel; Basermann, Achim; Streit, Achim; Kesselheim, Stefan; Götz, Markus; Schug, Alexander
Commun Biol
Article
On the path to full understanding of the structure-function relationship or even design of RNA, structure prediction would offer an intriguing complement to experimental efforts. Any deep learning on RNA structure, however, is hampered by the sparsity of labeled training data. Utilizing the limited data available, we here focus on predicting spatial adjacencies ("contact maps") as a proxy for 3D structure. Our model, BARNACLE, combines the utilization of unlabeled data through self-supervised pre-training and efficient use of the sparse labeled data through an XGBoost classifier. BARNACLE shows a considerable improvement over both the established classical baseline and a deep neural network. In order to demonstrate that our approach can be applied to tasks with similar data constraints, we show that our findings generalize to the related setting of accessible surface area prediction.
Nature Publishing Group UK 2023-09-06
/pmc/articles/PMC10482910/ /pubmed/37674020 http://dx.doi.org/10.1038/s42003-023-05244-9
Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title RNA contact prediction by data efficient deep learning
topic Article