Predicting RNA secondary structure by a neural network: what features may be learned?

Deep learning is a class of machine learning techniques capable of creating internal representation of data without explicit preprogramming. Hence, in addition to practical applications, it is of interest to analyze what features of biological data may be learned by such models. Here, we describe Pr...

Bibliographic Details
Main Authors:	Grigorashvili, Elizaveta I., Chervontseva, Zoe S., Gelfand, Mikhail S.
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2022
Subjects:	Bioinformatics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9756865/ https://www.ncbi.nlm.nih.gov/pubmed/36530406 http://dx.doi.org/10.7717/peerj.14335

_version_	1784851710577999872
author	Grigorashvili, Elizaveta I. Chervontseva, Zoe S. Gelfand, Mikhail S.
author_facet	Grigorashvili, Elizaveta I. Chervontseva, Zoe S. Gelfand, Mikhail S.
author_sort	Grigorashvili, Elizaveta I.
collection	PubMed
description	Deep learning is a class of machine learning techniques capable of creating internal representation of data without explicit preprogramming. Hence, in addition to practical applications, it is of interest to analyze what features of biological data may be learned by such models. Here, we describe PredPair, a deep learning neural network trained to predict base pairs in RNA structure from sequence alone, without any incorporated prior knowledge, such as the stacking energies or possible spatial structures. PredPair learned the Watson-Crick and wobble base-pairing rules and created an internal representation of the stacking energies and helices. Application to independent experimental (DMS-Seq) data on nucleotide accessibility in mRNA showed that the nucleotides predicted as paired indeed tend to be involved in the RNA structure. The performance of the constructed model was comparable with the state-of-the-art method based on the thermodynamic approach, but with a higher false positives rate. On the other hand, it successfully predicted pseudoknots. t-SNE clusters of embeddings of RNA sequences created by PredPair tend to contain embeddings from particular Rfam families, supporting the predictions of PredPair being in line with biological classification.
format	Online Article Text
id	pubmed-9756865
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-9756865 2022-12-17 Predicting RNA secondary structure by a neural network: what features may be learned? Grigorashvili, Elizaveta I. Chervontseva, Zoe S. Gelfand, Mikhail S. PeerJ Bioinformatics Deep learning is a class of machine learning techniques capable of creating internal representation of data without explicit preprogramming. Hence, in addition to practical applications, it is of interest to analyze what features of biological data may be learned by such models. Here, we describe PredPair, a deep learning neural network trained to predict base pairs in RNA structure from sequence alone, without any incorporated prior knowledge, such as the stacking energies or possible spatial structures. PredPair learned the Watson-Crick and wobble base-pairing rules and created an internal representation of the stacking energies and helices. Application to independent experimental (DMS-Seq) data on nucleotide accessibility in mRNA showed that the nucleotides predicted as paired indeed tend to be involved in the RNA structure. The performance of the constructed model was comparable with the state-of-the-art method based on the thermodynamic approach, but with a higher false positives rate. On the other hand, it successfully predicted pseudoknots. t-SNE clusters of embeddings of RNA sequences created by PredPair tend to contain embeddings from particular Rfam families, supporting the predictions of PredPair being in line with biological classification. PeerJ Inc. 2022-12-13 /pmc/articles/PMC9756865/ /pubmed/36530406 http://dx.doi.org/10.7717/peerj.14335 Text en © 2022 Grigorashvili et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed.
For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title	Predicting RNA secondary structure by a neural network: what features may be learned?
topic	Bioinformatics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9756865/ https://www.ncbi.nlm.nih.gov/pubmed/36530406 http://dx.doi.org/10.7717/peerj.14335
