
Markov Chain-Based Sampling for Exploring RNA Secondary Structure under the Nearest Neighbor Thermodynamic Model and Extended Applications


Bibliographic Details
Main Authors: Kirkpatrick, Anna; Patton, Kalen; Tetali, Prasad; Mitchell, Cassie
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344895/
https://www.ncbi.nlm.nih.gov/pubmed/35924027
http://dx.doi.org/10.3390/mca25040067
author Kirkpatrick, Anna
Patton, Kalen
Tetali, Prasad
Mitchell, Cassie
collection PubMed
description Ribonucleic acid (RNA) secondary structures and branching properties are important for determining functional ramifications in biology. While energy minimization of the Nearest Neighbor Thermodynamic Model (NNTM) is commonly used to identify such properties (number of hairpins, maximum ladder distance, etc.), it is difficult to know whether the resultant values fall within expected dispersion thresholds for a given energy function. The goal of this study was to construct a Markov chain capable of examining the dispersion of RNA secondary structures and branching properties obtained from NNTM energy function minimization independent of a specific nucleotide sequence. Plane trees are studied as a model for RNA secondary structure, with energy assigned to each tree based on the NNTM, and a corresponding Gibbs distribution is defined on the trees. Through a bijection between plane trees and 2-Motzkin paths, a Markov chain converging to the Gibbs distribution is constructed, and fast mixing time is established by estimating the spectral gap of the chain. The spectral gap estimate is obtained through a series of decompositions of the chain and also by building on known mixing time results for other chains on Dyck paths. The resulting algorithm can be used as a tool for exploring the branching structure of RNA, especially for long sequences, and to examine branching structure dependence on energy model parameters. Full exposition is provided for the mathematical techniques used with the expectation that these techniques will prove useful in bioinformatics, computational biology, and additional extended applications.
format Online
Article
Text
id pubmed-9344895
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-9344895 2022-08-02 Markov Chain-Based Sampling for Exploring RNA Secondary Structure under the Nearest Neighbor Thermodynamic Model and Extended Applications. Kirkpatrick, Anna; Patton, Kalen; Tetali, Prasad; Mitchell, Cassie. Math Comput Appl. Article.
2020-12 2020-10-10 /pmc/articles/PMC9344895/ /pubmed/35924027 http://dx.doi.org/10.3390/mca25040067 Text en https://creativecommons.org/licenses/by/4.0/ This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Markov Chain-Based Sampling for Exploring RNA Secondary Structure under the Nearest Neighbor Thermodynamic Model and Extended Applications
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344895/
https://www.ncbi.nlm.nih.gov/pubmed/35924027
http://dx.doi.org/10.3390/mca25040067