Knotify: An Efficient Parallel Platform for RNA Pseudoknot Prediction Using Syntactic Pattern Recognition

Obtaining valuable clues for noncoding RNA (ribonucleic acid) subsequences remains a significant challenge, acknowledging that most of the human genome transcribes into noncoding RNA parts related to unknown biological operations. Capturing these clues relies on accurate “base pairing” prediction, a...

Bibliographic Details
Main authors: Andrikos, Christos; Makris, Evangelos; Kolaitis, Angelos; Rassias, Georgios; Pavlatos, Christos; Tsanakas, Panayiotis
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8876629/
https://www.ncbi.nlm.nih.gov/pubmed/35200530
http://dx.doi.org/10.3390/mps5010014
author Andrikos, Christos
Makris, Evangelos
Kolaitis, Angelos
Rassias, Georgios
Pavlatos, Christos
Tsanakas, Panayiotis
collection PubMed
description Obtaining valuable clues about noncoding RNA (ribonucleic acid) subsequences remains a significant challenge, given that most of the human genome is transcribed into noncoding RNA associated with unknown biological functions. Capturing these clues relies on accurate “base pairing” prediction, also known as “RNA secondary structure prediction”. With COVID-19 considered a severe global threat, the single-stranded SARS-CoV-2 virus underlines the importance of establishing an efficient RNA analysis toolkit. This work aims to contribute to that goal by introducing a novel system that predicts RNA secondary structure patterns (i.e., RNA pseudoknots) by leveraging syntactic pattern-recognition strategies. Focusing on pseudoknot prediction, we formalize RNA secondary structure prediction as primarily a parsing problem and secondarily an optimization problem. The proposed methodology addresses the prediction of first-order (H-type) pseudoknots. We introduce a context-free grammar (CFG) with enough expressive power to recognize potential pseudoknot patterns. An alternative way of detecting possible pseudoknots, based on a brute-force algorithm, is also implemented. Any input sequence may yield multiple potential folding patterns, so a strict methodology is required to determine the single biologically realistic one. To resolve this ambiguity efficiently, we devised a novel heuristic built on the widely accepted notion of free-energy minimization, which uses each pattern’s context to identify the most prominent pseudoknot pattern. The overall process has polynomial time complexity, and its parallel implementation improves performance in proportion to the deployed hardware. The proposed methodology predicts the core stems of the RNA pseudoknots in the test dataset with a recall of 76.4%. In discovering all the stems of an RNA sequence, it achieved an F1-score of 0.774 and an MCC of 0.543, outperforming other approaches on this particular task. Measurements on a dataset of 262 RNA sequences showed speedup factors of 1.31, 3.45, and 7.76 relative to three well-known platforms. The implementation source code is publicly available in the Knotify GitHub repository.
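The recall, F1-score, and MCC (Matthews correlation coefficient) quoted in the description are standard confusion-matrix metrics. The following is a minimal Python sketch of how such values can be computed, assuming predictions are scored as true/false positives and negatives. The function name `stem_metrics` and the counts passed to it are hypothetical and are not taken from the article or from the Knotify code.

```python
import math


def stem_metrics(tp, fp, fn, tn):
    """Return recall, precision, F1 and MCC from confusion-matrix counts.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives. Ratios with a zero
    denominator are reported as 0.0.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"recall": recall, "precision": precision, "f1": f1, "mcc": mcc}


# Hypothetical counts for illustration only; not data from the article.
example = stem_metrics(tp=8, fp=2, fn=2, tn=8)
```

Unlike recall and F1, MCC also takes true negatives into account, which is why the two scores reported for the same task can differ noticeably.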
format Online
Article
Text
id pubmed-8876629
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8876629 2022-02-26 Knotify: An Efficient Parallel Platform for RNA Pseudoknot Prediction Using Syntactic Pattern Recognition Andrikos, Christos Makris, Evangelos Kolaitis, Angelos Rassias, Georgios Pavlatos, Christos Tsanakas, Panayiotis Methods Protoc Article MDPI 2022-02-02 /pmc/articles/PMC8876629/ /pubmed/35200530 http://dx.doi.org/10.3390/mps5010014 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Knotify: An Efficient Parallel Platform for RNA Pseudoknot Prediction Using Syntactic Pattern Recognition
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8876629/
https://www.ncbi.nlm.nih.gov/pubmed/35200530
http://dx.doi.org/10.3390/mps5010014