LinearTurboFold: Linear-Time Global Prediction of Conserved Structures for RNA Homologs with Applications to SARS-CoV-2

The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. Therefore, it is critical to identify conserved structures in SARS-CoV-2 genomes as potential targets for variant-proof diagnostics and therapeutics. However, the algorithms to predict these con...

Bibliographic Details
Main authors: Li, Sizhen, Zhang, He, Zhang, Liang, Liu, Kaibo, Liu, Boxiang, Mathews, David H., Huang, Liang
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8609897/
https://www.ncbi.nlm.nih.gov/pubmed/34816262
http://dx.doi.org/10.1101/2020.11.23.393488
author Li, Sizhen
Zhang, He
Zhang, Liang
Liu, Kaibo
Liu, Boxiang
Mathews, David H.
Huang, Liang
author_sort Li, Sizhen
collection PubMed
description The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. Therefore, it is critical to identify conserved structures in SARS-CoV-2 genomes as potential targets for variant-proof diagnostics and therapeutics. However, the algorithms to predict these conserved structures, which simultaneously fold and align multiple RNA homologs, scale at best cubically with sequence length, and are thus infeasible for coronaviruses, which possess the longest genomes (~30,000 nt) among RNA viruses. As a result, existing efforts on modeling SARS-CoV-2 structures resort to single sequence folding as well as local folding methods with short window sizes, which inevitably neglect long-range interactions that are crucial in RNA functions. Here we present LinearTurboFold, an efficient algorithm for folding RNA homologs that scales linearly with sequence length, enabling unprecedented global structural analysis on SARS-CoV-2. Surprisingly, on a group of SARS-CoV-2 and SARS-related genomes, LinearTurboFold’s purely in silico prediction not only is close to experimentally-guided models for local structures, but also goes far beyond them by capturing the end-to-end pairs between 5’ and 3’ UTRs (~29,800 nt apart) that match perfectly with a purely experimental work. Furthermore, LinearTurboFold identifies novel conserved structures and conserved accessible regions as potential targets for designing efficient and mutation-insensitive small-molecule drugs, antisense oligonucleotides, siRNAs, CRISPR-Cas13 guide RNAs and RT-PCR primers. LinearTurboFold is a general technique that can also be applied to other RNA viruses and full-length genome studies, and will be a useful tool in fighting the current and future pandemics.
format Online
Article
Text
id pubmed-8609897
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-8609897 2021-11-24 LinearTurboFold: Linear-Time Global Prediction of Conserved Structures for RNA Homologs with Applications to SARS-CoV-2 Li, Sizhen Zhang, He Zhang, Liang Liu, Kaibo Liu, Boxiang Mathews, David H. Huang, Liang bioRxiv Article Cold Spring Harbor Laboratory 2021-11-15 /pmc/articles/PMC8609897/ /pubmed/34816262 http://dx.doi.org/10.1101/2020.11.23.393488 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title LinearTurboFold: Linear-Time Global Prediction of Conserved Structures for RNA Homologs with Applications to SARS-CoV-2
topic Article