Classification and clustering of RNA crosslink-ligation data reveal complex structures and homodimers

The recent development and application of methods based on the general principle of “crosslinking and proximity ligation” (crosslink-ligation) are revolutionizing RNA structure studies in living cells. However, extracting structure information from such data presents unique challenges. Here, we intr...

Bibliographic Details
Main authors: Zhang, Minjie; Hwang, Irena T.; Li, Kongpan; Bai, Jianhui; Chen, Jian-Fu; Weissman, Tsachy; Zou, James Y.; Lu, Zhipeng
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9104705/
https://www.ncbi.nlm.nih.gov/pubmed/35332099
http://dx.doi.org/10.1101/gr.275979.121
_version_ 1784707859536150528
author Zhang, Minjie
Hwang, Irena T.
Li, Kongpan
Bai, Jianhui
Chen, Jian-Fu
Weissman, Tsachy
Zou, James Y.
Lu, Zhipeng
author_sort Zhang, Minjie
collection PubMed
description The recent development and application of methods based on the general principle of “crosslinking and proximity ligation” (crosslink-ligation) are revolutionizing RNA structure studies in living cells. However, extracting structure information from such data presents unique challenges. Here, we introduce a set of computational tools for the systematic analysis of data from a wide variety of crosslink-ligation methods, specifically focusing on read mapping, alignment classification, and clustering. We design a new strategy to map short reads with irregular gaps at high sensitivity and specificity. Analysis of previously published data reveals distinct properties and bias caused by the crosslinking reactions. We perform rigorous and exhaustive classification of alignments and discover eight types of arrangements that provide distinct information on RNA structures and interactions. To deconvolve the dense and intertwined gapped alignments, we develop a network/graph-based tool Crosslinked RNA Secondary Structure Analysis using Network Techniques (CRSSANT), which enables clustering of gapped alignments and discovery of new alternative and dynamic conformations. We discover that multiple crosslinking and ligation events can occur on the same RNA, generating multisegment alignments to report complex high-level RNA structures and multi-RNA interactions. We find that alignments with overlapped segments are produced from potential homodimers and develop a new method for their de novo identification. Analysis of overlapping alignments revealed potential new homodimers in cellular noncoding RNAs and RNA virus genomes in the Picornaviridae family. Together, this suite of computational tools enables rapid and efficient analysis of RNA structure and interaction data in living cells.
format Online
Article
Text
id pubmed-9104705
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-9104705 2022-11-01 Classification and clustering of RNA crosslink-ligation data reveal complex structures and homodimers Zhang, Minjie Hwang, Irena T. Li, Kongpan Bai, Jianhui Chen, Jian-Fu Weissman, Tsachy Zou, James Y. Lu, Zhipeng Genome Res Method The recent development and application of methods based on the general principle of “crosslinking and proximity ligation” (crosslink-ligation) are revolutionizing RNA structure studies in living cells. However, extracting structure information from such data presents unique challenges. Here, we introduce a set of computational tools for the systematic analysis of data from a wide variety of crosslink-ligation methods, specifically focusing on read mapping, alignment classification, and clustering. We design a new strategy to map short reads with irregular gaps at high sensitivity and specificity. Analysis of previously published data reveals distinct properties and bias caused by the crosslinking reactions. We perform rigorous and exhaustive classification of alignments and discover eight types of arrangements that provide distinct information on RNA structures and interactions. To deconvolve the dense and intertwined gapped alignments, we develop a network/graph-based tool Crosslinked RNA Secondary Structure Analysis using Network Techniques (CRSSANT), which enables clustering of gapped alignments and discovery of new alternative and dynamic conformations. We discover that multiple crosslinking and ligation events can occur on the same RNA, generating multisegment alignments to report complex high-level RNA structures and multi-RNA interactions. We find that alignments with overlapped segments are produced from potential homodimers and develop a new method for their de novo identification. Analysis of overlapping alignments revealed potential new homodimers in cellular noncoding RNAs and RNA virus genomes in the Picornaviridae family. Together, this suite of computational tools enables rapid and efficient analysis of RNA structure and interaction data in living cells. Cold Spring Harbor Laboratory Press 2022-05 /pmc/articles/PMC9104705/ /pubmed/35332099 http://dx.doi.org/10.1101/gr.275979.121 Text en © 2022 Zhang et al.; Published by Cold Spring Harbor Laboratory Press https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/ (https://creativecommons.org/licenses/by-nc/4.0/).
title Classification and clustering of RNA crosslink-ligation data reveal complex structures and homodimers
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9104705/
https://www.ncbi.nlm.nih.gov/pubmed/35332099
http://dx.doi.org/10.1101/gr.275979.121