Accurate Classification of RNA Structures Using Topological Fingerprints

Bibliographic Details
Main authors: Huang, Jiajie; Li, Kejie; Gribskov, Michael
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5068708/
https://www.ncbi.nlm.nih.gov/pubmed/27755571
http://dx.doi.org/10.1371/journal.pone.0164726
author Huang, Jiajie
Li, Kejie
Gribskov, Michael
collection PubMed
description While RNAs are well known to possess complex structures, functionally similar RNAs often have little sequence similarity. Although the exact size and spacing of base-paired regions vary, functionally similar RNAs are markedly similar in the arrangement, or topology, of their base-paired stems. Furthermore, predicted RNA structures often lack pseudoknots (a crucial aspect of biological activity) and are only partially correct or incomplete. A topological approach addresses all of these difficulties. In this work we describe each RNA structure as a graph that can be converted to a topological spectrum (RNA fingerprint). The set of subgraphs in an RNA structure, its RNA fingerprint, can be compared with the fingerprints of other RNA structures to identify and correctly classify functionally related RNAs. Topologically similar RNAs can be identified even when a large fraction of the stems, up to 30%, is omitted, indicating that highly accurate structures are not necessary. We investigate the performance of the RNA fingerprint approach on a set of eight highly curated RNA families with diverse sizes and functions, containing pseudoknots, and with little sequence similarity, which makes this an especially difficult test set. Despite this, the RNA fingerprint approach is very successful (ROC AUC > 0.95). Because it includes pseudoknots, the RNA fingerprint approach covers a wider range of possible structures than methods based only on secondary structure, and its tolerance for incomplete structures suggests that it can be applied even to predicted structures. Source code is freely available at https://github.rcac.purdue.edu/mgribsko/XIOS_RNA_fingerprint.
format Online
Article
Text
id pubmed-5068708
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5068708 2016-10-27 Accurate Classification of RNA Structures Using Topological Fingerprints Huang, Jiajie Li, Kejie Gribskov, Michael PLoS One Research Article
Public Library of Science 2016-10-18 /pmc/articles/PMC5068708/ /pubmed/27755571 http://dx.doi.org/10.1371/journal.pone.0164726 Text en © 2016 Huang et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Accurate Classification of RNA Structures Using Topological Fingerprints
topic Research Article