
Computational Prediction of Polycomb-Associated Long Non-Coding RNAs


Bibliographic Details
Main authors: Glazko, Galina V., Zybailov, Boris L., Rogozin, Igor B.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3441527/
https://www.ncbi.nlm.nih.gov/pubmed/23028655
http://dx.doi.org/10.1371/journal.pone.0044878
author Glazko, Galina V.
Zybailov, Boris L.
Rogozin, Igor B.
collection PubMed
description Among thousands of long non-coding RNAs (lncRNAs), only a small subset is functionally characterized, and the functional annotation of lncRNAs on the genomic scale remains inadequate. In this study we computationally characterized two functionally different parts of the human lncRNA transcriptome based on their ability to bind the polycomb repressive complex, PRC2. This classification is possible because, while lncRNAs as a whole constitute a diverse set of sequences, the classes of PRC2-binding and PRC2 non-binding lncRNAs possess characteristic combinations of sequence-structure patterns and can therefore be separated within the feature space. Based on the specific combination of features, we built several machine-learning classifiers and identified the SVM-based classifier as the best performing. We further showed that the SVM-based classifier is able to generalize to independent data sets. We observed that this classifier, trained on human lncRNAs, can predict up to 59.4% of PRC2-binding lncRNAs in mice. This suggests that, despite the low degree of sequence conservation, many lncRNAs play functionally conserved biological roles.
format Online
Article
Text
id pubmed-3441527
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3441527 2012-10-01 Computational Prediction of Polycomb-Associated Long Non-Coding RNAs. Glazko, Galina V.; Zybailov, Boris L.; Rogozin, Igor B. PLoS One. Research Article. Public Library of Science, 2012-09-13. /pmc/articles/PMC3441527/ /pubmed/23028655 http://dx.doi.org/10.1371/journal.pone.0044878 Text, en. © 2012 Glazko et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Computational Prediction of Polycomb-Associated Long Non-Coding RNAs
topic Research Article