
Predicting functional long non-coding RNAs validated by low throughput experiments

High-throughput techniques have uncovered hundreds and thousands of long non-coding RNAs (lncRNAs). Among them, only a tiny fraction have functions experimentally validated by low-throughput methods (EVlncRNAs). What fraction of lncRNAs from high-throughput experiments (HTlncRNAs) is truly functional...

Bibliographic Details
Main authors: Zhou, Bailing; Yang, Yuedong; Zhan, Jian; Dou, Xianghua; Wang, Jihua; Zhou, Yaoqi
Format: Online Article Text
Language: English
Published: Taylor & Francis 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6779387/
https://www.ncbi.nlm.nih.gov/pubmed/31345106
http://dx.doi.org/10.1080/15476286.2019.1644590
author Zhou, Bailing
Yang, Yuedong
Zhan, Jian
Dou, Xianghua
Wang, Jihua
Zhou, Yaoqi
collection PubMed
description High-throughput techniques have uncovered hundreds and thousands of long non-coding RNAs (lncRNAs). Among them, only a tiny fraction have functions experimentally validated by low-throughput methods (EVlncRNAs). What fraction of lncRNAs from high-throughput experiments (HTlncRNAs) is truly functional is an active subject of debate. Here, we developed the first method to distinguish EVlncRNAs from HTlncRNAs and mRNAs by using Support Vector Machines and found that EVlncRNAs can be well separated from HTlncRNAs and mRNAs, with a Matthews correlation coefficient of 0.6, a sensitivity of 64%, and a precision of 81% on the independent human test set. The most useful features for classification are related to sequence conservation at the RNA level (for separating EVlncRNAs from HTlncRNAs) and the protein level (for separating them from mRNAs). The method is robust: the human-RNA-trained model applies to independent mouse RNAs with similar accuracy and, to a lesser extent, to plant RNAs. The method can recover newly discovered EVlncRNAs with high sensitivity. Its application to 2000 randomly selected human HTlncRNAs indicates that the majority of HTlncRNAs are probably non-functional but that a large portion (nearly 30%) are likely functional. In other words, there is an ample number of lncRNAs whose specific biological roles are yet to be discovered. The method developed here is expected to speed up discovery and reduce its cost by prioritizing potentially functional lncRNAs prior to experimental validation. EVlncRNA-pred is available as a web server at http://biophy.dzu.edu.cn/lncrnapred/index.html. All datasets used in this study can be obtained from the same website.
format Online
Article
Text
id pubmed-6779387
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Taylor & Francis
record_format MEDLINE/PubMed
spelling pubmed-6779387 2019-10-16 RNA Biol Research Paper Taylor & Francis 2019-07-26 Text en
© 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.
title Predicting functional long non-coding RNAs validated by low throughput experiments
topic Research Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6779387/
https://www.ncbi.nlm.nih.gov/pubmed/31345106
http://dx.doi.org/10.1080/15476286.2019.1644590