
COME: a robust coding potential calculation tool for lncRNA identification and characterization based on multiple features

Recent genomic studies suggest that novel long non-coding RNAs (lncRNAs) are specifically expressed and far outnumber annotated lncRNA sequences. To identify and characterize novel lncRNAs in RNA sequencing data from new samples, we have developed COME, a coding potential calculation tool based on m...

Bibliographic Details
Main authors: Hu, Long; Xu, Zhiyu; Hu, Boqin; Lu, Zhi John
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5224497/
https://www.ncbi.nlm.nih.gov/pubmed/27608726
http://dx.doi.org/10.1093/nar/gkw798
_version_ 1782493371132018688
author Hu, Long
Xu, Zhiyu
Hu, Boqin
Lu, Zhi John
author_facet Hu, Long
Xu, Zhiyu
Hu, Boqin
Lu, Zhi John
author_sort Hu, Long
collection PubMed
description Recent genomic studies suggest that novel long non-coding RNAs (lncRNAs) are specifically expressed and far outnumber annotated lncRNA sequences. To identify and characterize novel lncRNAs in RNA sequencing data from new samples, we have developed COME, a coding potential calculation tool based on multiple features. It integrates multiple sequence-derived and experiment-based features using a decompose–compose method, which makes it more accurate and robust than other well-known tools. We also showed that COME was able to substantially improve the consistency of prediction results from other coding potential calculators. Moreover, COME annotates and characterizes each predicted lncRNA transcript with multiple lines of supporting evidence, which are not provided by other tools. Remarkably, we found that one subgroup of lncRNAs classified by such supporting features (i.e. conserved local RNA secondary structure) was highly enriched in a well-validated database (lncRNAdb). We further found that the conserved structural domains on lncRNAs had a better chance than other RNA regions of interacting with RNA binding proteins, based on recent eCLIP-seq data in human, indicating their potential regulatory roles. Overall, we present COME as an accurate, robust and multiple-feature supported method for the identification and characterization of novel lncRNAs. The software implementation is available at https://github.com/lulab/COME.
format Online
Article
Text
id pubmed-5224497
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5224497 2017-01-17 COME: a robust coding potential calculation tool for lncRNA identification and characterization based on multiple features Hu, Long Xu, Zhiyu Hu, Boqin Lu, Zhi John Nucleic Acids Res Methods Online Recent genomic studies suggest that novel long non-coding RNAs (lncRNAs) are specifically expressed and far outnumber annotated lncRNA sequences. To identify and characterize novel lncRNAs in RNA sequencing data from new samples, we have developed COME, a coding potential calculation tool based on multiple features. It integrates multiple sequence-derived and experiment-based features using a decompose–compose method, which makes it more accurate and robust than other well-known tools. We also showed that COME was able to substantially improve the consistency of prediction results from other coding potential calculators. Moreover, COME annotates and characterizes each predicted lncRNA transcript with multiple lines of supporting evidence, which are not provided by other tools. Remarkably, we found that one subgroup of lncRNAs classified by such supporting features (i.e. conserved local RNA secondary structure) was highly enriched in a well-validated database (lncRNAdb). We further found that the conserved structural domains on lncRNAs had a better chance than other RNA regions of interacting with RNA binding proteins, based on recent eCLIP-seq data in human, indicating their potential regulatory roles. Overall, we present COME as an accurate, robust and multiple-feature supported method for the identification and characterization of novel lncRNAs. The software implementation is available at https://github.com/lulab/COME. Oxford University Press 2017-01-09 2016-09-07 /pmc/articles/PMC5224497/ /pubmed/27608726 http://dx.doi.org/10.1093/nar/gkw798 Text en © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Methods Online
Hu, Long
Xu, Zhiyu
Hu, Boqin
Lu, Zhi John
COME: a robust coding potential calculation tool for lncRNA identification and characterization based on multiple features
title COME: a robust coding potential calculation tool for lncRNA identification and characterization based on multiple features
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5224497/
https://www.ncbi.nlm.nih.gov/pubmed/27608726
http://dx.doi.org/10.1093/nar/gkw798