FEELnc: a tool for long non-coding RNA annotation and its application to the dog transcriptome
Whole transcriptome sequencing (RNA-seq) has become a standard for cataloguing and monitoring RNA populations. One of the main bottlenecks, however, is to correctly identify the different classes of RNAs among the plethora of reconstructed transcripts, particularly those that will be translated (mRN...
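Main authors: | Wucher, Valentin; Legeai, Fabrice; Hédan, Benoît; Rizk, Guillaume; Lagoutte, Lætitia; Leeb, Tosso; Jagannathan, Vidhya; Cadieu, Edouard; David, Audrey; Lohi, Hannes; Cirera, Susanna; Fredholm, Merete; Botherel, Nadine; Leegwater, Peter A.J.; Le Béguec, Céline; Fieten, Hille; Johnson, Jeremy; Alföldi, Jessica; André, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Derrien, Thomas |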
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2017 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5416892/ https://www.ncbi.nlm.nih.gov/pubmed/28053114 http://dx.doi.org/10.1093/nar/gkw1306 |
_version_ | 1783233839110291456 |
---|---|
author | Wucher, Valentin Legeai, Fabrice Hédan, Benoît Rizk, Guillaume Lagoutte, Lætitia Leeb, Tosso Jagannathan, Vidhya Cadieu, Edouard David, Audrey Lohi, Hannes Cirera, Susanna Fredholm, Merete Botherel, Nadine Leegwater, Peter A.J. Le Béguec, Céline Fieten, Hille Johnson, Jeremy Alföldi, Jessica André, Catherine Lindblad-Toh, Kerstin Hitte, Christophe Derrien, Thomas |
author_sort | Wucher, Valentin |
collection | PubMed |
description | Whole transcriptome sequencing (RNA-seq) has become a standard for cataloguing and monitoring RNA populations. One of the main bottlenecks, however, is to correctly identify the different classes of RNAs among the plethora of reconstructed transcripts, particularly those that will be translated (mRNAs) from the class of long non-coding RNAs (lncRNAs). Here, we present FEELnc (FlExible Extraction of LncRNAs), an alignment-free program that accurately annotates lncRNAs based on a Random Forest model trained with general features such as multi k-mer frequencies and relaxed open reading frames. Benchmarking versus five state-of-the-art tools shows that FEELnc achieves similar or better classification performance on GENCODE and NONCODE data sets. The program also provides specific modules that enable the user to fine-tune classification accuracy, to formalize the annotation of lncRNA classes and to identify lncRNAs even in the absence of a training set of non-coding RNAs. We used FEELnc on a real data set comprising 20 canine RNA-seq samples produced by the European LUPA consortium to substantially expand the canine genome annotation to include 10 374 novel lncRNAs and 58 640 mRNA transcripts. FEELnc moves beyond conventional coding potential classifiers by providing a standardized and complete solution for annotating lncRNAs and is freely available at https://github.com/tderrien/FEELnc. |
format | Online Article Text |
id | pubmed-5416892 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5416892 2017-05-05 Nucleic Acids Res Methods Online Oxford University Press 2017-05-05 2017-01-02 /pmc/articles/PMC5416892/ /pubmed/28053114 http://dx.doi.org/10.1093/nar/gkw1306 Text en © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | FEELnc: a tool for long non-coding RNA annotation and its application to the dog transcriptome |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5416892/ https://www.ncbi.nlm.nih.gov/pubmed/28053114 http://dx.doi.org/10.1093/nar/gkw1306 |
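
The description above names two alignment-free feature families, multi k-mer frequencies and relaxed open reading frames. The following is a minimal Python sketch of how such features could be computed from a transcript sequence. It is an illustration only, not FEELnc's implementation: the function names, the choice of k values, and the exact meaning of "relaxed" used here (an ORF that has a start codon but may run to the end of the sequence without a stop codon) are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def kmer_frequencies(seq, k):
    """Relative frequency of each DNA k-mer over overlapping windows of seq."""
    seq = seq.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    # Ignore windows containing ambiguous bases such as N.
    counts = Counter(w for w in windows if set(w) <= set("ACGT"))
    total = sum(counts.values())
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product("ACGT", repeat=k)
    }


def longest_orf_length(seq, require_stop=True):
    """Length in nucleotides of the longest ATG-initiated ORF on the forward strand.

    With require_stop=False (the hypothetical "relaxed" mode of this sketch),
    an ORF that reaches the end of the sequence without a stop codon is
    also counted, trimmed to whole codons.
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)  # stop codon included
                start = None
        if start is not None and not require_stop:
            end = start + ((len(seq) - start) // 3) * 3
            best = max(best, end - start)
    return best


def feature_vector(seq, ks=(1, 2, 3)):
    """Concatenate k-mer frequencies for several k with two ORF coverage values."""
    features = []
    for k in ks:
        freqs = kmer_frequencies(seq, k)
        features.extend(freqs[kmer] for kmer in sorted(freqs))
    length = max(len(seq), 1)
    features.append(longest_orf_length(seq, require_stop=True) / length)
    features.append(longest_orf_length(seq, require_stop=False) / length)
    return features
```

A vector like the one returned by `feature_vector` is the kind of input a Random Forest classifier could be trained on, with mRNAs and lncRNAs as the two labels. Which features, k values, and ORF definitions FEELnc actually uses is documented in the article and at https://github.com/tderrien/FEELnc.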