FEELnc: a tool for long non-coding RNA annotation and its application to the dog transcriptome

Whole transcriptome sequencing (RNA-seq) has become a standard for cataloguing and monitoring RNA populations. One of the main bottlenecks, however, is to correctly identify the different classes of RNAs among the plethora of reconstructed transcripts, particularly those that will be translated (mRN...

Bibliographic Details
Main authors: Wucher, Valentin; Legeai, Fabrice; Hédan, Benoît; Rizk, Guillaume; Lagoutte, Lætitia; Leeb, Tosso; Jagannathan, Vidhya; Cadieu, Edouard; David, Audrey; Lohi, Hannes; Cirera, Susanna; Fredholm, Merete; Botherel, Nadine; Leegwater, Peter A.J.; Le Béguec, Céline; Fieten, Hille; Johnson, Jeremy; Alföldi, Jessica; André, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Derrien, Thomas
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5416892/
https://www.ncbi.nlm.nih.gov/pubmed/28053114
http://dx.doi.org/10.1093/nar/gkw1306
author Wucher, Valentin
Legeai, Fabrice
Hédan, Benoît
Rizk, Guillaume
Lagoutte, Lætitia
Leeb, Tosso
Jagannathan, Vidhya
Cadieu, Edouard
David, Audrey
Lohi, Hannes
Cirera, Susanna
Fredholm, Merete
Botherel, Nadine
Leegwater, Peter A.J.
Le Béguec, Céline
Fieten, Hille
Johnson, Jeremy
Alföldi, Jessica
André, Catherine
Lindblad-Toh, Kerstin
Hitte, Christophe
Derrien, Thomas
author_sort Wucher, Valentin
collection PubMed
description Whole transcriptome sequencing (RNA-seq) has become a standard for cataloguing and monitoring RNA populations. One of the main bottlenecks, however, is to correctly identify the different classes of RNAs among the plethora of reconstructed transcripts, particularly those that will be translated (mRNAs) from the class of long non-coding RNAs (lncRNAs). Here, we present FEELnc (FlExible Extraction of LncRNAs), an alignment-free program that accurately annotates lncRNAs based on a Random Forest model trained with general features such as multi k-mer frequencies and relaxed open reading frames. Benchmarking versus five state-of-the-art tools shows that FEELnc achieves similar or better classification performance on GENCODE and NONCODE data sets. The program also provides specific modules that enable the user to fine-tune classification accuracy, to formalize the annotation of lncRNA classes and to identify lncRNAs even in the absence of a training set of non-coding RNAs. We used FEELnc on a real data set comprising 20 canine RNA-seq samples produced by the European LUPA consortium to substantially expand the canine genome annotation to include 10 374 novel lncRNAs and 58 640 mRNA transcripts. FEELnc moves beyond conventional coding potential classifiers by providing a standardized and complete solution for annotating lncRNAs and is freely available at https://github.com/tderrien/FEELnc.
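The "multi k-mer frequencies" mentioned in the description can be illustrated with a minimal sketch. This is not FEELnc's own code: the function names, the choice of k values and the handling of ambiguous bases are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows how one sequence could be turned into a fixed-length numeric vector of the kind a Random Forest classifier might take as input.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k):
    """Relative frequency of every DNA k-mer in seq, using overlapping windows."""
    seq = seq.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    # Assumption: windows containing ambiguous bases (e.g. N) are ignored.
    counts = Counter(w for w in windows if set(w) <= set("ACGT"))
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


def multi_kmer_features(seq, ks=(1, 2, 3)):
    """Concatenate the k-mer profiles for several k into one feature vector.

    The default ks are hypothetical; any set of small k values works the same way.
    """
    features = []
    for k in ks:
        freqs = kmer_frequencies(seq, k)
        # Sort k-mers so every sequence yields features in the same order.
        features.extend(freqs[kmer] for kmer in sorted(freqs))
    return features
```

For example, `multi_kmer_features("ACGTACGT", ks=(1, 2))` returns a vector of 4 + 16 = 20 values, one per possible 1-mer and 2-mer, regardless of the input sequence length.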
format Online
Article
Text
id pubmed-5416892
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5416892 2017-05-05 Nucleic Acids Res Methods Online
Oxford University Press 2017-05-05 2017-01-02 /pmc/articles/PMC5416892/ /pubmed/28053114 http://dx.doi.org/10.1093/nar/gkw1306 Text en © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title FEELnc: a tool for long non-coding RNA annotation and its application to the dog transcriptome
topic Methods Online