lncEvo: automated identification and conservation study of long noncoding RNAs
BACKGROUND: Long noncoding RNAs represent a large class of transcripts with two common features: they exceed an arbitrary length threshold of 200 nt and are assumed to not encode proteins. Although a growing body of evidence indicates that the vast majority of lncRNAs are potentially nonfunctional,...
Main authors: | Bryzghalov, Oleksii Makałowska, Izabela Szcześniak, Michał Wojciech |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7871587/ https://www.ncbi.nlm.nih.gov/pubmed/33563213 http://dx.doi.org/10.1186/s12859-021-03991-2 |
_version_ | 1783649036636520448 |
---|---|
author | Bryzghalov, Oleksii Makałowska, Izabela Szcześniak, Michał Wojciech |
author_facet | Bryzghalov, Oleksii Makałowska, Izabela Szcześniak, Michał Wojciech |
author_sort | Bryzghalov, Oleksii |
collection | PubMed |
description | BACKGROUND: Long noncoding RNAs represent a large class of transcripts with two common features: they exceed an arbitrary length threshold of 200 nt and are assumed to not encode proteins. Although a growing body of evidence indicates that the vast majority of lncRNAs are potentially nonfunctional, hundreds of them have already been revealed to perform essential gene regulatory functions or to be linked to a number of cellular processes, including those associated with the etiology of human diseases. To better understand the biology of lncRNAs, it is essential to perform a more in-depth study of their evolution. In contrast to protein-encoding transcripts, however, they do not show the strong sequence conservation that usually results from purifying selection; therefore, software that is typically used to resolve the evolutionary relationships of protein-encoding genes and transcripts is not applicable to the study of lncRNAs. RESULTS: To tackle this issue, we developed lncEvo, a computational pipeline that consists of three modules: (1) transcriptome assembly from RNA-Seq data, (2) prediction of lncRNAs, and (3) conservation study—a genome-wide comparison of lncRNA transcriptomes between two species of interest, including search for orthologs. Importantly, one can choose to apply lncEvo solely for transcriptome assembly or lncRNA prediction, without calling the conservation-related part. CONCLUSIONS: lncEvo is an all-in-one tool built with the Nextflow framework, utilizing state-of-the-art software and algorithms with customizable trade-offs between speed and sensitivity, ease of use and built-in reporting functionalities. The source code of the pipeline is freely available for academic and nonacademic use under the MIT license at https://gitlab.com/spirit678/lncrna_conservation_nf. |
format | Online Article Text |
id | pubmed-7871587 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7871587 2021-02-09 lncEvo: automated identification and conservation study of long noncoding RNAs Bryzghalov, Oleksii Makałowska, Izabela Szcześniak, Michał Wojciech BMC Bioinformatics Software BACKGROUND: Long noncoding RNAs represent a large class of transcripts with two common features: they exceed an arbitrary length threshold of 200 nt and are assumed to not encode proteins. Although a growing body of evidence indicates that the vast majority of lncRNAs are potentially nonfunctional, hundreds of them have already been revealed to perform essential gene regulatory functions or to be linked to a number of cellular processes, including those associated with the etiology of human diseases. To better understand the biology of lncRNAs, it is essential to perform a more in-depth study of their evolution. In contrast to protein-encoding transcripts, however, they do not show the strong sequence conservation that usually results from purifying selection; therefore, software that is typically used to resolve the evolutionary relationships of protein-encoding genes and transcripts is not applicable to the study of lncRNAs. RESULTS: To tackle this issue, we developed lncEvo, a computational pipeline that consists of three modules: (1) transcriptome assembly from RNA-Seq data, (2) prediction of lncRNAs, and (3) conservation study—a genome-wide comparison of lncRNA transcriptomes between two species of interest, including search for orthologs. Importantly, one can choose to apply lncEvo solely for transcriptome assembly or lncRNA prediction, without calling the conservation-related part. CONCLUSIONS: lncEvo is an all-in-one tool built with the Nextflow framework, utilizing state-of-the-art software and algorithms with customizable trade-offs between speed and sensitivity, ease of use and built-in reporting functionalities. The source code of the pipeline is freely available for academic and nonacademic use under the MIT license at https://gitlab.com/spirit678/lncrna_conservation_nf. BioMed Central 2021-02-09 /pmc/articles/PMC7871587/ /pubmed/33563213 http://dx.doi.org/10.1186/s12859-021-03991-2 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Software Bryzghalov, Oleksii Makałowska, Izabela Szcześniak, Michał Wojciech lncEvo: automated identification and conservation study of long noncoding RNAs |
title | lncEvo: automated identification and conservation study of long noncoding RNAs |
title_full | lncEvo: automated identification and conservation study of long noncoding RNAs |
title_fullStr | lncEvo: automated identification and conservation study of long noncoding RNAs |
title_full_unstemmed | lncEvo: automated identification and conservation study of long noncoding RNAs |
title_short | lncEvo: automated identification and conservation study of long noncoding RNAs |
title_sort | lncevo: automated identification and conservation study of long noncoding rnas |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7871587/ https://www.ncbi.nlm.nih.gov/pubmed/33563213 http://dx.doi.org/10.1186/s12859-021-03991-2 |
work_keys_str_mv | AT bryzghalovoleksii lncevoautomatedidentificationandconservationstudyoflongnoncodingrnas AT makałowskaizabela lncevoautomatedidentificationandconservationstudyoflongnoncodingrnas AT szczesniakmichałwojciech lncevoautomatedidentificationandconservationstudyoflongnoncodingrnas |