
Evolinc: A Tool for the Identification and Evolutionary Comparison of Long Intergenic Non-coding RNAs

Long intergenic non-coding RNAs (lincRNAs) are an abundant and functionally diverse class of eukaryotic transcripts. Reported lincRNA repertoires in mammals vary, but are commonly in the thousands to tens of thousands of transcripts, covering ~90% of the genome. In addition to elucidating function,...


Bibliographic Details
Main authors: Nelson, Andrew D. L.; Devisetty, Upendra K.; Palos, Kyle; Haug-Baltzell, Asher K.; Lyons, Eric; Beilstein, Mark A.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2017
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5422434/
https://www.ncbi.nlm.nih.gov/pubmed/28536600
http://dx.doi.org/10.3389/fgene.2017.00052
author Nelson, Andrew D. L.
Devisetty, Upendra K.
Palos, Kyle
Haug-Baltzell, Asher K.
Lyons, Eric
Beilstein, Mark A.
author_facet Nelson, Andrew D. L.
Devisetty, Upendra K.
Palos, Kyle
Haug-Baltzell, Asher K.
Lyons, Eric
Beilstein, Mark A.
author_sort Nelson, Andrew D. L.
collection PubMed
description Long intergenic non-coding RNAs (lincRNAs) are an abundant and functionally diverse class of eukaryotic transcripts. Reported lincRNA repertoires in mammals vary, but are commonly in the thousands to tens of thousands of transcripts, covering ~90% of the genome. In addition to elucidating function, there is particular interest in understanding the origin and evolution of lincRNAs. Aside from mammals, lincRNA populations have been sparsely sampled, precluding evolutionary analyses focused on their emergence and persistence. Here we present Evolinc, a two-module pipeline designed to facilitate lincRNA discovery and characterize aspects of lincRNA evolution. The first module (Evolinc-I) is a lincRNA identification workflow that also facilitates downstream differential expression analysis and genome browser visualization of identified lincRNAs. The second module (Evolinc-II) is a genomic and transcriptomic comparative analysis workflow that determines the phylogenetic depth to which a lincRNA locus is conserved within a user-defined group of related species. Here we validate lincRNA catalogs generated with Evolinc-I against previously annotated Arabidopsis and human lincRNA data. Evolinc-I recapitulated earlier findings and uncovered an additional 70 Arabidopsis and 43 human lincRNAs. We demonstrate the usefulness of Evolinc-II by examining the evolutionary histories of a public dataset of 5,361 Arabidopsis lincRNAs. We used Evolinc-II to winnow this dataset to 40 lincRNAs conserved across species in Brassicaceae. Finally, we show how Evolinc-II can be used to recover the evolutionary history of a known lincRNA, the human telomerase RNA (TERC). These latter analyses revealed unexpected duplication events as well as the loss and subsequent acquisition of a novel TERC locus in the lineage leading to mice and rats. The Evolinc pipeline is currently integrated in CyVerse's Discovery Environment and is free for use by researchers.
format Online
Article
Text
id pubmed-5422434
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-54224342017-05-23 Evolinc: A Tool for the Identification and Evolutionary Comparison of Long Intergenic Non-coding RNAs Nelson, Andrew D. L. Devisetty, Upendra K. Palos, Kyle Haug-Baltzell, Asher K. Lyons, Eric Beilstein, Mark A. Front Genet Genetics Long intergenic non-coding RNAs (lincRNAs) are an abundant and functionally diverse class of eukaryotic transcripts. Reported lincRNA repertoires in mammals vary, but are commonly in the thousands to tens of thousands of transcripts, covering ~90% of the genome. In addition to elucidating function, there is particular interest in understanding the origin and evolution of lincRNAs. Aside from mammals, lincRNA populations have been sparsely sampled, precluding evolutionary analyses focused on their emergence and persistence. Here we present Evolinc, a two-module pipeline designed to facilitate lincRNA discovery and characterize aspects of lincRNA evolution. The first module (Evolinc-I) is a lincRNA identification workflow that also facilitates downstream differential expression analysis and genome browser visualization of identified lincRNAs. The second module (Evolinc-II) is a genomic and transcriptomic comparative analysis workflow that determines the phylogenetic depth to which a lincRNA locus is conserved within a user-defined group of related species. Here we validate lincRNA catalogs generated with Evolinc-I against previously annotated Arabidopsis and human lincRNA data. Evolinc-I recapitulated earlier findings and uncovered an additional 70 Arabidopsis and 43 human lincRNAs. We demonstrate the usefulness of Evolinc-II by examining the evolutionary histories of a public dataset of 5,361 Arabidopsis lincRNAs. We used Evolinc-II to winnow this dataset to 40 lincRNAs conserved across species in Brassicaceae. Finally, we show how Evolinc-II can be used to recover the evolutionary history of a known lincRNA, the human telomerase RNA (TERC). 
These latter analyses revealed unexpected duplication events as well as the loss and subsequent acquisition of a novel TERC locus in the lineage leading to mice and rats. The Evolinc pipeline is currently integrated in CyVerse's Discovery Environment and is free for use by researchers. Frontiers Media S.A. 2017-05-09 /pmc/articles/PMC5422434/ /pubmed/28536600 http://dx.doi.org/10.3389/fgene.2017.00052 Text en Copyright © 2017 Nelson, Devisetty, Palos, Haug-Baltzell, Lyons and Beilstein. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Nelson, Andrew D. L.
Devisetty, Upendra K.
Palos, Kyle
Haug-Baltzell, Asher K.
Lyons, Eric
Beilstein, Mark A.
Evolinc: A Tool for the Identification and Evolutionary Comparison of Long Intergenic Non-coding RNAs
title Evolinc: A Tool for the Identification and Evolutionary Comparison of Long Intergenic Non-coding RNAs
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5422434/
https://www.ncbi.nlm.nih.gov/pubmed/28536600
http://dx.doi.org/10.3389/fgene.2017.00052