Molecular Traits of Long Non-protein Coding RNAs from Diverse Plant Species Show Little Evidence of Phylogenetic Relationships

Long non-coding RNAs (lncRNAs) represent a diverse class of regulatory loci with roles in development and stress responses throughout all kingdoms of life. LncRNAs, however, remain under-studied in plants compared to animal systems. To address this deficiency, we applied a machine learning predictio...

Bibliographic Details
Main Authors: Simopoulos, Caitlin M. A., Weretilnyk, Elizabeth A., Golding, G. Brian
Format: Online Article Text
Language: English
Published: Genetics Society of America 2019
Subjects: Investigations
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6686929/
https://www.ncbi.nlm.nih.gov/pubmed/31235560
http://dx.doi.org/10.1534/g3.119.400201
author Simopoulos, Caitlin M. A.
Weretilnyk, Elizabeth A.
Golding, G. Brian
collection PubMed
description Long non-coding RNAs (lncRNAs) represent a diverse class of regulatory loci with roles in development and stress responses throughout all kingdoms of life. LncRNAs, however, remain under-studied in plants compared to animal systems. To address this deficiency, we applied a machine learning prediction tool, Classifying RNA by Ensemble Machine learning Algorithm (CREMA), to analyze RNAseq data from 11 plant species chosen to represent a wide range of evolutionary histories. Transcript sequences of all expressed and/or annotated loci from plants grown in unstressed (control) conditions were assembled and input into CREMA for comparative analyses. On average, 6.4% of the plant transcripts were identified by CREMA as encoding lncRNAs. Gene annotation associated with the transcripts showed that up to 99% of all predicted lncRNAs for Solanum tuberosum and Amborella trichopoda were missing from their reference annotations whereas the reference annotation for the genetic model plant Arabidopsis thaliana contains 96% of all predicted lncRNAs for this species. Thus a reliance on reference annotations for use in lncRNA research in less well-studied plants can be impeded by the near absence of annotations associated with these regulatory transcripts. Moreover, our work using phylogenetic signal analyses suggests that molecular traits of plant lncRNAs display different evolutionary patterns than all other transcripts in plants and have molecular traits that do not follow a classic evolutionary pattern. Specifically, GC content was the only tested trait of lncRNAs with consistently significant and high phylogenetic signal, contrary to high signal in all tested molecular traits for the other transcripts in our tested plant species.
format Online
Article
Text
id pubmed-6686929
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-6686929 2019-08-11 G3 (Bethesda) Investigations Genetics Society of America 2019-06-24 /pmc/articles/PMC6686929/ /pubmed/31235560 http://dx.doi.org/10.1534/g3.119.400201 Text en Copyright © 2019 Simopoulos et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Molecular Traits of Long Non-protein Coding RNAs from Diverse Plant Species Show Little Evidence of Phylogenetic Relationships
topic Investigations