
Cross-species inference of long non-coding RNAs greatly expands the ruminant transcriptome


Bibliographic Details
Main authors: Bush, Stephen J., Muriuki, Charity, McCulloch, Mary E. B., Farquhar, Iseabail L., Clark, Emily L., Hume, David A.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5926538/
https://www.ncbi.nlm.nih.gov/pubmed/29690875
http://dx.doi.org/10.1186/s12711-018-0391-0
_version_ 1783318928108290048
author Bush, Stephen J.
Muriuki, Charity
McCulloch, Mary E. B.
Farquhar, Iseabail L.
Clark, Emily L.
Hume, David A.
author_sort Bush, Stephen J.
collection PubMed
description BACKGROUND: mRNA-like long non-coding RNAs (lncRNAs) are a significant component of mammalian transcriptomes, although most are expressed only at low levels, with high tissue-specificity and/or at specific developmental stages. Thus, in many cases lncRNA detection by RNA-sequencing (RNA-seq) is compromised by stochastic sampling. To account for this and create a catalogue of ruminant lncRNAs, we compared de novo assembled lncRNAs derived from large RNA-seq datasets in transcriptional atlas projects for sheep and goats with previous lncRNAs assembled in cattle and human. We then combined the novel lncRNAs with the sheep transcriptional atlas to identify co-regulated sets of protein-coding and non-coding loci. RESULTS: Few lncRNAs could be reproducibly assembled from a single dataset, even with deep sequencing of the same tissues from multiple animals. Furthermore, there was little sequence overlap between lncRNAs that were assembled from pooled RNA-seq data. We combined positional conservation (synteny) with cross-species mapping of candidate lncRNAs to identify a consensus set of ruminant lncRNAs and then used the RNA-seq data to demonstrate detectable and reproducible expression in each species. In sheep, 20 to 30% of lncRNAs were located close to protein-coding genes with which they are strongly co-expressed, which is consistent with the evolutionary origin of some ncRNAs in enhancer sequences. Nevertheless, most of the lncRNAs are not co-expressed with neighbouring protein-coding genes. CONCLUSIONS: Alongside substantially expanding the ruminant lncRNA repertoire, the outcomes of our analysis demonstrate that stochastic sampling can be partly overcome by combining RNA-seq datasets from related species. This has practical implications for the future discovery of lncRNAs in other species. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12711-018-0391-0) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5926538
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5926538 2018-05-01 Cross-species inference of long non-coding RNAs greatly expands the ruminant transcriptome Bush, Stephen J. Muriuki, Charity McCulloch, Mary E. B. Farquhar, Iseabail L. Clark, Emily L. Hume, David A. Genet Sel Evol Research Article BioMed Central 2018-04-24 /pmc/articles/PMC5926538/ /pubmed/29690875 http://dx.doi.org/10.1186/s12711-018-0391-0 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Cross-species inference of long non-coding RNAs greatly expands the ruminant transcriptome
topic Research Article