Identification of novel non-coding RNAs using profiles of short sequence reads from next generation sequencing data
Main Authors: | Jung, Chol-Hee; Hansen, Martin A; Makunin, Igor V; Korbie, Darren J; Mattick, John S |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2010 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2825236/ https://www.ncbi.nlm.nih.gov/pubmed/20113528 http://dx.doi.org/10.1186/1471-2164-11-77 |
_version_ | 1782177799348420608 |
---|---|
author | Jung, Chol-Hee; Hansen, Martin A; Makunin, Igor V; Korbie, Darren J; Mattick, John S |
author_sort | Jung, Chol-Hee |
collection | PubMed |
description | BACKGROUND: The increasing interest in small non-coding RNAs (ncRNAs) such as microRNAs (miRNAs), small interfering RNAs (siRNAs) and Piwi-interacting RNAs (piRNAs) and recent advances in sequencing technology have yielded large numbers of short (18-32 nt) RNA sequences from different organisms, some of which are derived from small nucleolar RNAs (snoRNAs) and transfer RNAs (tRNAs). We observed that these short ncRNAs frequently cover the entire length of annotated snoRNAs or tRNAs, which suggests that other loci specifying similar ncRNAs can be identified by clusters of short RNA sequences. RESULTS: We combined publicly available datasets of tens of millions of short RNA sequence tags from Drosophila melanogaster and mapped them to the Drosophila genome. Approximately 6 million perfectly mapping sequence tags were then assembled into 521,302 tag-contigs (TCs) based on tag overlap. Most transposon-derived sequences, exons and annotated miRNAs, tRNAs and snoRNAs are detected by TCs, which show distinct patterns of length and tag-depth for different categories. The typical length and tag-depth of snoRNA-derived TCs were used to predict 7 previously unrecognized box H/ACA and 26 box C/D snoRNA candidates. We also identified one snRNA candidate and 86 loci with a high number of tags that are yet to be annotated, 7 of which have a particular 18mer motif and are located in introns of genes involved in development. A subset of new snoRNA candidates and putative ncRNA candidates was verified by Northern blot. CONCLUSIONS: In this study, we have introduced a new approach to identify new members of known classes of ncRNAs based on the features of TCs corresponding to known ncRNAs. A large number of the identified TCs are yet to be examined experimentally, suggesting that many more novel ncRNAs remain to be discovered. |
format | Text |
id | pubmed-2825236 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2825236 2010-02-20 BMC Genomics Research Article BioMed Central 2010-02-01 /pmc/articles/PMC2825236/ /pubmed/20113528 http://dx.doi.org/10.1186/1471-2164-11-77 Text en Copyright ©2010 Jung et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Identification of novel non-coding RNAs using profiles of short sequence reads from next generation sequencing data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2825236/ https://www.ncbi.nlm.nih.gov/pubmed/20113528 http://dx.doi.org/10.1186/1471-2164-11-77 |
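
The description above says that perfectly mapping sequence tags were assembled into tag-contigs (TCs) based on tag overlap, and that TC length and tag-depth were used to pick out snoRNA-like loci. The following is a minimal sketch of that idea, not the authors' implementation. It assumes half-open genomic coordinates, strand-specific merging, an overlap of at least 1 nt, and tag-depth defined as the total read count in a contig; the names `Tag`, `TagContig`, `assemble_tag_contigs` and `select_candidates`, and all thresholds, are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# A mapped tag: (chromosome, strand, start, end, read_count).
# Coordinates are assumed to be half-open: start inclusive, end exclusive.
Tag = Tuple[str, str, int, int, int]


@dataclass
class TagContig:
    chrom: str
    strand: str
    start: int
    end: int
    tag_depth: int  # assumed definition: total read count of the merged tags

    @property
    def length(self) -> int:
        return self.end - self.start


def assemble_tag_contigs(tags: Iterable[Tag]) -> List[TagContig]:
    """Merge tags that overlap by at least 1 nt on the same chromosome and strand."""
    contigs: List[TagContig] = []
    # Sorting the tuples orders tags by chromosome, then strand, then start.
    for chrom, strand, start, end, count in sorted(tags):
        last = contigs[-1] if contigs else None
        if (last is not None and last.chrom == chrom
                and last.strand == strand and start < last.end):
            # The tag overlaps the current contig: extend it and add its reads.
            last.end = max(last.end, end)
            last.tag_depth += count
        else:
            # No overlap: start a new contig.
            contigs.append(TagContig(chrom, strand, start, end, count))
    return contigs


def select_candidates(contigs: List[TagContig], min_len: int,
                      max_len: int, min_depth: int) -> List[TagContig]:
    """Keep contigs whose length and tag-depth fall inside hypothetical bounds."""
    return [c for c in contigs
            if min_len <= c.length <= max_len and c.tag_depth >= min_depth]


if __name__ == "__main__":
    example_tags = [
        ("2L", "+", 100, 122, 5),
        ("2L", "+", 115, 140, 3),
        ("2L", "+", 200, 221, 1),
        ("2L", "-", 110, 130, 2),
    ]
    for tc in assemble_tag_contigs(example_tags):
        print(tc.chrom, tc.strand, tc.start, tc.end, tc.length, tc.tag_depth)
```

In practice the length and depth bounds would be derived from TCs that overlap annotated snoRNAs, as the description indicates; the values passed to `select_candidates` here are placeholders only.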