Structured RNAs and synteny regions in the pig genome
BACKGROUND: Annotating mammalian genomes for noncoding RNAs (ncRNAs) is nontrivial since far from all ncRNAs are known and the computational models are resource demanding. Currently, the human genome holds the best mammalian ncRNA annotation, a result of numerous efforts by several groups. However,...
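Main authors: | Anthon, Christian; Tafer, Hakim; Havgaard, Jakob H; Thomsen, Bo; Hedegaard, Jakob; Seemann, Stefan E; Pundhir, Sachin; Kehr, Stephanie; Bartschat, Sebastian; Nielsen, Mathilde; Nielsen, Rasmus O; Fredholm, Merete; Stadler, Peter F; Gorodkin, Jan |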
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4124155/ https://www.ncbi.nlm.nih.gov/pubmed/24917120 http://dx.doi.org/10.1186/1471-2164-15-459 |
_version_ | 1782329591567745024 |
---|---|
author | Anthon, Christian Tafer, Hakim Havgaard, Jakob H Thomsen, Bo Hedegaard, Jakob Seemann, Stefan E Pundhir, Sachin Kehr, Stephanie Bartschat, Sebastian Nielsen, Mathilde Nielsen, Rasmus O Fredholm, Merete Stadler, Peter F Gorodkin, Jan |
author_sort | Anthon, Christian |
collection | PubMed |
description | BACKGROUND: Annotating mammalian genomes for noncoding RNAs (ncRNAs) is nontrivial since far from all ncRNAs are known and the computational models are resource demanding. Currently, the human genome holds the best mammalian ncRNA annotation, a result of numerous efforts by several groups. However, a more direct strategy is desired for the increasing number of sequenced mammalian genomes of which some, such as the pig, are relevant as disease models and production animals. RESULTS: We present a comprehensive annotation of structured RNAs in the pig genome. Combining sequence and structure similarity search as well as class-specific methods, we obtained a conservative set with a total of 3,391 structured RNA loci of which 1,011 and 2,314, respectively, hold strong sequence and structure similarity to structured RNAs in existing databases. The RNA loci cover 139 cis-regulatory element loci, 58 lncRNA loci, 11 conflicts of annotation, and 3,183 ncRNA genes. The ncRNA genes comprise 359 miRNAs, 8 ribozymes, 185 rRNAs, 638 snoRNAs, 1,030 snRNAs, 810 tRNAs and 153 ncRNA genes not belonging to the aforementioned classes. When running the pipeline on a local shuffled version of the genome, we obtained no matches at the highest confidence level. Additional analysis of RNA-seq data from a pooled library from 10 different pig tissues added another 165 miRNA loci, yielding an overall annotation of 3,556 structured RNA loci. This annotation represents our best effort at making an automated annotation. To further enhance the reliability, 571 of the 3,556 structured RNAs were manually curated by methods depending on the RNA class while 1,581 were declared as pseudogenes. We further created a multiple alignment of pig against 20 representative vertebrates, from which RNAz predicted 83,859 de novo RNA loci with conserved RNA structures. 528 of the RNAz predictions overlapped with the homology-based annotation or novel miRNAs. We further present a substantial synteny analysis which includes 1,004 lineage-specific de novo RNA loci and 4 ncRNA loci in the known annotation specific for Laurasiatheria (pig, cow, dolphin, horse, cat, dog, hedgehog). CONCLUSIONS: We have obtained one of the most comprehensive annotations for structured ncRNAs of a mammalian genome, which is likely to play central roles in both health modelling and production. The core annotation is available in Ensembl 70 and the complete annotation is available at http://rth.dk/resources/rnannotator/susscr102/version1.02. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-459) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4124155 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4124155 2014-08-12 Structured RNAs and synteny regions in the pig genome BMC Genomics Research Article BioMed Central 2014-06-10 /pmc/articles/PMC4124155/ /pubmed/24917120 http://dx.doi.org/10.1186/1471-2164-15-459 Text en © Anthon et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. |
title | Structured RNAs and synteny regions in the pig genome |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4124155/ https://www.ncbi.nlm.nih.gov/pubmed/24917120 http://dx.doi.org/10.1186/1471-2164-15-459 |