
De novo reconstruction of the Toxoplasma gondii transcriptome improves on the current genome annotation and reveals alternatively spliced transcripts and putative long non-coding RNAs

BACKGROUND: Accurate gene model predictions and annotation of alternative splicing events are imperative for genomic studies in organisms that contain genes with multiple exons. Currently most gene models for the intracellular parasite, Toxoplasma gondii, are based on computer model predictions with...


Bibliographic Details
Main Authors: Hassan, Musa A; Melo, Mariane B; Haas, Brian; Jensen, Kirk D C; Saeij, Jeroen P J
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3543268/
https://www.ncbi.nlm.nih.gov/pubmed/23231500
http://dx.doi.org/10.1186/1471-2164-13-696
author Hassan, Musa A
Melo, Mariane B
Haas, Brian
Jensen, Kirk D C
Saeij, Jeroen P J
collection PubMed
description BACKGROUND: Accurate gene model predictions and annotation of alternative splicing events are imperative for genomic studies in organisms that contain genes with multiple exons. Currently, most gene models for the intracellular parasite Toxoplasma gondii are based on computer model predictions without cDNA sequence verification. Additionally, the nature and extent of alternative splicing in Toxoplasma gondii are unknown. In this study, we used de novo transcript assembly and the published type II (ME49) genomic sequence to quantify the extent of alternative splicing in Toxoplasma and to improve the current Toxoplasma gene annotations. RESULTS: We used high-throughput RNA-sequencing data to assemble full-length transcripts, independently of a reference genome, followed by gene annotation based on the ME49 genome. We assembled 13,533 transcripts overlapping with known ME49 genes in ToxoDB and then used this set to: a) improve the annotation in the untranslated regions of ToxoDB genes, b) identify novel exons within protein-coding ToxoDB genes, and c) report on 50 previously unidentified alternatively spliced transcripts. Additionally, we assembled a set of 2,930 transcripts not overlapping with any known ME49 genes in ToxoDB. From this set, we have identified 118 new ME49 genes, 18 novel Toxoplasma genes, and putative non-coding RNAs. CONCLUSION: RNA-seq data and de novo transcript assembly provide a robust way to update incompletely annotated genomes, like the Toxoplasma genome. We have used RNA-seq to improve the annotation of several Toxoplasma genes and to identify alternatively spliced genes, novel genes, novel exons, and putative non-coding RNAs.
format Online
Article
Text
id pubmed-3543268
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3543268 2013-01-14 BMC Genomics Research Article
BioMed Central 2012-12-12 /pmc/articles/PMC3543268/ /pubmed/23231500 http://dx.doi.org/10.1186/1471-2164-13-696 Text en Copyright ©2012 Hassan et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title De novo reconstruction of the Toxoplasma gondii transcriptome improves on the current genome annotation and reveals alternatively spliced transcripts and putative long non-coding RNAs
topic Research Article