An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta

BACKGROUND: Manduca sexta is a large lepidopteran insect widely used as a model to study biochemistry of insect physiological processes. As a part of its genome project, over 50 cDNA libraries have been analyzed to profile gene expression in different tissues and life stages. While the RNA-seq data...

Bibliographic Details
Main Authors: Cao, Xiaolong, Jiang, Haobo
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5645894/
https://www.ncbi.nlm.nih.gov/pubmed/29041902
http://dx.doi.org/10.1186/s12864-017-4147-y
_version_ 1783271976212627456
author Cao, Xiaolong
Jiang, Haobo
collection PubMed
description BACKGROUND: Manduca sexta is a large lepidopteran insect widely used as a model to study biochemistry of insect physiological processes. As a part of its genome project, over 50 cDNA libraries have been analyzed to profile gene expression in different tissues and life stages. While the RNA-seq data were used to study genes related to cuticle structure, chitin metabolism and immunity, a vast amount of the information has not yet been mined for understanding the basic molecular biology of this model insect. In fact, the basic features of these data, such as composition of the RNA-seq reads and lists of library-correlated genes, are unclear. From an extended view of all insects, clear-cut tempospatial expression data are rarely seen in the largest group of animals including Drosophila and mosquitoes, mainly due to their small sizes. RESULTS: We obtained the transcriptome data, analyzed the raw reads in relation to the assembled genome, and generated heatmaps for clustered genes. Library characteristics (tissues, stages), number of mapped bases, and sequencing methods affected the observed percentages of genome transcription. While up to 40% of the reads were not mapped to the genome in the initial Cufflinks gene modeling, we identified the causes for the mapping failure and reduced the number of non-mappable reads to <8%. Similarities between libraries, measured based on library-correlated genes, clearly identified differences among tissues or life stages. We calculated gene expression levels and analyzed the most abundantly expressed genes in the libraries. Furthermore, we analyzed tissue-specific gene expression and identified 18 groups of genes with distinct expression patterns. CONCLUSION: We performed a thorough analysis of the 67 RNA-seq datasets to characterize new genomic features of M. sexta. Integrated knowledge of gene functions and expression features will facilitate future functional studies in this biochemical model insect.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi: 10.1186/s12864-017-4147-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5645894
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5645894 2017-10-26 BMC Genomics Research Article BioMed Central 2017-10-17 /pmc/articles/PMC5645894/ /pubmed/29041902 http://dx.doi.org/10.1186/s12864-017-4147-y Text en © The Author(s). 2017 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5645894/
https://www.ncbi.nlm.nih.gov/pubmed/29041902
http://dx.doi.org/10.1186/s12864-017-4147-y