An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta
BACKGROUND: Manduca sexta is a large lepidopteran insect widely used as a model to study biochemistry of insect physiological processes. As a part of its genome project, over 50 cDNA libraries have been analyzed to profile gene expression in different tissues and life stages. While the RNA-seq data...
Main authors: | Cao, Xiaolong; Jiang, Haobo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5645894/ https://www.ncbi.nlm.nih.gov/pubmed/29041902 http://dx.doi.org/10.1186/s12864-017-4147-y |
_version_ | 1783271976212627456 |
---|---|
author | Cao, Xiaolong Jiang, Haobo |
author_facet | Cao, Xiaolong Jiang, Haobo |
author_sort | Cao, Xiaolong |
collection | PubMed |
description | BACKGROUND: Manduca sexta is a large lepidopteran insect widely used as a model to study biochemistry of insect physiological processes. As a part of its genome project, over 50 cDNA libraries have been analyzed to profile gene expression in different tissues and life stages. While the RNA-seq data were used to study genes related to cuticle structure, chitin metabolism and immunity, a vast amount of the information has not yet been mined for understanding the basic molecular biology of this model insect. In fact, the basic features of these data, such as composition of the RNA-seq reads and lists of library-correlated genes, are unclear. From an extended view of all insects, clear-cut tempospatial expression data are rarely seen in the largest group of animals including Drosophila and mosquitoes, mainly due to their small sizes. RESULTS: We obtained the transcriptome data, analyzed the raw reads in relation to the assembled genome, and generated heatmaps for clustered genes. Library characteristics (tissues, stages), number of mapped bases, and sequencing methods affected the observed percentages of genome transcription. While up to 40% of the reads were not mapped to the genome in the initial Cufflinks gene modeling, we identified the causes for the mapping failure and reduced the number of non-mappable reads to <8%. Similarities between libraries, measured based on library-correlated genes, clearly identified differences among tissues or life stages. We calculated gene expression levels and analyzed the most abundantly expressed genes in the libraries. Furthermore, we analyzed tissue-specific gene expression and identified 18 groups of genes with distinct expression patterns. CONCLUSION: We performed a thorough analysis of the 67 RNA-seq datasets to characterize new genomic features of M. sexta. Integrated knowledge of gene functions and expression features will facilitate future functional studies in this biochemical model insect. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi: 10.1186/s12864-017-4147-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5645894 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-56458942017-10-26 An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta Cao, Xiaolong Jiang, Haobo BMC Genomics Research Article BACKGROUND: Manduca sexta is a large lepidopteran insect widely used as a model to study biochemistry of insect physiological processes. As a part of its genome project, over 50 cDNA libraries have been analyzed to profile gene expression in different tissues and life stages. While the RNA-seq data were used to study genes related to cuticle structure, chitin metabolism and immunity, a vast amount of the information has not yet been mined for understanding the basic molecular biology of this model insect. In fact, the basic features of these data, such as composition of the RNA-seq reads and lists of library-correlated genes, are unclear. From an extended view of all insects, clear-cut tempospatial expression data are rarely seen in the largest group of animals including Drosophila and mosquitoes, mainly due to their small sizes. RESULTS: We obtained the transcriptome data, analyzed the raw reads in relation to the assembled genome, and generated heatmaps for clustered genes. Library characteristics (tissues, stages), number of mapped bases, and sequencing methods affected the observed percentages of genome transcription. While up to 40% of the reads were not mapped to the genome in the initial Cufflinks gene modeling, we identified the causes for the mapping failure and reduced the number of non-mappable reads to <8%. Similarities between libraries, measured based on library-correlated genes, clearly identified differences among tissues or life stages. We calculated gene expression levels, analyzed the most abundantly expressed genes in the libraries. Furthermore, we analyzed tissue-specific gene expression and identified 18 groups of genes with distinct expression patterns. 
CONCLUSION: We performed a thorough analysis of the 67 RNA-seq datasets to characterize new genomic features of M. sexta. Integrated knowledge of gene functions and expression features will facilitate future functional studies in this biochemical model insect. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi: 10.1186/s12864-017-4147-y) contains supplementary material, which is available to authorized users. BioMed Central 2017-10-17 /pmc/articles/PMC5645894/ /pubmed/29041902 http://dx.doi.org/10.1186/s12864-017-4147-y Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Cao, Xiaolong Jiang, Haobo An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta |
title | An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta |
title_full | An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta |
title_fullStr | An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta |
title_full_unstemmed | An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta |
title_short | An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta |
title_sort | analysis of 67 rna-seq datasets from various tissues at different stages of a model insect, manduca sexta |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5645894/ https://www.ncbi.nlm.nih.gov/pubmed/29041902 http://dx.doi.org/10.1186/s12864-017-4147-y |
work_keys_str_mv | AT caoxiaolong ananalysisof67rnaseqdatasetsfromvarioustissuesatdifferentstagesofamodelinsectmanducasexta AT jianghaobo ananalysisof67rnaseqdatasetsfromvarioustissuesatdifferentstagesofamodelinsectmanducasexta AT caoxiaolong analysisof67rnaseqdatasetsfromvarioustissuesatdifferentstagesofamodelinsectmanducasexta AT jianghaobo analysisof67rnaseqdatasetsfromvarioustissuesatdifferentstagesofamodelinsectmanducasexta |