An empirical strategy to detect bacterial transcript structure from directional RNA-seq transcriptome data

BACKGROUND: As sequencing costs are being lowered continuously, RNA-seq has gradually been adopted as the first choice for comparative transcriptome studies with bacteria. Unlike microarrays, RNA-seq can directly detect cDNA derived from mRNA transcripts at a single nucleotide resolution. Not only d...

Bibliographic Details
Main Authors: Wang, Yejun; MacKenzie, Keith D; White, Aaron P
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4422608/
https://www.ncbi.nlm.nih.gov/pubmed/25947005
http://dx.doi.org/10.1186/s12864-015-1555-8
_version_ 1782370081047576576
author Wang, Yejun
MacKenzie, Keith D
White, Aaron P
author_facet Wang, Yejun
MacKenzie, Keith D
White, Aaron P
author_sort Wang, Yejun
collection PubMed
description BACKGROUND: As sequencing costs are being lowered continuously, RNA-seq has gradually been adopted as the first choice for comparative transcriptome studies with bacteria. Unlike microarrays, RNA-seq can directly detect cDNA derived from mRNA transcripts at a single nucleotide resolution. Not only does this allow researchers to determine the absolute expression level of genes, but it also conveys information about transcript structure. Few automatic software tools have yet been established to investigate large-scale RNA-seq data for bacterial transcript structure analysis. RESULTS: In this study, 54 directional RNA-seq libraries from Salmonella serovar Typhimurium (S. Typhimurium) 14028s were examined for potential relationships between read mapping patterns and transcript structure. We developed an empirical method, combined with statistical tests, to automatically detect key transcript features, including transcriptional start sites (TSSs), transcriptional termination sites (TTSs) and operon organization. Using our method, we obtained 2,764 TSSs and 1,467 TTSs for 1,331 and 844 different genes, respectively. Identification of TSSs facilitated further discrimination of 215 putative sigma 38 regulons and 863 potential sigma 70 regulons. Combining the TSSs and TTSs with intergenic distance and co-expression information, we comprehensively annotated the operon organization in S. Typhimurium 14028s. CONCLUSIONS: Our results show that directional RNA-seq can be used to detect transcriptional borders at an acceptable resolution of ±10-20 nucleotides. Technical limitations of the RNA-seq procedure may prevent single nucleotide resolution. The automatic transcript border detection methods, statistical models and operon organization pipeline that we have described could be widely applied to RNA-seq studies in other bacteria. Furthermore, the TSSs, TTSs, operons, promoters and untranslated regions that we have defined for S. Typhimurium 14028s may constitute valuable resources that can be used for comparative analyses with other Salmonella serotypes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1555-8) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4422608
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4422608 2015-05-07 An empirical strategy to detect bacterial transcript structure from directional RNA-seq transcriptome data Wang, Yejun MacKenzie, Keith D White, Aaron P BMC Genomics Research Article BACKGROUND: As sequencing costs are being lowered continuously, RNA-seq has gradually been adopted as the first choice for comparative transcriptome studies with bacteria. Unlike microarrays, RNA-seq can directly detect cDNA derived from mRNA transcripts at a single nucleotide resolution. Not only does this allow researchers to determine the absolute expression level of genes, but it also conveys information about transcript structure. Few automatic software tools have yet been established to investigate large-scale RNA-seq data for bacterial transcript structure analysis. RESULTS: In this study, 54 directional RNA-seq libraries from Salmonella serovar Typhimurium (S. Typhimurium) 14028s were examined for potential relationships between read mapping patterns and transcript structure. We developed an empirical method, combined with statistical tests, to automatically detect key transcript features, including transcriptional start sites (TSSs), transcriptional termination sites (TTSs) and operon organization. Using our method, we obtained 2,764 TSSs and 1,467 TTSs for 1,331 and 844 different genes, respectively. Identification of TSSs facilitated further discrimination of 215 putative sigma 38 regulons and 863 potential sigma 70 regulons. Combining the TSSs and TTSs with intergenic distance and co-expression information, we comprehensively annotated the operon organization in S. Typhimurium 14028s. CONCLUSIONS: Our results show that directional RNA-seq can be used to detect transcriptional borders at an acceptable resolution of ±10-20 nucleotides. Technical limitations of the RNA-seq procedure may prevent single nucleotide resolution. The automatic transcript border detection methods, statistical models and operon organization pipeline that we have described could be widely applied to RNA-seq studies in other bacteria. Furthermore, the TSSs, TTSs, operons, promoters and untranslated regions that we have defined for S. Typhimurium 14028s may constitute valuable resources that can be used for comparative analyses with other Salmonella serotypes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1555-8) contains supplementary material, which is available to authorized users. BioMed Central 2015-05-07 /pmc/articles/PMC4422608/ /pubmed/25947005 http://dx.doi.org/10.1186/s12864-015-1555-8 Text en © Wang et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Wang, Yejun
MacKenzie, Keith D
White, Aaron P
An empirical strategy to detect bacterial transcript structure from directional RNA-seq transcriptome data
title An empirical strategy to detect bacterial transcript structure from directional RNA-seq transcriptome data
title_full An empirical strategy to detect bacterial transcript structure from directional RNA-seq transcriptome data
title_fullStr An empirical strategy to detect bacterial transcript structure from directional RNA-seq transcriptome data
title_full_unstemmed An empirical strategy to detect bacterial transcript structure from directional RNA-seq transcriptome data
title_short An empirical strategy to detect bacterial transcript structure from directional RNA-seq transcriptome data
title_sort empirical strategy to detect bacterial transcript structure from directional rna-seq transcriptome data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4422608/
https://www.ncbi.nlm.nih.gov/pubmed/25947005
http://dx.doi.org/10.1186/s12864-015-1555-8
work_keys_str_mv AT wangyejun anempiricalstrategytodetectbacterialtranscriptstructurefromdirectionalrnaseqtranscriptomedata
AT mackenziekeithd anempiricalstrategytodetectbacterialtranscriptstructurefromdirectionalrnaseqtranscriptomedata
AT whiteaaronp anempiricalstrategytodetectbacterialtranscriptstructurefromdirectionalrnaseqtranscriptomedata
AT wangyejun empiricalstrategytodetectbacterialtranscriptstructurefromdirectionalrnaseqtranscriptomedata
AT mackenziekeithd empiricalstrategytodetectbacterialtranscriptstructurefromdirectionalrnaseqtranscriptomedata
AT whiteaaronp empiricalstrategytodetectbacterialtranscriptstructurefromdirectionalrnaseqtranscriptomedata