Enhanced whole genome sequence and annotation of Clostridium stercorarium DSM8532(T) using RNA-seq transcriptomics and high-throughput proteomics
BACKGROUND: Growing interest in cellulolytic clostridia with potential for consolidated biofuels production is mitigated by low conversion of raw substrates to desired end products. Strategies to improve conversion are likely to benefit from emerging techniques to define molecular systems biology of...
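Main authors: | Schellenberg, John J; Verbeke, Tobin J; McQueen, Peter; Krokhin, Oleg V; Zhang, Xiangli; Alvare, Graham; Fristensky, Brian; Thallinger, Gerhard G; Henrissat, Bernard; Wilkins, John A; Levin, David B; Sparling, Richard |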
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4102724/ https://www.ncbi.nlm.nih.gov/pubmed/24998381 http://dx.doi.org/10.1186/1471-2164-15-567 |
_version_ | 1782327049914941440 |
---|---|
author | Schellenberg, John J Verbeke, Tobin J McQueen, Peter Krokhin, Oleg V Zhang, Xiangli Alvare, Graham Fristensky, Brian Thallinger, Gerhard G Henrissat, Bernard Wilkins, John A Levin, David B Sparling, Richard |
author_sort | Schellenberg, John J |
collection | PubMed |
description | BACKGROUND: Growing interest in cellulolytic clostridia with potential for consolidated biofuels production is mitigated by low conversion of raw substrates to desired end products. Strategies to improve conversion are likely to benefit from emerging techniques to define molecular systems biology of these organisms. Clostridium stercorarium DSM8532(T) is an anaerobic thermophile with demonstrated high ethanol production on cellulose and hemicellulose. Although several lignocellulolytic enzymes in this organism have been well-characterized, details concerning carbohydrate transporters and central metabolism have not been described. Therefore, the goal of this study is to define an improved whole genome sequence (WGS) for this organism using in-depth molecular profiling by RNA-seq transcriptomics and tandem mass spectrometry-based proteomics. RESULTS: A paired-end Roche/454 WGS assembly was closed through application of an in silico algorithm designed to resolve repetitive sequence regions, resulting in a circular replicon with one gap and a region of 2 kilobases with 10 ambiguous bases. RNA-seq transcriptomics resulted in nearly complete coverage of the genome, identifying errors in homopolymer length attributable to 454 sequencing. Peptide sequences resulting from high-throughput tandem mass spectrometry of trypsin-digested protein extracts were mapped to 1,755 annotated proteins (68% of all protein-coding regions). Proteogenomic analysis confirmed the quality of annotation and improvement pipelines, identifying a missing gene and an alternative reading frame. Peptide coverage of genes hypothetically involved in substrate hydrolysis, transport and utilization confirmed multiple pathways for glycolysis, pyruvate conversion and recycling of intermediates. No sequences homologous to transaldolase, a central enzyme in the pentose phosphate pathway, were observed by any method, despite demonstrated growth of this organism on xylose and xylan hemicellulose. CONCLUSIONS: Complementary omics techniques confirm the quality of genome sequence assembly, annotation and error-reporting. Nearly complete genome coverage by RNA-seq likely indicates background DNA in RNA extracts; however, these preps resulted in WGS enhancement and transcriptome profiling in a single Illumina run. No detection of transaldolase by any method despite xylose utilization by this organism indicates an alternative pathway for sedoheptulose-7-phosphate degradation. This report combines next-generation omics techniques to elucidate previously undefined features of substrate transport and central metabolism for this organism and its potential for consolidated biofuels production from lignocellulose. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-567) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4102724 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-41027242014-07-30 Enhanced whole genome sequence and annotation of Clostridium stercorarium DSM8532(T) using RNA-seq transcriptomics and high-throughput proteomics Schellenberg, John J Verbeke, Tobin J McQueen, Peter Krokhin, Oleg V Zhang, Xiangli Alvare, Graham Fristensky, Brian Thallinger, Gerhard G Henrissat, Bernard Wilkins, John A Levin, David B Sparling, Richard BMC Genomics Research Article BACKGROUND: Growing interest in cellulolytic clostridia with potential for consolidated biofuels production is mitigated by low conversion of raw substrates to desired end products. Strategies to improve conversion are likely to benefit from emerging techniques to define molecular systems biology of these organisms. Clostridium stercorarium DSM8532(T) is an anaerobic thermophile with demonstrated high ethanol production on cellulose and hemicellulose. Although several lignocellulolytic enzymes in this organism have been well-characterized, details concerning carbohydrate transporters and central metabolism have not been described. Therefore, the goal of this study is to define an improved whole genome sequence (WGS) for this organism using in-depth molecular profiling by RNA-seq transcriptomics and tandem mass spectrometry-based proteomics. RESULTS: A paired-end Roche/454 WGS assembly was closed through application of an in silico algorithm designed to resolve repetitive sequence regions, resulting in a circular replicon with one gap and a region of 2 kilobases with 10 ambiguous bases. RNA-seq transcriptomics resulted in nearly complete coverage of the genome, identifying errors in homopolymer length attributable to 454 sequencing. Peptide sequences resulting from high-throughput tandem mass spectrometry of trypsin-digested protein extracts were mapped to 1,755 annotated proteins (68% of all protein-coding regions). 
Proteogenomic analysis confirmed the quality of annotation and improvement pipelines, identifying a missing gene and an alternative reading frame. Peptide coverage of genes hypothetically involved in substrate hydrolysis, transport and utilization confirmed multiple pathways for glycolysis, pyruvate conversion and recycling of intermediates. No sequences homologous to transaldolase, a central enzyme in the pentose phosphate pathway, were observed by any method, despite demonstrated growth of this organism on xylose and xylan hemicellulose. CONCLUSIONS: Complementary omics techniques confirm the quality of genome sequence assembly, annotation and error-reporting. Nearly complete genome coverage by RNA-seq likely indicates background DNA in RNA extracts, however these preps resulted in WGS enhancement and transcriptome profiling in a single Illumina run. No detection of transaldolase by any method despite xylose utilization by this organism indicates an alternative pathway for sedoheptulose-7-phosphate degradation. This report combines next-generation omics techniques to elucidate previously undefined features of substrate transport and central metabolism for this organism and its potential for consolidated biofuels production from lignocellulose. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-567) contains supplementary material, which is available to authorized users. BioMed Central 2014-07-07 /pmc/articles/PMC4102724/ /pubmed/24998381 http://dx.doi.org/10.1186/1471-2164-15-567 Text en © Schellenberg et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. |
spellingShingle | Research Article Schellenberg, John J Verbeke, Tobin J McQueen, Peter Krokhin, Oleg V Zhang, Xiangli Alvare, Graham Fristensky, Brian Thallinger, Gerhard G Henrissat, Bernard Wilkins, John A Levin, David B Sparling, Richard Enhanced whole genome sequence and annotation of Clostridium stercorarium DSM8532(T) using RNA-seq transcriptomics and high-throughput proteomics |
title | Enhanced whole genome sequence and annotation of Clostridium stercorarium DSM8532(T) using RNA-seq transcriptomics and high-throughput proteomics |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4102724/ https://www.ncbi.nlm.nih.gov/pubmed/24998381 http://dx.doi.org/10.1186/1471-2164-15-567 |