Into the unknown: expression profiling without genome sequence information in CHO by next generation sequencing
Main authors: | , , , , , , , , |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2896516/ https://www.ncbi.nlm.nih.gov/pubmed/20194116 http://dx.doi.org/10.1093/nar/gkq116 |
Summary: | The arrival of next-generation sequencing (NGS) technologies has led to novel opportunities for expression profiling and genome analysis by utilizing vast amounts of short read sequence data. Here, we demonstrate that expression profiling in organisms lacking any genome or transcriptome sequence information is feasible by combining Illumina’s mRNA-seq technology with a novel bioinformatics pipeline that integrates assembled and annotated Chinese hamster ovary (CHO) sequences with information derived from related organisms. We applied this pipeline to the analysis of CHO cells, which were chosen as a model system owing to their relevance in the production of therapeutic proteins. Specifically, we analysed CHO cells undergoing butyrate treatment, which is known to affect cell cycle regulation and to increase the specific productivity of recombinant proteins. By this means, we identified sequences for >13 000 CHO genes, which added sequence information for ∼5000 novel genes to the CHO model. More than 6000 transcript sequences are predicted to be complete, as they covered >95% of the corresponding mouse orthologs. Detailed analysis of selected biological functions, such as DNA replication and cell cycle control, demonstrated the potential of NGS expression profiling in organisms without extended genome sequence to improve both data quantity and quality. |