
Transcript assembly and abundance estimation from RNA-Seq reveals thousands of new transcripts and switching among isoforms

High-throughput mRNA sequencing (RNA-Seq) holds the promise of simultaneous transcript discovery and abundance estimation(1-3). We introduce an algorithm for transcript assembly coupled with a statistical model for RNA-Seq experiments that produces estimates of abundances. Our algorithms are impleme...

Bibliographic Details
Main authors: Trapnell, Cole, Williams, Brian A., Pertea, Geo, Mortazavi, Ali, Kwan, Gordon, van Baren, Marijke J., Salzberg, Steven L., Wold, Barbara J., Pachter, Lior
Format: Online Article Text
Language: English
Published: 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3146043/
https://www.ncbi.nlm.nih.gov/pubmed/20436464
http://dx.doi.org/10.1038/nbt.1621
author Trapnell, Cole
Williams, Brian A.
Pertea, Geo
Mortazavi, Ali
Kwan, Gordon
van Baren, Marijke J.
Salzberg, Steven L.
Wold, Barbara J.
Pachter, Lior
collection PubMed
description High-throughput mRNA sequencing (RNA-Seq) holds the promise of simultaneous transcript discovery and abundance estimation(1-3). We introduce an algorithm for transcript assembly coupled with a statistical model for RNA-Seq experiments that produces estimates of abundances. Our algorithms are implemented in an open source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed more than 430 million paired 75bp RNA-Seq reads from a mouse myoblast cell line representing a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Analysis of transcript expression over the time series revealed complete switches in the dominant transcription start site (TSS) or splice-isoform in 330 genes, along with more subtle shifts in a further 1,304 genes. These dynamics suggest substantial regulatory flexibility and complexity in this well-studied model of muscle development.
format Online
Article
Text
id pubmed-3146043
institution National Center for Biotechnology Information
language English
publishDate 2010
record_format MEDLINE/PubMed
spelling pubmed-3146043 2011-07-29 Nat Biotechnol Article 2010-05-02 2010-05 /pmc/articles/PMC3146043/ /pubmed/20436464 http://dx.doi.org/10.1038/nbt.1621 Text en Users may view, print, copy, download and text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
title Transcript assembly and abundance estimation from RNA-Seq reveals thousands of new transcripts and switching among isoforms
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3146043/
https://www.ncbi.nlm.nih.gov/pubmed/20436464
http://dx.doi.org/10.1038/nbt.1621