CLASS: constrained transcript assembly of RNA-seq reads

BACKGROUND: RNA-seq has revolutionized our ability to survey the cellular transcriptome in great detail. However, while several approaches have been developed, the problem of assembling the short reads into full-length transcripts remains challenging. RESULTS: We developed a novel algorithm and soft...

Bibliographic Details
Main Authors: Song, Li; Florea, Liliana
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3622639/
https://www.ncbi.nlm.nih.gov/pubmed/23734605
http://dx.doi.org/10.1186/1471-2105-14-S5-S14
_version_ 1782265859190816768
author Song, Li
Florea, Liliana
author_facet Song, Li
Florea, Liliana
author_sort Song, Li
collection PubMed
description BACKGROUND: RNA-seq has revolutionized our ability to survey the cellular transcriptome in great detail. However, while several approaches have been developed, the problem of assembling the short reads into full-length transcripts remains challenging. RESULTS: We developed a novel algorithm and software tool, CLASS (Constraint-based Local Assembly and Selection of Splice variants), for accurately assembling splice variants using local read coverage patterns of RNA-seq reads, contiguity constraints from read pairs and spliced reads, and optionally information about gene structure extracted from cDNA sequence databases. The algorithmic underpinnings of CLASS are: i) a linear program to infer exons, ii) a compact splice graph representation of a gene and its splice variants, and iii) a transcript selection scheme that takes into account contiguity constraints and, where available, knowledge about gene structure. CONCLUSION: In comparisons against leading transcript assembly programs, CLASS is more accurate on both simulated and real reads and produces results that are easier to interpret when applied to large scale real data, and therefore is a promising analysis tool for next generation sequencing data. AVAILABILITY: CLASS is available from http://sourceforge.net/projects/splicebox.
format Online
Article
Text
id pubmed-3622639
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3622639 2013-04-15 CLASS: constrained transcript assembly of RNA-seq reads Song, Li Florea, Liliana BMC Bioinformatics Proceedings BioMed Central 2013-04-10 /pmc/articles/PMC3622639/ /pubmed/23734605 http://dx.doi.org/10.1186/1471-2105-14-S5-S14 Text en Copyright © 2013 Song and Florea; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Song, Li
Florea, Liliana
CLASS: constrained transcript assembly of RNA-seq reads
title CLASS: constrained transcript assembly of RNA-seq reads
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3622639/
https://www.ncbi.nlm.nih.gov/pubmed/23734605
http://dx.doi.org/10.1186/1471-2105-14-S5-S14