
PASTA: splice junction identification from RNA-Sequencing data

BACKGROUND: Next generation transcriptome sequencing (RNA-Seq) is emerging as a powerful experimental tool for the study of alternative splicing and its regulation, but requires ad-hoc analysis methods and tools. PASTA (Patterned Alignments for Splicing and Transcriptome Analysis) is a splice juncti...


Bibliographic Details
Main Authors: Tang, Shaojun; Riva, Alberto
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3623791/
https://www.ncbi.nlm.nih.gov/pubmed/23557086
http://dx.doi.org/10.1186/1471-2105-14-116
_version_ 1782265969734844416
author Tang, Shaojun
Riva, Alberto
collection PubMed
description BACKGROUND: Next generation transcriptome sequencing (RNA-Seq) is emerging as a powerful experimental tool for the study of alternative splicing and its regulation, but requires ad-hoc analysis methods and tools. PASTA (Patterned Alignments for Splicing and Transcriptome Analysis) is a splice junction detection algorithm specifically designed for RNA-Seq data, relying on a highly accurate alignment strategy and on a combination of heuristic and statistical methods to identify exon-intron junctions with high accuracy. RESULTS: Comparisons against TopHat and other splice junction prediction software on real and simulated datasets show that PASTA exhibits high specificity and sensitivity, especially at lower coverage levels. Moreover, PASTA is highly configurable and flexible, and can therefore be applied in a wide range of analysis scenarios: it is able to handle both single-end and paired-end reads, it does not rely on the presence of canonical splicing signals, and it uses organism-specific regression models to accurately identify junctions. CONCLUSIONS: PASTA is a highly efficient and sensitive tool to identify splicing junctions from RNA-Seq data. Compared to similar programs, it has the ability to identify a higher number of real splicing junctions, and provides highly annotated output files containing detailed information about their location and characteristics. Accurate junction data in turn facilitates the reconstruction of the splicing isoforms and the analysis of their expression levels, which will be performed by the remaining modules of the PASTA pipeline, still under development. Use of PASTA can therefore enable the large-scale investigation of transcription and alternative splicing.
format Online
Article
Text
id pubmed-3623791
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3623791 2013-04-12 PASTA: splice junction identification from RNA-Sequencing data Tang, Shaojun Riva, Alberto BMC Bioinformatics Software
BioMed Central 2013-04-04 /pmc/articles/PMC3623791/ /pubmed/23557086 http://dx.doi.org/10.1186/1471-2105-14-116 Text en Copyright © 2013 Tang and Riva; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title PASTA: splice junction identification from RNA-Sequencing data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3623791/
https://www.ncbi.nlm.nih.gov/pubmed/23557086
http://dx.doi.org/10.1186/1471-2105-14-116