
TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads

BACKGROUND: High-throughput RNA sequencing (RNA-Seq) enables quantification and identification of transcripts at single-base resolution. Recently, longer sequence reads have become available thanks to the development of new types of sequencing technologies as well as improvements in chemical reagents for...


Bibliographic Details
Main Authors:	Nariai, Naoki; Kojima, Kaname; Mimori, Takahiro; Sato, Yukuto; Kawai, Yosuke; Yamaguchi-Kabata, Yumi; Nagasaki, Masao
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2014
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4304212/ https://www.ncbi.nlm.nih.gov/pubmed/25560536 http://dx.doi.org/10.1186/1471-2164-15-S10-S5

author	Nariai, Naoki; Kojima, Kaname; Mimori, Takahiro; Sato, Yukuto; Kawai, Yosuke; Yamaguchi-Kabata, Yumi; Nagasaki, Masao
author_sort	Nariai, Naoki
collection	PubMed
description	BACKGROUND: High-throughput RNA sequencing (RNA-Seq) enables quantification and identification of transcripts at single-base resolution. Recently, longer sequence reads have become available thanks to the development of new types of sequencing technologies as well as improvements in chemical reagents for next-generation sequencers. Although several computational methods have been proposed for quantifying gene expression levels from RNA-Seq data, they are not sufficiently optimized for longer reads (e.g., > 250 bp). RESULTS: We propose TIGAR2, a statistical method for quantifying transcript isoforms from fixed- and variable-length RNA-Seq data. Our method models substitution, deletion, and insertion errors of sequencers based on gapped alignments of reads to the reference cDNA sequences, so that sensitive read aligners such as Bowtie2 and BWA-MEM are effectively incorporated in our pipeline. In addition, a heuristic algorithm is implemented in variational Bayesian inference for faster computation. We apply TIGAR2 to both simulated data and real data from human samples and evaluate the performance of transcript quantification with TIGAR2 in comparison to existing methods. CONCLUSIONS: TIGAR2 is a sensitive and accurate tool for quantifying transcript isoform abundances from RNA-Seq data. Our method performs better than existing methods for fixed-length reads (100 bp, 250 bp, 500 bp, and 1000 bp, both single-end and paired-end) and for variable-length reads, especially for reads longer than 250 bp.
format	Online Article Text
id	pubmed-4304212
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4304212 2015-02-09 TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads Nariai, Naoki; Kojima, Kaname; Mimori, Takahiro; Sato, Yukuto; Kawai, Yosuke; Yamaguchi-Kabata, Yumi; Nagasaki, Masao BMC Genomics Research BioMed Central 2014-12-12 /pmc/articles/PMC4304212/ /pubmed/25560536 http://dx.doi.org/10.1186/1471-2164-15-S10-S5 Text en Copyright © 2014 Nariai et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4304212/ https://www.ncbi.nlm.nih.gov/pubmed/25560536 http://dx.doi.org/10.1186/1471-2164-15-S10-S5


Similar Items