Comparing reference-based RNA-Seq mapping methods for non-human primate data

BACKGROUND: The application of next-generation sequencing technology to gene expression quantification analysis, namely, RNA-Sequencing, has transformed the way in which gene expression studies are conducted and analyzed. These advances are of particular interest to researchers studying organisms wi...

Bibliographic Details
Main authors: Benjamin, Ashlee M; Nichols, Marshall; Burke, Thomas W; Ginsburg, Geoffrey S; Lucas, Joseph E
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4112205/
https://www.ncbi.nlm.nih.gov/pubmed/25001289
http://dx.doi.org/10.1186/1471-2164-15-570
_version_ 1782328159861997568
author Benjamin, Ashlee M
Nichols, Marshall
Burke, Thomas W
Ginsburg, Geoffrey S
Lucas, Joseph E
author_facet Benjamin, Ashlee M
Nichols, Marshall
Burke, Thomas W
Ginsburg, Geoffrey S
Lucas, Joseph E
author_sort Benjamin, Ashlee M
collection PubMed
description BACKGROUND: The application of next-generation sequencing technology to gene expression quantification analysis, namely, RNA-Sequencing, has transformed the way in which gene expression studies are conducted and analyzed. These advances are of particular interest to researchers studying organisms with missing or incomplete genomes, as the need for knowledge of sequence information is overcome. De novo assembly methods have gained widespread acceptance in the RNA-Seq community for organisms with no true reference genome or transcriptome. While such methods have tremendous utility, computational cost is still a significant challenge for organisms with large and complex genomes. RESULTS: In this manuscript, we present a comparison of four reference-based mapping methods for non-human primate data. We utilize TopHat2 and GSNAP for mapping to the human genome, and Bowtie2 and Stampy for mapping to the human genome and transcriptome for a total of six mapping approaches. For each of these methods, we explore mapping rates and locations, number of detected genes, correlations between computed expression values, and the utility of the resulting data for differential expression analysis. CONCLUSIONS: We show that reference-based mapping methods indeed have utility in RNA-Seq analysis of mammalian data with no true reference, and the details of mapping methods should be carefully considered when doing so. Critical algorithm features include short seed sequences, the allowance of mismatches, and the allowance of gapped alignments in addition to splice junction gaps. Such features facilitate sensitive alignment of non-human primate RNA-Seq data to a human reference. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-570) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4112205
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4112205 2014-08-05 Comparing reference-based RNA-Seq mapping methods for non-human primate data Benjamin, Ashlee M Nichols, Marshall Burke, Thomas W Ginsburg, Geoffrey S Lucas, Joseph E BMC Genomics Research Article BACKGROUND: The application of next-generation sequencing technology to gene expression quantification analysis, namely, RNA-Sequencing, has transformed the way in which gene expression studies are conducted and analyzed. These advances are of particular interest to researchers studying organisms with missing or incomplete genomes, as the need for knowledge of sequence information is overcome. De novo assembly methods have gained widespread acceptance in the RNA-Seq community for organisms with no true reference genome or transcriptome. While such methods have tremendous utility, computational cost is still a significant challenge for organisms with large and complex genomes. RESULTS: In this manuscript, we present a comparison of four reference-based mapping methods for non-human primate data. We utilize TopHat2 and GSNAP for mapping to the human genome, and Bowtie2 and Stampy for mapping to the human genome and transcriptome for a total of six mapping approaches. For each of these methods, we explore mapping rates and locations, number of detected genes, correlations between computed expression values, and the utility of the resulting data for differential expression analysis. CONCLUSIONS: We show that reference-based mapping methods indeed have utility in RNA-Seq analysis of mammalian data with no true reference, and the details of mapping methods should be carefully considered when doing so. Critical algorithm features include short seed sequences, the allowance of mismatches, and the allowance of gapped alignments in addition to splice junction gaps. Such features facilitate sensitive alignment of non-human primate RNA-Seq data to a human reference.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-570) contains supplementary material, which is available to authorized users. BioMed Central 2014-07-07 /pmc/articles/PMC4112205/ /pubmed/25001289 http://dx.doi.org/10.1186/1471-2164-15-570 Text en © Benjamin et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Benjamin, Ashlee M
Nichols, Marshall
Burke, Thomas W
Ginsburg, Geoffrey S
Lucas, Joseph E
Comparing reference-based RNA-Seq mapping methods for non-human primate data
title Comparing reference-based RNA-Seq mapping methods for non-human primate data
title_full Comparing reference-based RNA-Seq mapping methods for non-human primate data
title_fullStr Comparing reference-based RNA-Seq mapping methods for non-human primate data
title_full_unstemmed Comparing reference-based RNA-Seq mapping methods for non-human primate data
title_short Comparing reference-based RNA-Seq mapping methods for non-human primate data
title_sort comparing reference-based rna-seq mapping methods for non-human primate data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4112205/
https://www.ncbi.nlm.nih.gov/pubmed/25001289
http://dx.doi.org/10.1186/1471-2164-15-570
work_keys_str_mv AT benjaminashleem comparingreferencebasedrnaseqmappingmethodsfornonhumanprimatedata
AT nicholsmarshall comparingreferencebasedrnaseqmappingmethodsfornonhumanprimatedata
AT burkethomasw comparingreferencebasedrnaseqmappingmethodsfornonhumanprimatedata
AT ginsburggeoffreys comparingreferencebasedrnaseqmappingmethodsfornonhumanprimatedata
AT lucasjosephe comparingreferencebasedrnaseqmappingmethodsfornonhumanprimatedata