
Matataki: an ultrafast mRNA quantification method for large-scale reanalysis of RNA-Seq data

BACKGROUND: Data generated by RNA sequencing (RNA-Seq) is now accumulating in vast amounts in public repositories, especially for human and mouse genomes. Reanalyzing these data has emerged as a promising approach to identify gene modules or pathways. Although meta-analyses of gene expression data a...


Bibliographic Details
Main Authors: Okamura, Yasunobu; Kinoshita, Kengo
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6048772/
https://www.ncbi.nlm.nih.gov/pubmed/30012088
http://dx.doi.org/10.1186/s12859-018-2279-y
_version_ 1783340160271777792
author Okamura, Yasunobu
Kinoshita, Kengo
author_facet Okamura, Yasunobu
Kinoshita, Kengo
author_sort Okamura, Yasunobu
collection PubMed
description BACKGROUND: Data generated by RNA sequencing (RNA-Seq) is now accumulating in vast amounts in public repositories, especially for human and mouse genomes. Reanalyzing these data has emerged as a promising approach to identify gene modules or pathways. Although meta-analyses of gene expression data are frequently performed using microarray data, meta-analyses using RNA-Seq data are still rare. This lag is partly due to the limitations in reanalyzing RNA-Seq data, which requires extensive computational resources. Moreover, it is nearly impossible to calculate the gene expression levels of all samples in a public repository using currently available methods. Here, we propose a novel method, Matataki, for rapidly estimating gene expression levels from RNA-Seq data. RESULTS: The proposed method uses k-mers that are unique to each gene for the mapping of fragments to genes. Since aligning fragments to reference sequences requires high computational costs, our method could reduce the calculation cost by focusing on k-mers that are unique to each gene and by skipping uninformative regions. Indeed, Matataki outperformed conventional methods with regards to speed while demonstrating sufficient accuracy. CONCLUSIONS: The development of Matataki can overcome current limitations in reanalyzing RNA-Seq data toward improving the potential for discovering genes and pathways associated with disease at reduced computational cost. Thus, the main bottleneck of RNA-Seq analyses has shifted to achieving the decompression of sequenced data. The implementation of Matataki is available at https://github.com/informationsea/Matataki. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2279-y) contains supplementary material, which is available to authorized users.
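The gene-unique k-mer idea described in the abstract above can be illustrated with a minimal sketch. This is not Matataki's implementation: the function names, the toy sequences, the value of k, and the rule that a read is counted only when all of its unique k-mers point to the same gene are assumptions made for illustration, and the "skipping uninformative regions" step mentioned in the abstract is not modelled.

```python
from collections import Counter, defaultdict


def build_unique_kmer_index(genes, k):
    """Map each k-mer that occurs in exactly one gene to that gene's name."""
    owners = defaultdict(set)
    for name, seq in genes.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)
    # Keep only k-mers owned by a single gene; shared k-mers are uninformative.
    return {kmer: next(iter(names))
            for kmer, names in owners.items() if len(names) == 1}


def count_reads_per_gene(reads, index, k):
    """Count reads whose gene-unique k-mers all point to the same gene."""
    counts = Counter()
    for read in reads:
        hits = set()
        for i in range(len(read) - k + 1):
            gene = index.get(read[i:i + k])
            if gene is not None:
                hits.add(gene)
        # Assumed rule: skip reads with no unique k-mer or conflicting genes.
        if len(hits) == 1:
            counts[hits.pop()] += 1
    return counts


# Hypothetical toy data, not taken from the article.
K = 4
genes = {"geneA": "ACGTACGGTC", "geneB": "TTGACGGTCA"}
reads = ["ACGTAC", "TTGACG", "ACGGTC", "GGTCA"]

index = build_unique_kmer_index(genes, K)
counts = count_reads_per_gene(reads, index, K)
print(dict(counts))
```

Because each lookup is a dictionary access rather than an alignment, the cost per read grows only with the read length, which is the intuition behind the speed claim in the abstract. The read `ACGGTC` is left uncounted here because every one of its 4-mers occurs in both toy genes.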
format Online
Article
Text
id pubmed-6048772
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6048772 2018-07-19 Matataki: an ultrafast mRNA quantification method for large-scale reanalysis of RNA-Seq data Okamura, Yasunobu Kinoshita, Kengo BMC Bioinformatics Research Article BACKGROUND: Data generated by RNA sequencing (RNA-Seq) is now accumulating in vast amounts in public repositories, especially for human and mouse genomes. Reanalyzing these data has emerged as a promising approach to identify gene modules or pathways. Although meta-analyses of gene expression data are frequently performed using microarray data, meta-analyses using RNA-Seq data are still rare. This lag is partly due to the limitations in reanalyzing RNA-Seq data, which requires extensive computational resources. Moreover, it is nearly impossible to calculate the gene expression levels of all samples in a public repository using currently available methods. Here, we propose a novel method, Matataki, for rapidly estimating gene expression levels from RNA-Seq data. RESULTS: The proposed method uses k-mers that are unique to each gene for the mapping of fragments to genes. Since aligning fragments to reference sequences requires high computational costs, our method could reduce the calculation cost by focusing on k-mers that are unique to each gene and by skipping uninformative regions. Indeed, Matataki outperformed conventional methods with regards to speed while demonstrating sufficient accuracy. CONCLUSIONS: The development of Matataki can overcome current limitations in reanalyzing RNA-Seq data toward improving the potential for discovering genes and pathways associated with disease at reduced computational cost. Thus, the main bottleneck of RNA-Seq analyses has shifted to achieving the decompression of sequenced data. The implementation of Matataki is available at https://github.com/informationsea/Matataki.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2279-y) contains supplementary material, which is available to authorized users. BioMed Central 2018-07-16 /pmc/articles/PMC6048772/ /pubmed/30012088 http://dx.doi.org/10.1186/s12859-018-2279-y Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Okamura, Yasunobu
Kinoshita, Kengo
Matataki: an ultrafast mRNA quantification method for large-scale reanalysis of RNA-Seq data
title Matataki: an ultrafast mRNA quantification method for large-scale reanalysis of RNA-Seq data
title_sort matataki: an ultrafast mrna quantification method for large-scale reanalysis of rna-seq data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6048772/
https://www.ncbi.nlm.nih.gov/pubmed/30012088
http://dx.doi.org/10.1186/s12859-018-2279-y
work_keys_str_mv AT okamurayasunobu matatakianultrafastmrnaquantificationmethodforlargescalereanalysisofrnaseqdata
AT kinoshitakengo matatakianultrafastmrnaquantificationmethodforlargescalereanalysisofrnaseqdata