Matataki: an ultrafast mRNA quantification method for large-scale reanalysis of RNA-Seq data
BACKGROUND: Data generated by RNA sequencing (RNA-Seq) is now accumulating in vast amounts in public repositories, especially for human and mouse genomes. Reanalyzing these data has emerged as a promising approach to identify gene modules or pathways. Although meta-analyses of gene expression data a...
Main authors: | Okamura, Yasunobu; Kinoshita, Kengo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6048772/ https://www.ncbi.nlm.nih.gov/pubmed/30012088 http://dx.doi.org/10.1186/s12859-018-2279-y |
_version_ | 1783340160271777792 |
---|---|
author | Okamura, Yasunobu Kinoshita, Kengo |
author_sort | Okamura, Yasunobu |
collection | PubMed |
description | BACKGROUND: Data generated by RNA sequencing (RNA-Seq) is now accumulating in vast amounts in public repositories, especially for human and mouse genomes. Reanalyzing these data has emerged as a promising approach to identify gene modules or pathways. Although meta-analyses of gene expression data are frequently performed using microarray data, meta-analyses using RNA-Seq data are still rare. This lag is partly due to the limitations in reanalyzing RNA-Seq data, which requires extensive computational resources. Moreover, it is nearly impossible to calculate the gene expression levels of all samples in a public repository using currently available methods. Here, we propose a novel method, Matataki, for rapidly estimating gene expression levels from RNA-Seq data. RESULTS: The proposed method uses k-mers that are unique to each gene for the mapping of fragments to genes. Since aligning fragments to reference sequences requires high computational costs, our method could reduce the calculation cost by focusing on k-mers that are unique to each gene and by skipping uninformative regions. Indeed, Matataki outperformed conventional methods with regards to speed while demonstrating sufficient accuracy. CONCLUSIONS: The development of Matataki can overcome current limitations in reanalyzing RNA-Seq data toward improving the potential for discovering genes and pathways associated with disease at reduced computational cost. Thus, the main bottleneck of RNA-Seq analyses has shifted to achieving the decompression of sequenced data. The implementation of Matataki is available at https://github.com/informationsea/Matataki. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2279-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6048772 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6048772 2018-07-19 Matataki: an ultrafast mRNA quantification method for large-scale reanalysis of RNA-Seq data Okamura, Yasunobu Kinoshita, Kengo BMC Bioinformatics Research Article BioMed Central 2018-07-16 /pmc/articles/PMC6048772/ /pubmed/30012088 http://dx.doi.org/10.1186/s12859-018-2279-y Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Matataki: an ultrafast mRNA quantification method for large-scale reanalysis of RNA-Seq data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6048772/ https://www.ncbi.nlm.nih.gov/pubmed/30012088 http://dx.doi.org/10.1186/s12859-018-2279-y |