Probe Region Expression Estimation for RNA-Seq Data for Improved Microarray Comparability

Rapidly growing public gene expression databases contain a wealth of data for building an unprecedentedly detailed picture of human biology and disease. This data comes from many diverse measurement platforms that make integrating it all difficult. Although RNA-sequencing (RNA-seq) is attracting the...

Bibliographic Details
Main Authors: Uziela, Karolis; Honkela, Antti
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4429080/
https://www.ncbi.nlm.nih.gov/pubmed/25966034
http://dx.doi.org/10.1371/journal.pone.0126545
author Uziela, Karolis
Honkela, Antti
collection PubMed
description Rapidly growing public gene expression databases contain a wealth of data for building an unprecedentedly detailed picture of human biology and disease. This data comes from many diverse measurement platforms that make integrating it all difficult. Although RNA-sequencing (RNA-seq) is attracting the most attention, at present, the rate of new microarray studies submitted to public databases far exceeds the rate of new RNA-seq studies. There is clearly a need for methods that make it easier to combine data from different technologies. In this paper, we propose a new method for processing RNA-seq data that yields gene expression estimates that are much more similar to corresponding estimates from microarray data, hence greatly improving cross-platform comparability. The method we call PREBS is based on estimating the expression from RNA-seq reads overlapping the microarray probe regions, and processing these estimates with standard microarray summarisation algorithms. Using paired microarray and RNA-seq samples from TCGA LAML data set we show that PREBS expression estimates derived from RNA-seq are more similar to microarray-based expression estimates than those from other RNA-seq processing methods. In an experiment to retrieve paired microarray samples from a database using an RNA-seq query sample, gene signatures defined based on PREBS expression estimates were found to be much more accurate than those from other methods. PREBS also allows new ways of using RNA-seq data, such as expression estimation for microarray probe sets. An implementation of the proposed method is available in the Bioconductor package “prebs.”
format Online
Article
Text
id pubmed-4429080
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4429080 2015-05-21 Probe Region Expression Estimation for RNA-Seq Data for Improved Microarray Comparability Uziela, Karolis Honkela, Antti PLoS One Research Article Public Library of Science 2015-05-12 /pmc/articles/PMC4429080/ /pubmed/25966034 http://dx.doi.org/10.1371/journal.pone.0126545 Text en © 2015 Uziela, Honkela http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Probe Region Expression Estimation for RNA-Seq Data for Improved Microarray Comparability
topic Research Article
work_keys_str_mv AT uzielakarolis proberegionexpressionestimationforrnaseqdataforimprovedmicroarraycomparability
AT honkelaantti proberegionexpressionestimationforrnaseqdataforimprovedmicroarraycomparability
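
The abstract in this record describes PREBS as two steps: estimating expression from the RNA-seq reads that overlap microarray probe regions, then passing those estimates to a standard microarray summarisation algorithm. The Python sketch below illustrates that idea on toy data only. It is not the API of the Bioconductor "prebs" package; the function names, the half-open interval convention, the log2(count + 1) transform, and the plain median used as a stand-in for a real summarisation algorithm are all assumptions made for illustration.

```python
import math
from statistics import median


def count_overlaps(reads, probes):
    """Count reads overlapping each probe region (hypothetical helper).

    reads:  list of (chrom, start, end) tuples, half-open intervals
    probes: dict mapping probe_id -> (chrom, start, end), half-open intervals
    """
    counts = {}
    for probe_id, (p_chrom, p_start, p_end) in probes.items():
        counts[probe_id] = sum(
            1
            for r_chrom, r_start, r_end in reads
            # Two half-open intervals overlap when each starts before the other ends.
            if r_chrom == p_chrom and r_start < p_end and p_start < r_end
        )
    return counts


def summarise(counts, probe_sets):
    """Summarise probe-level values into one value per probe set.

    Assumption: the median of log2(count + 1) stands in for a real
    microarray summarisation algorithm, which this sketch does not implement.
    """
    return {
        set_id: median(math.log2(counts[p] + 1) for p in probe_ids)
        for set_id, probe_ids in probe_sets.items()
    }


# Toy data: three 25-base probe regions forming one hypothetical probe set.
probes = {
    "p1": ("chr1", 100, 125),
    "p2": ("chr1", 200, 225),
    "p3": ("chr1", 300, 325),
}
probe_sets = {"geneA_at": ["p1", "p2", "p3"]}
reads = [
    ("chr1", 90, 140),   # overlaps p1
    ("chr1", 120, 170),  # overlaps p1
    ("chr1", 190, 240),  # overlaps p2
    ("chr2", 100, 150),  # different chromosome, overlaps nothing
    ("chr1", 125, 175),  # starts exactly where p1 ends, so no overlap
]

counts = count_overlaps(reads, probes)
expression = summarise(counts, probe_sets)
print(counts)
print(expression)
```

Because the output is indexed by probe set rather than by gene or transcript model, values produced this way line up directly with the rows of a microarray expression matrix, which is the comparability the abstract aims for.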