Normalizing RNA-Sequencing Data by Modeling Hidden Covariates with Prior Knowledge
Transcriptomic assays that measure expression levels are widely used to study the manifestation of environmental or genetic variations in cellular processes. RNA-sequencing in particular has the potential to considerably improve such understanding because of its capacity to assay the entire transcri...
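Main Authors: | Mostafavi, Sara; Battle, Alexis; Zhu, Xiaowei; Urban, Alexander E.; Levinson, Douglas; Montgomery, Stephen B.; Koller, Daphne |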
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3715474/ https://www.ncbi.nlm.nih.gov/pubmed/23874524 http://dx.doi.org/10.1371/journal.pone.0068141 |
_version_ | 1782277462632169472 |
---|---|
author | Mostafavi, Sara Battle, Alexis Zhu, Xiaowei Urban, Alexander E. Levinson, Douglas Montgomery, Stephen B. Koller, Daphne |
author_sort | Mostafavi, Sara |
collection | PubMed |
description | Transcriptomic assays that measure expression levels are widely used to study the manifestation of environmental or genetic variations in cellular processes. RNA-sequencing in particular has the potential to considerably improve such understanding because of its capacity to assay the entire transcriptome, including novel transcriptional events. However, as with earlier expression assays, analysis of RNA-sequencing data requires carefully accounting for factors that may introduce systematic, confounding variability in the expression measurements, resulting in spurious correlations. Here, we consider the problem of modeling and removing the effects of known and hidden confounding factors from RNA-sequencing data. We describe a unified residual framework that encapsulates existing approaches, and using this framework, present a novel method, HCP (Hidden Covariates with Prior). HCP uses a more informed assumption about the confounding factors, and performs as well or better than existing approaches while having a much lower computational cost. Our experiments demonstrate that accounting for known and hidden factors with appropriate models improves the quality of RNA-sequencing data in two very different tasks: detecting genetic variations that are associated with nearby expression variations (cis-eQTLs), and constructing accurate co-expression networks. |
format | Online Article Text |
id | pubmed-3715474 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3715474 2013-07-19 PLoS One Research Article Public Library of Science 2013-07-18 /pmc/articles/PMC3715474/ /pubmed/23874524 http://dx.doi.org/10.1371/journal.pone.0068141 Text en © 2013 Mostafavi et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Normalizing RNA-Sequencing Data by Modeling Hidden Covariates with Prior Knowledge |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3715474/ https://www.ncbi.nlm.nih.gov/pubmed/23874524 http://dx.doi.org/10.1371/journal.pone.0068141 |