
DEIsoM: a hierarchical Bayesian model for identifying differentially expressed isoforms using biological replicates

MOTIVATION: High-throughput mRNA sequencing (RNA-Seq) is a powerful tool for quantifying gene expression. Identification of transcript isoforms that are differentially expressed in different conditions, such as in patients and healthy subjects, can provide insights into the molecular basis of diseas...


Bibliographic Details
Main authors: Peng, Hao; Yang, Yifan; Zhe, Shandian; Wang, Jian; Gribskov, Michael; Qi, Yuan
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870796/
https://www.ncbi.nlm.nih.gov/pubmed/28595376
http://dx.doi.org/10.1093/bioinformatics/btx357
author Peng, Hao
Yang, Yifan
Zhe, Shandian
Wang, Jian
Gribskov, Michael
Qi, Yuan
author_sort Peng, Hao
collection PubMed
description MOTIVATION: High-throughput mRNA sequencing (RNA-Seq) is a powerful tool for quantifying gene expression. Identification of transcript isoforms that are differentially expressed between conditions, such as between patients and healthy subjects, can provide insights into the molecular basis of diseases. Current transcript quantification approaches, however, do not take advantage of the shared information in the biological replicates, potentially decreasing sensitivity and accuracy. RESULTS: We present a novel hierarchical Bayesian model called Differentially Expressed Isoform detection from Multiple biological replicates (DEIsoM) for identifying differentially expressed (DE) isoforms from multiple biological replicates representing two conditions, e.g. multiple samples from healthy and diseased subjects. DEIsoM first estimates isoform expression within each condition by (1) capturing common patterns from sample replicates while allowing individual differences, and (2) modeling the uncertainty introduced by ambiguous read mapping in each replicate. Specifically, we introduce a Dirichlet prior distribution to capture the common expression pattern of replicates from the same condition, and treat the isoform expression of individual replicates as samples from this distribution. Ambiguous read mapping is modeled as a multinomial distribution, and ambiguous reads are assigned to the most probable isoform in each replicate. Additionally, DEIsoM couples efficient variational inference with a post-analysis method to improve the accuracy and speed of identification of DE isoforms over alternative methods. Application of DEIsoM to a hepatocellular carcinoma (HCC) dataset identifies biologically relevant DE isoforms. The relevance of these genes/isoforms to HCC is supported by principal component analysis (PCA), read coverage visualization, and the biological literature.
AVAILABILITY AND IMPLEMENTATION: The software is available at https://github.com/hao-peng/DEIsoM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
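The generative structure described in the abstract (a shared Dirichlet prior per condition, per-replicate isoform proportions drawn from it, and reads assigned to isoforms by a multinomial draw) can be illustrated with a minimal simulation. This is only a sketch under stated assumptions, not the DEIsoM implementation: the names `sample_dirichlet`, `simulate_condition`, `alpha`, and `psi` are hypothetical, and the sketch omits read-to-isoform compatibility, variational inference, and the post-analysis step.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_condition(alpha, n_replicates, n_reads, seed=0):
    """Simulate replicates of one condition that share a Dirichlet prior (assumed toy model).

    alpha        -- hypothetical Dirichlet parameters, one per isoform (all > 0)
    n_replicates -- number of biological replicates in the condition
    n_reads      -- number of reads simulated per replicate
    """
    rng = random.Random(seed)
    n_isoforms = len(alpha)
    replicates = []
    for _ in range(n_replicates):
        # Each replicate gets its own isoform proportions, drawn from the shared prior.
        psi = sample_dirichlet(alpha, rng)
        # Each read is assigned to one isoform with probabilities psi
        # (a multinomial draw with a single trial per read).
        assignments = rng.choices(range(n_isoforms), weights=psi, k=n_reads)
        counts = [assignments.count(i) for i in range(n_isoforms)]
        replicates.append({"psi": psi, "counts": counts})
    return replicates


if __name__ == "__main__":
    for rep in simulate_condition([5.0, 3.0, 2.0], n_replicates=3, n_reads=100):
        print([round(p, 3) for p in rep["psi"]], rep["counts"])
```

Larger values in `alpha` make the replicates resemble each other more closely, while smaller values allow more individual variation around the shared pattern.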
format Online
Article
Text
id pubmed-5870796
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5870796 2018-03-29 DEIsoM: a hierarchical Bayesian model for identifying differentially expressed isoforms using biological replicates Peng, Hao Yang, Yifan Zhe, Shandian Wang, Jian Gribskov, Michael Qi, Yuan Bioinformatics Original Papers MOTIVATION: High-throughput mRNA sequencing (RNA-Seq) is a powerful tool for quantifying gene expression. Identification of transcript isoforms that are differentially expressed between conditions, such as between patients and healthy subjects, can provide insights into the molecular basis of diseases. Current transcript quantification approaches, however, do not take advantage of the shared information in the biological replicates, potentially decreasing sensitivity and accuracy. RESULTS: We present a novel hierarchical Bayesian model called Differentially Expressed Isoform detection from Multiple biological replicates (DEIsoM) for identifying differentially expressed (DE) isoforms from multiple biological replicates representing two conditions, e.g. multiple samples from healthy and diseased subjects. DEIsoM first estimates isoform expression within each condition by (1) capturing common patterns from sample replicates while allowing individual differences, and (2) modeling the uncertainty introduced by ambiguous read mapping in each replicate. Specifically, we introduce a Dirichlet prior distribution to capture the common expression pattern of replicates from the same condition, and treat the isoform expression of individual replicates as samples from this distribution. Ambiguous read mapping is modeled as a multinomial distribution, and ambiguous reads are assigned to the most probable isoform in each replicate. Additionally, DEIsoM couples efficient variational inference with a post-analysis method to improve the accuracy and speed of identification of DE isoforms over alternative methods. Application of DEIsoM to a hepatocellular carcinoma (HCC) dataset identifies biologically relevant DE isoforms.
The relevance of these genes/isoforms to HCC is supported by principal component analysis (PCA), read coverage visualization, and the biological literature. AVAILABILITY AND IMPLEMENTATION: The software is available at https://github.com/hao-peng/DEIsoM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2017-10-01 2017-06-08 /pmc/articles/PMC5870796/ /pubmed/28595376 http://dx.doi.org/10.1093/bioinformatics/btx357 Text en © The Author 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title DEIsoM: a hierarchical Bayesian model for identifying differentially expressed isoforms using biological replicates
topic Original Papers