
Estimation of Gene Expression at Isoform Level from mRNA-Seq Data by Bayesian Hierarchical Modeling

mRNA-Seq is a precise and highly reproducible technique for measuring transcript levels, and it yields sequence information for a transcriptome at single-nucleotide resolution, thus enabling us to determine splice junctions and alternative splicing events with high confidence. Often analysis of mR...


Bibliographic Details
Main Authors: Bhattacharjee, M., Gupta, Ravi, Davuluri, R. V.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2012
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3536024/
https://www.ncbi.nlm.nih.gov/pubmed/23293650
http://dx.doi.org/10.3389/fgene.2012.00239
_version_ 1782254729538043904
author Bhattacharjee, M.
Gupta, Ravi
Davuluri, R. V.
author_sort Bhattacharjee, M.
collection PubMed
description mRNA-Seq is a precise and highly reproducible technique for measuring transcript levels, and it yields sequence information for a transcriptome at single-nucleotide resolution, thus enabling us to determine splice junctions and alternative splicing events with high confidence. Analyses of mRNA-Seq data often do not attempt to quantify expression at the isoform level. In this paper, our objective is to use mRNA-Seq data to infer expression at the isoform level, where the splicing patterns of a gene are assumed to be known. A Bayesian latent-variable modeling framework is proposed, in which the parameterization enables inference at various levels, for example, on the expression variability of an isoform across different conditions. The model parameterization also allows two-sample comparisons, e.g., using a Bayesian t-test; in addition, the simple presence or absence of an isoform can be estimated through the latent variables in the model. We carry out inference on isoform expression under different normalization techniques, since it has recently been shown that one of the most prominent sources of variation in differential calls from mRNA-Seq data is the normalization method used. The statistical framework is developed for multiple isoforms and extends easily to reads mapping to multiple genes; this can be achieved by slight conceptual modifications to the definitions of what we consider a gene and what we consider an exon. Additionally, the proposed framework can be extended, by appropriate modeling of the design matrix, to infer as yet unknown novel transcripts. However, such attempts should be made judiciously, since the input data used in the proposed model do not include reads from splice junctions.
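The description mentions a design matrix that links exons to known isoforms. As a rough illustration of that idea only, the sketch below recovers isoform-level expression from exon-level signal with ordinary least squares. This is not the paper's Bayesian latent-variable model: the function name, the toy gene with three exons and two isoforms, and the additive assumption that exon signal equals the sum of the expression of the isoforms containing that exon are all hypothetical simplifications.

```python
import numpy as np


def estimate_isoform_expression(design, exon_signal):
    """Least-squares point estimate of isoform expression (illustrative only).

    design      : (n_exons, n_isoforms) 0/1 matrix; entry [e, i] is 1 when
                  exon e is part of isoform i (known splicing pattern).
    exon_signal : length-n_exons vector of normalized exon-level read signal.

    Assumption (not from the source): exon signal is the sum of the
    expression levels of all isoforms that contain the exon.
    """
    design = np.asarray(design, dtype=float)
    exon_signal = np.asarray(exon_signal, dtype=float)
    theta, *_ = np.linalg.lstsq(design, exon_signal, rcond=None)
    # Expression cannot be negative; clipping is a crude stand-in for a
    # properly constrained or Bayesian estimate.
    return np.clip(theta, 0.0, None)


# Hypothetical gene: exon 1 is shared, exon 2 is specific to isoform A,
# exon 3 is specific to isoform B.
DESIGN = np.array([
    [1, 1],
    [1, 0],
    [0, 1],
])

true_expression = np.array([2.0, 5.0])   # made-up isoform levels
exon_signal = DESIGN @ true_expression   # noiseless exon-level signal
estimate = estimate_isoform_expression(DESIGN, exon_signal)
print(estimate)
```

With noiseless input and a full-column-rank design matrix, the estimate matches the simulated isoform levels. A hierarchical Bayesian treatment such as the one the description outlines would additionally model noise, presence or absence of each isoform, and variability across conditions.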
format Online
Article
Text
id pubmed-3536024
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-3536024 2013-01-04 Estimation of Gene Expression at Isoform Level from mRNA-Seq Data by Bayesian Hierarchical Modeling Bhattacharjee, M. Gupta, Ravi Davuluri, R. V. Front Genet Genetics Frontiers Media S.A.
2012-11-27 /pmc/articles/PMC3536024/ /pubmed/23293650 http://dx.doi.org/10.3389/fgene.2012.00239 Text en Copyright © 2012 Bhattacharjee, Gupta and Davuluri. http://www.frontiersin.org/licenseagreement This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.
spellingShingle Genetics
Bhattacharjee, M.
Gupta, Ravi
Davuluri, R. V.
Estimation of Gene Expression at Isoform Level from mRNA-Seq Data by Bayesian Hierarchical Modeling
title Estimation of Gene Expression at Isoform Level from mRNA-Seq Data by Bayesian Hierarchical Modeling
title_sort estimation of gene expression at isoform level from mrna-seq data by bayesian hierarchical modeling
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3536024/
https://www.ncbi.nlm.nih.gov/pubmed/23293650
http://dx.doi.org/10.3389/fgene.2012.00239
work_keys_str_mv AT bhattacharjeem estimationofgeneexpressionatisoformlevelfrommrnaseqdatabybayesianhierarchicalmodeling
AT guptaravi estimationofgeneexpressionatisoformlevelfrommrnaseqdatabybayesianhierarchicalmodeling
AT davulurirv estimationofgeneexpressionatisoformlevelfrommrnaseqdatabybayesianhierarchicalmodeling