A novel algorithm for calling mRNA m(6)A peaks by modeling biological variances in MeRIP-seq data

Motivation: N(6)-methyl-adenosine (m(6)A) is the most prevalent mRNA methylation, and precise prediction of its location on mRNA is important for understanding its function. A recent sequencing technology, Methylated RNA Immunoprecipitation Sequencing (MeRIP-seq), has been developed...

Bibliographic Details
Main authors: Cui, Xiaodong; Meng, Jia; Zhang, Shaowu; Chen, Yidong; Huang, Yufei
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4908365/
https://www.ncbi.nlm.nih.gov/pubmed/27307641
http://dx.doi.org/10.1093/bioinformatics/btw281
_version_ 1782437668889559040
author Cui, Xiaodong
Meng, Jia
Zhang, Shaowu
Chen, Yidong
Huang, Yufei
author_sort Cui, Xiaodong
collection PubMed
description Motivation: N(6)-methyl-adenosine (m(6)A) is the most prevalent mRNA methylation, and precise prediction of its location on mRNA is important for understanding its function. A recent sequencing technology, Methylated RNA Immunoprecipitation Sequencing (MeRIP-seq), has been developed for transcriptome-wide profiling of m(6)A. We previously developed a peak calling algorithm called exomePeak. However, exomePeak over-simplifies the data characteristics and ignores both the variance of reads among replicates and the dependency of reads across a site region. To further improve performance, a new model is needed to address these important issues of MeRIP-seq data. Results: We propose a novel, graphical model-based peak calling method, MeTPeak, for transcriptome-wide detection of m(6)A sites from MeRIP-seq data. MeTPeak explicitly models the read count of an m(6)A site, introducing a hierarchical layer of Beta variables to capture the variances and a Hidden Markov model to characterize the dependency of reads across a site. In addition, we developed a constrained Newton’s method and designed a log-barrier function to compute the analytically intractable, positively constrained Beta parameters. We applied our algorithm to simulated and real biological datasets and demonstrated significant improvement in detection performance and robustness over exomePeak. Prediction results on publicly available MeRIP-seq datasets were also validated and shown to recapitulate the known patterns of m(6)A, further confirming the improved performance of MeTPeak. Availability and implementation: The package ‘MeTPeak’ is implemented in R and C++, and additional details are available at https://github.com/compgenomics/MeTPeak Contact: yufei.huang@utsa.edu or xdchoi@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-4908365
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4908365 2016-06-17 A novel algorithm for calling mRNA m(6)A peaks by modeling biological variances in MeRIP-seq data. Cui, Xiaodong; Meng, Jia; Zhang, Shaowu; Chen, Yidong; Huang, Yufei. Bioinformatics, Ismb 2016 Proceedings, July 8 to July 12, 2016, Orlando, Florida. Oxford University Press 2016-06-15 2016-06-11 /pmc/articles/PMC4908365/ /pubmed/27307641 http://dx.doi.org/10.1093/bioinformatics/btw281 Text en © The Author 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title A novel algorithm for calling mRNA m(6)A peaks by modeling biological variances in MeRIP-seq data
topic Ismb 2016 Proceedings July 8 to July 12, 2016, Orlando, Florida
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4908365/
https://www.ncbi.nlm.nih.gov/pubmed/27307641
http://dx.doi.org/10.1093/bioinformatics/btw281