A Bayesian Assignment Method for Ambiguous Bisulfite Short Reads

Bibliographic Details
Main authors: Tran, Hong; Wu, Xiaowei; Tithi, Saima; Sun, Ming-an; Xie, Hehuang; Zhang, Liqing
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4806927/
https://www.ncbi.nlm.nih.gov/pubmed/27011215
http://dx.doi.org/10.1371/journal.pone.0151826
author Tran, Hong
Wu, Xiaowei
Tithi, Saima
Sun, Ming-an
Xie, Hehuang
Zhang, Liqing
collection PubMed
description DNA methylation is an epigenetic modification critical to normal development and disease. Genome-wide DNA methylation can be determined at single-nucleotide resolution by sequencing bisulfite-treated DNA with next-generation high-throughput sequencing. However, aligning bisulfite short reads to a reference genome remains challenging, as only a limited proportion of them (around 50–70%) can be aligned uniquely; a significant proportion, known as multireads, map to multiple locations and are therefore discarded from downstream analyses, causing financial waste and biased methylation inference. To address this issue, we develop a Bayesian model that assigns multireads to their most likely locations based on the posterior probability derived from information hidden in uniquely aligned reads. Analyses of both simulated data and real hairpin bisulfite sequencing data show that our method can effectively assign approximately 70% of the multireads to their best locations with up to 90% accuracy, leading to a significant increase in the overall mapping efficiency. Moreover, the assignment model performs robustly at low coverage depth, making it particularly attractive given the prohibitive cost of bisulfite sequencing. Results also show that longer reads improve the performance of the assignment model, and that the model is robust to varying degrees of methylation and varying sequencing error rates. Finally, incorporating prior knowledge of mutation rate and context-specific methylation level into the assignment model increases inference accuracy. The assignment model is implemented in the BAM-ABS package, which is freely available at https://github.com/zhanglabvt/BAM_ABS.
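The abstract does not give the model's equations, so the following is only a minimal illustrative sketch of the general idea it describes: score each candidate location of a multiread by a posterior probability and assign the read to the best one when that posterior is high enough. It is not the BAM-ABS implementation. The function names, the per-base error model, the use of per-cytosine methylation levels (imagined here as estimated from uniquely aligned reads), the forward-strand-only treatment, and the 0.9 threshold are all assumptions made for illustration.

```python
import math

def base_prob(ref_base, read_base, meth, err):
    """P(read_base | ref_base) for a forward-strand bisulfite read (hypothetical model).

    A reference C stays C with probability `meth` (methylated) and is
    converted to T otherwise; any true base is misread with probability `err`.
    """
    if ref_base == "C":
        true_probs = {"C": meth, "T": 1.0 - meth}
    else:
        true_probs = {ref_base: 1.0}
    p = 0.0
    for true_base, weight in true_probs.items():
        p += weight * ((1.0 - err) if read_base == true_base else err / 3.0)
    return p

def log_likelihood(read, ref_segment, meth_levels, err):
    """Sum of per-base log probabilities; requires err > 0 to avoid log(0)."""
    return sum(
        math.log(base_prob(ref_base, read_base, meth, err))
        for ref_base, read_base, meth in zip(ref_segment, read, meth_levels)
    )

def assign_multiread(read, candidates, err=0.01, threshold=0.9):
    """Return (assigned_name_or_None, posteriors).

    `candidates` is a list of (name, ref_segment, meth_levels, prior) tuples.
    `meth_levels` holds one assumed methylation level per reference position
    (ignored at non-C positions).
    """
    log_scores = [
        math.log(prior) + log_likelihood(read, ref, meth, err)
        for _, ref, meth, prior in candidates
    ]
    top = max(log_scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    posteriors = [w / total for w in weights]
    best = max(range(len(candidates)), key=posteriors.__getitem__)
    name = candidates[best][0] if posteriors[best] >= threshold else None
    return name, posteriors

# Toy example: two identical reference segments that differ only in the
# methylation levels assumed at their two cytosines.
candidates = [
    ("locA", "CCGA", [0.1, 0.1, 0.0, 0.0], 0.5),  # mostly unmethylated
    ("locB", "CCGA", [0.9, 0.9, 0.0, 0.0], 0.5),  # mostly methylated
]
assigned, posteriors = assign_multiread("TTGA", candidates)
print(assigned, [round(p, 3) for p in posteriors])
```

In this toy case the read shows T at both cytosines, which is far more likely at the weakly methylated location, so the read is assigned there. When the candidates are indistinguishable, no posterior reaches the threshold and the read is left unassigned.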
format Online
Article
Text
id pubmed-4806927
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4806927 2016-03-25 A Bayesian Assignment Method for Ambiguous Bisulfite Short Reads Tran, Hong Wu, Xiaowei Tithi, Saima Sun, Ming-an Xie, Hehuang Zhang, Liqing PLoS One Research Article
Public Library of Science 2016-03-24 /pmc/articles/PMC4806927/ /pubmed/27011215 http://dx.doi.org/10.1371/journal.pone.0151826 Text en © 2016 Tran et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A Bayesian Assignment Method for Ambiguous Bisulfite Short Reads
topic Research Article