
An effective method to resolve ambiguous bisulfite-treated reads

BACKGROUND: The combination of the bisulfite treatment and the next-generation sequencing is an important method for methylation analysis, and aligning the bisulfite-treated reads (BS-reads) is the critical step for the downstream applications. As bisulfite treatment reduces the complexity of the se...

Bibliographic Details
Main authors: Liu, Mengya, Xu, Yun
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8161933/
https://www.ncbi.nlm.nih.gov/pubmed/34044763
http://dx.doi.org/10.1186/s12859-021-04204-6
_version_ 1783700611139633152
author Liu, Mengya
Xu, Yun
author_facet Liu, Mengya
Xu, Yun
author_sort Liu, Mengya
collection PubMed
description BACKGROUND: The combination of bisulfite treatment and next-generation sequencing is an important method for methylation analysis, and aligning the bisulfite-treated reads (BS-reads) is the critical step for downstream applications. Because bisulfite treatment reduces the complexity of the sequences, a large portion of BS-reads may align ambiguously to multiple locations of the reference genome; such reads are called multireads. These multireads cannot be used in downstream applications because they are likely to introduce artifacts. To identify the best mapping location of each multiread, existing Bayesian-based methods calculate the probability of the read at each position by considering how it overlaps with uniquely mapped reads. However, [Formula: see text] % of multireads do not overlap with any unique reads and therefore cannot be resolved by existing methods. RESULTS: Here we propose a novel method (EM-MUL) that not only rescues multireads that overlap with unique reads, but also uses the overall coverage and accurate base-level alignment to resolve multireads that cannot be handled by current methods. We benchmark our method on both simulated and real datasets. Experimental results show that it is able to align more than 80% of multireads to the best mapping position with very high accuracy. CONCLUSIONS: EM-MUL is an effective method designed to accurately determine the best mapping position of multireads in BS-reads. For downstream applications, it helps improve methylation resolution in repetitive regions of the genome. EM-MUL is freely available at https://github.com/lmylynn/EM-MUL.
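The following toy sketch illustrates two ideas from the abstract above: why in-silico bisulfite conversion (C read as T) can turn a uniquely mapping read into a multiread, and how unique-read coverage might be used to weight candidate positions. It is not the EM-MUL implementation. The function names, the toy reference, the coverage counts, and the pseudocount weighting are all assumptions made for illustration, and only the forward strand with full conversion is modeled.

```python
def bisulfite_convert(seq: str) -> str:
    """Toy full conversion on the forward strand: every C is read as T."""
    return seq.upper().replace("C", "T")


def find_all(text: str, pattern: str) -> list[int]:
    """Return every start offset (overlaps allowed) where pattern occurs in text."""
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def posterior_by_coverage(candidates, unique_coverage, pseudocount=1.0):
    """Hypothetical weighting: normalize (unique-read coverage + pseudocount)
    over the candidate positions of one multiread."""
    weights = [unique_coverage.get(pos, 0) + pseudocount for pos in candidates]
    total = sum(weights)
    return {pos: w / total for pos, w in zip(candidates, weights)}


# Toy reference and read (assumed data, not from the article).
reference = "GGACGTAGGATGTAGG"
read = "ACGTA"

# Before conversion the read has exactly one exact match.
hits_plain = find_all(reference, read)

# After conversion, ACGTA and ATGTA both become ATGTA, so the read is ambiguous.
hits_converted = find_all(bisulfite_convert(reference), bisulfite_convert(read))

# Assumed counts of uniquely mapped reads overlapping each candidate position.
unique_coverage = {2: 3, 9: 0}
posterior = posterior_by_coverage(hits_converted, unique_coverage)
best_position = max(posterior, key=posterior.get)

print(hits_plain, hits_converted, posterior, best_position)
```

With a pseudocount, a candidate with no overlapping unique reads still receives a small non-zero weight. When no candidate has unique-read support, this weighting is uniform and cannot pick a position, which is the case the abstract says existing methods cannot resolve.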
format Online
Article
Text
id pubmed-8161933
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8161933 2021-06-01 An effective method to resolve ambiguous bisulfite-treated reads Liu, Mengya Xu, Yun BMC Bioinformatics Methodology Article
BioMed Central 2021-05-27 /pmc/articles/PMC8161933/ /pubmed/34044763 http://dx.doi.org/10.1186/s12859-021-04204-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology Article
Liu, Mengya
Xu, Yun
An effective method to resolve ambiguous bisulfite-treated reads
title An effective method to resolve ambiguous bisulfite-treated reads
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8161933/
https://www.ncbi.nlm.nih.gov/pubmed/34044763
http://dx.doi.org/10.1186/s12859-021-04204-6