Recall DNA methylation levels at low coverage sites using a CNN model in WGBS

DNA methylation is an important regulator of gene transcription. WGBS is the gold-standard approach for quantifying DNA methylation at base-pair resolution, but it requires high sequencing depth. Many CpG sites have insufficient coverage in WGBS data, resulting in inaccurate DNA methylation levels...

Bibliographic Details
Main authors: Luo, Ximei, Wang, Yansu, Zou, Quan, Xu, Lei
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10266633/
https://www.ncbi.nlm.nih.gov/pubmed/37315069
http://dx.doi.org/10.1371/journal.pcbi.1011205
author Luo, Ximei
Wang, Yansu
Zou, Quan
Xu, Lei
collection PubMed
description DNA methylation is an important regulator of gene transcription. WGBS is the gold-standard approach for quantifying DNA methylation at base-pair resolution, but it requires high sequencing depth. Many CpG sites have insufficient coverage in WGBS data, resulting in inaccurate DNA methylation levels at individual sites. Many state-of-the-art computational methods have been proposed to predict the missing values. However, many of them require either other omics datasets or cross-sample data, and most predict only the methylation state rather than the methylation level. In this study, we propose RcWGBS, which imputes missing (or low-coverage) values from the DNA methylation levels on the adjacent sides. Deep learning techniques were employed for accurate prediction. The WGBS datasets of H1-hESC and GM12878 were down-sampled. The average difference between the DNA methylation level predicted by RcWGBS at 12× depth and the level measured at >50× depth is less than 0.03 in H1-hESC cells and less than 0.01 in GM12878 cells. RcWGBS performed better than METHimpute even when the sequencing depth was as low as 12×. Our work will help in processing methylation data of low sequencing depth, allowing researchers to save sequencing costs and improve data utilization through computational methods.
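The evaluation described in the abstract (down-sample the reads, fill in low-coverage sites from their neighbors, compare against high-depth levels) can be sketched as follows. This is only an illustration under stated assumptions: RcWGBS uses a CNN, whereas the imputation step here is a plain neighbor average used as a stand-in, and every function name, the coverage threshold `min_cov`, and the `window` size are hypothetical and not taken from the paper.

```python
import random


def methylation_level(methylated, total):
    """Fraction of reads at a CpG site that report methylation (None if no reads)."""
    if total == 0:
        return None
    return methylated / total


def downsample_site(methylated, total, keep, rng):
    """Keep at most `keep` reads from one site, drawn without replacement."""
    keep = min(keep, total)
    reads = [1] * methylated + [0] * (total - methylated)
    kept = rng.sample(reads, keep)
    return sum(kept), keep


def impute_from_neighbors(levels, coverage, min_cov=5, window=2):
    """Replace low-coverage levels with the mean of well-covered neighbors.

    Assumption: a simple baseline standing in for the CNN in the paper.
    """
    out = list(levels)
    for i, cov in enumerate(coverage):
        if cov >= min_cov:
            continue  # site is trusted as measured
        lo, hi = max(0, i - window), min(len(levels), i + window + 1)
        neighbors = [
            levels[j]
            for j in range(lo, hi)
            if j != i and coverage[j] >= min_cov and levels[j] is not None
        ]
        out[i] = sum(neighbors) / len(neighbors) if neighbors else None
    return out


def mean_abs_difference(predicted, reference):
    """Average |predicted - reference| over sites where both values exist."""
    pairs = [
        (p, r) for p, r in zip(predicted, reference)
        if p is not None and r is not None
    ]
    if not pairs:
        return None
    return sum(abs(p - r) for p, r in pairs) / len(pairs)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical (methylated, total) read counts for five neighboring CpG sites.
    deep = [(45, 50), (40, 50), (43, 50), (50, 50), (35, 50)]
    reference = [methylation_level(m, t) for m, t in deep]
    shallow = [downsample_site(m, t, 12, rng) for m, t in deep]
    levels = [methylation_level(m, t) for m, t in shallow]
    coverage = [t for _, t in shallow]
    imputed = impute_from_neighbors(levels, coverage)
    print(mean_abs_difference(imputed, reference))
```

The comparison metric is the same kind of quantity the abstract reports (average difference between low-depth predictions and high-depth levels); the numbers this toy data produces say nothing about RcWGBS itself.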
format Online
Article
Text
id pubmed-10266633
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10266633 2023-06-15 PLoS Comput Biol Research Article Public Library of Science 2023-06-14 /pmc/articles/PMC10266633/ /pubmed/37315069 http://dx.doi.org/10.1371/journal.pcbi.1011205 Text en © 2023 Luo et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Recall DNA methylation levels at low coverage sites using a CNN model in WGBS
topic Research Article