Recall DNA methylation levels at low coverage sites using a CNN model in WGBS
DNA methylation is an important regulator of gene transcription. Whole-genome bisulfite sequencing (WGBS) is the gold-standard approach for base-pair-resolution quantification of DNA methylation, but it requires high sequencing depth. Many CpG sites in WGBS data have insufficient coverage, which results in inaccurate DNA methylation levels...
Main authors: | Luo, Ximei; Wang, Yansu; Zou, Quan; Xu, Lei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10266633/ https://www.ncbi.nlm.nih.gov/pubmed/37315069 http://dx.doi.org/10.1371/journal.pcbi.1011205 |
_version_ | 1785058779914567680 |
---|---|
author | Luo, Ximei Wang, Yansu Zou, Quan Xu, Lei |
author_facet | Luo, Ximei Wang, Yansu Zou, Quan Xu, Lei |
author_sort | Luo, Ximei |
collection | PubMed |
description | DNA methylation is an important regulator of gene transcription. Whole-genome bisulfite sequencing (WGBS) is the gold-standard approach for base-pair-resolution quantification of DNA methylation, but it requires high sequencing depth. Many CpG sites in WGBS data have insufficient coverage, which results in inaccurate DNA methylation levels at individual sites. Many state-of-the-art computational methods have been proposed to predict the missing values. However, many of them require either other omics datasets or cross-sample data, and most predict only the state of DNA methylation. In this study, we propose RcWGBS, which imputes missing (or low-coverage) values from the DNA methylation levels of adjacent sites, using deep learning techniques for accurate prediction. The WGBS datasets of H1-hESC and GM12878 were down-sampled. The average difference between the DNA methylation level predicted by RcWGBS at 12× depth and that at >50× depth is less than 0.03 in H1-hESC cells and less than 0.01 in GM12878 cells. RcWGBS performed better than METHimpute even when the sequencing depth was as low as 12×. Our work helps to process methylation data of low sequencing depth, allowing researchers to save sequencing costs and improve data utilization through computational methods. |
format | Online Article Text |
id | pubmed-10266633 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10266633 2023-06-15 Recall DNA methylation levels at low coverage sites using a CNN model in WGBS Luo, Ximei Wang, Yansu Zou, Quan Xu, Lei PLoS Comput Biol Research Article DNA methylation is an important regulator of gene transcription. Whole-genome bisulfite sequencing (WGBS) is the gold-standard approach for base-pair-resolution quantification of DNA methylation, but it requires high sequencing depth. Many CpG sites in WGBS data have insufficient coverage, which results in inaccurate DNA methylation levels at individual sites. Many state-of-the-art computational methods have been proposed to predict the missing values. However, many of them require either other omics datasets or cross-sample data, and most predict only the state of DNA methylation. In this study, we propose RcWGBS, which imputes missing (or low-coverage) values from the DNA methylation levels of adjacent sites, using deep learning techniques for accurate prediction. The WGBS datasets of H1-hESC and GM12878 were down-sampled. The average difference between the DNA methylation level predicted by RcWGBS at 12× depth and that at >50× depth is less than 0.03 in H1-hESC cells and less than 0.01 in GM12878 cells. RcWGBS performed better than METHimpute even when the sequencing depth was as low as 12×. Our work helps to process methylation data of low sequencing depth, allowing researchers to save sequencing costs and improve data utilization through computational methods. Public Library of Science 2023-06-14 /pmc/articles/PMC10266633/ /pubmed/37315069 http://dx.doi.org/10.1371/journal.pcbi.1011205 Text en © 2023 Luo et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Luo, Ximei Wang, Yansu Zou, Quan Xu, Lei Recall DNA methylation levels at low coverage sites using a CNN model in WGBS |
title | Recall DNA methylation levels at low coverage sites using a CNN model in WGBS |
title_full | Recall DNA methylation levels at low coverage sites using a CNN model in WGBS |
title_fullStr | Recall DNA methylation levels at low coverage sites using a CNN model in WGBS |
title_full_unstemmed | Recall DNA methylation levels at low coverage sites using a CNN model in WGBS |
title_short | Recall DNA methylation levels at low coverage sites using a CNN model in WGBS |
title_sort | recall dna methylation levels at low coverage sites using a cnn model in wgbs |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10266633/ https://www.ncbi.nlm.nih.gov/pubmed/37315069 http://dx.doi.org/10.1371/journal.pcbi.1011205 |
work_keys_str_mv | AT luoximei recalldnamethylationlevelsatlowcoveragesitesusingacnnmodelinwgbs AT wangyansu recalldnamethylationlevelsatlowcoveragesitesusingacnnmodelinwgbs AT zouquan recalldnamethylationlevelsatlowcoveragesitesusingacnnmodelinwgbs AT xulei recalldnamethylationlevelsatlowcoveragesitesusingacnnmodelinwgbs |
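
The description field in this record outlines the idea of recovering methylation levels at low-coverage CpG sites from neighbouring sites and then comparing the result with high-depth data. The following Python sketch illustrates that idea only in the simplest possible form. It is not RcWGBS: the neighbour average is a naive stand-in for the CNN described in the abstract, and the function names, the `min_depth` threshold, the `window` size, and the use of a mean absolute difference as the comparison metric are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def methylation_level(methylated_reads: int, total_reads: int) -> Optional[float]:
    """Fraction of reads that support methylation at one CpG site.

    Returns None when the site has no coverage at all.
    """
    if total_reads <= 0:
        return None
    return methylated_reads / total_reads


def impute_low_coverage(
    levels: Sequence[Optional[float]],
    depths: Sequence[int],
    min_depth: int = 5,   # hypothetical coverage threshold
    window: int = 2,      # hypothetical number of sites on each side
) -> List[Optional[float]]:
    """Replace the level at every site with depth < min_depth by the mean
    level of well-covered neighbours within `window` sites on either side.

    This averaging is a naive placeholder for a learned model.
    """
    imputed: List[Optional[float]] = list(levels)
    for i, depth in enumerate(depths):
        if depth >= min_depth:
            continue  # site is trusted as measured
        start, stop = max(0, i - window), min(len(levels), i + window + 1)
        neighbours = [
            levels[j]
            for j in range(start, stop)
            if j != i and depths[j] >= min_depth and levels[j] is not None
        ]
        imputed[i] = sum(neighbours) / len(neighbours) if neighbours else None
    return imputed


def mean_absolute_difference(
    predicted: Sequence[Optional[float]],
    reference: Sequence[Optional[float]],
) -> float:
    """Average |predicted - reference| over sites where both values exist."""
    pairs = [
        (p, r) for p, r in zip(predicted, reference)
        if p is not None and r is not None
    ]
    if not pairs:
        raise ValueError("no sites with both a predicted and a reference value")
    return sum(abs(p - r) for p, r in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy read counts (methylated, total) for five adjacent CpG sites.
    counts = [(9, 10), (8, 10), (1, 2), (7, 10), (6, 10)]
    depths = [total for _, total in counts]
    levels = [methylation_level(m, t) for m, t in counts]
    imputed = impute_low_coverage(levels, depths)
    print(imputed)
```

In this toy input only the third site falls below the assumed threshold, so it is the only value that changes. A real evaluation along the lines of the abstract would compare predictions made from down-sampled data against levels measured at much higher depth.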