
Inferring CpG methylation signatures accumulated along human history from genetic variation catalogs

Understanding the DNA methylation patterns in the human genome is a key step to decipher gene regulatory mechanisms and model mutation rate heterogeneity in the human genome. While methylation rates can be measured e.g. with bisulfite sequencing, such measures do not capture historical patterns. Her...


Bibliographic Details
Main Authors: Si, Yichen; Zöllner, Sebastian
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055312/
https://www.ncbi.nlm.nih.gov/pubmed/36993375
http://dx.doi.org/10.1101/2023.03.24.534151
_version_ 1785015854500413440
author Si, Yichen
Zöllner, Sebastian
author_facet Si, Yichen
Zöllner, Sebastian
author_sort Si, Yichen
collection PubMed
description Understanding the DNA methylation patterns in the human genome is a key step to decipher gene regulatory mechanisms and model mutation rate heterogeneity in the human genome. While methylation rates can be measured, e.g., with bisulfite sequencing, such measures do not capture historical patterns. Here we present a new method, Methylation Hidden Markov Model (MHMM), to estimate the accumulated germline methylation signature in human population history, leveraging two properties: (1) Mutation rates of cytosine to thymine transitions at methylated CG dinucleotides are orders of magnitude higher than those in the rest of the genome. (2) Methylation levels are locally correlated, so the allele frequencies of neighboring CpGs can be used jointly to estimate methylation status. We applied MHMM to allele frequencies from the TOPMed and the gnomAD genetic variation catalogs. Our estimates are consistent with whole genome bisulfite sequencing (WGBS) measured human germ cell methylation levels at 90% of CpG sites, but we also identified ~442,000 historically methylated CpG sites that could not be captured due to sample genetic variation, and inferred methylation status for ~721,000 CpG sites that were missing from WGBS. Hypo-methylated regions identified by combining our results with experimental measures are 1.7 times more likely to recover known active genomic regions than those identified by WGBS alone. Our estimated historical methylation status can be leveraged to enhance bioinformatic analysis of germline methylation, such as annotating regulatory and inactivated genomic regions, and to provide insights into sequence evolution, including predicting mutation constraint.
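The two properties named in the description (elevated C>T mutation at methylated CpGs, and local correlation of methylation) can be illustrated with a toy two-state hidden Markov model. The sketch below is not the authors' MHMM: it reduces allele frequencies to a 0/1 flag for whether a C>T variant is observed at each CpG, and the function name `posterior_methylated` and all parameter values (`p_variant`, `p_stay`, `prior`) are hypothetical choices made only for illustration.

```python
# Toy two-state HMM over consecutive CpG sites (0 = unmethylated, 1 = methylated).
# Assumption: every numeric parameter below is made up for illustration and is
# not taken from the paper; real input would be allele frequencies, not 0/1 flags.

def posterior_methylated(observations, p_variant=(0.05, 0.6),
                         p_stay=0.9, prior=(0.5, 0.5)):
    """Return P(methylated | all observations) for each CpG site.

    observations: list of 0/1, where 1 means a C>T variant is seen at that CpG.
    p_variant:    (P(variant | unmethylated), P(variant | methylated)).
    p_stay:       probability that adjacent CpGs share the same state.
    prior:        initial state probabilities (unmethylated, methylated).
    """
    n = len(observations)
    if n == 0:
        return []
    states = (0, 1)
    trans = [[p_stay, 1.0 - p_stay], [1.0 - p_stay, p_stay]]

    def emit(state, obs):
        p = p_variant[state]
        return p if obs == 1 else 1.0 - p

    def normalize(values):
        total = sum(values)
        return [v / total for v in values]

    # Forward pass, rescaled at each site to avoid numerical underflow.
    forward = [normalize([prior[s] * emit(s, observations[0]) for s in states])]
    for t in range(1, n):
        prev = forward[-1]
        forward.append(normalize([
            emit(s, observations[t]) * sum(prev[r] * trans[r][s] for r in states)
            for s in states
        ]))

    # Backward pass, also rescaled at each site.
    backward = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        nxt = backward[t + 1]
        backward[t] = normalize([
            sum(trans[s][r] * emit(r, observations[t + 1]) * nxt[r] for r in states)
            for s in states
        ])

    # Combine both passes into a per-site posterior for the methylated state.
    return [normalize([f[0] * b[0], f[1] * b[1]])[1]
            for f, b in zip(forward, backward)]


if __name__ == "__main__":
    flags = [1, 1, 0, 1, 1, 0, 0, 0, 0]
    for flag, post in zip(flags, posterior_methylated(flags)):
        print(flag, round(post, 3))
```

In this toy setup a CpG without an observed variant still receives a high methylation posterior when its neighbors carry variants, which is the intuition behind using neighboring CpGs jointly.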
format Online
Article
Text
id pubmed-10055312
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10055312 2023-03-30 Inferring CpG methylation signatures accumulated along human history from genetic variation catalogs Si, Yichen Zöllner, Sebastian bioRxiv Article
Cold Spring Harbor Laboratory 2023-03-25 /pmc/articles/PMC10055312/ /pubmed/36993375 http://dx.doi.org/10.1101/2023.03.24.534151 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
spellingShingle Article
Si, Yichen
Zöllner, Sebastian
Inferring CpG methylation signatures accumulated along human history from genetic variation catalogs
title Inferring CpG methylation signatures accumulated along human history from genetic variation catalogs
title_full Inferring CpG methylation signatures accumulated along human history from genetic variation catalogs
title_fullStr Inferring CpG methylation signatures accumulated along human history from genetic variation catalogs
title_full_unstemmed Inferring CpG methylation signatures accumulated along human history from genetic variation catalogs
title_short Inferring CpG methylation signatures accumulated along human history from genetic variation catalogs
title_sort inferring cpg methylation signatures accumulated along human history from genetic variation catalogs
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055312/
https://www.ncbi.nlm.nih.gov/pubmed/36993375
http://dx.doi.org/10.1101/2023.03.24.534151
work_keys_str_mv AT siyichen inferringcpgmethylationsignaturesaccumulatedalonghumanhistoryfromgeneticvariationcatalogs
AT zollnersebastian inferringcpgmethylationsignaturesaccumulatedalonghumanhistoryfromgeneticvariationcatalogs