
SEM: sized-based expectation maximization for characterizing nucleosome positions and subtypes


Bibliographic Details
Main Authors: Yang, Jianyu, Yen, Kuangyu, Mahony, Shaun
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10614873/
https://www.ncbi.nlm.nih.gov/pubmed/37904910
http://dx.doi.org/10.1101/2023.10.17.562727
author Yang, Jianyu
Yen, Kuangyu
Mahony, Shaun
collection PubMed
description Genome-wide nucleosome profiles are predominantly characterized using MNase-seq, which involves extensive MNase digestion and size selection to enrich for mono-nucleosome-sized fragments. Most available MNase-seq analysis packages assume that nucleosomes uniformly protect 147 bp DNA fragments. However, some nucleosomes with atypical histone or chemical compositions protect shorter lengths of DNA. The rigid assumptions imposed by current nucleosome analysis packages ignore variation in nucleosome lengths, potentially blinding investigators to regulatory roles played by atypical nucleosomes. To enable the characterization of different nucleosome types from MNase-seq data, we introduce the Size-based Expectation Maximization (SEM) nucleosome calling package. SEM employs a hierarchical Gaussian mixture model to estimate the positions and subtype identity of nucleosomes from MNase-seq fragments. Nucleosome subtypes are automatically identified based on the distribution of protected DNA fragment lengths at nucleosome positions. Benchmark analysis indicates that SEM is on par with existing packages in terms of standard nucleosome-calling accuracy metrics, while uniquely providing the ability to characterize nucleosome subtype identities. Using SEM on a low-dose MNase H2B MNase-ChIP-seq dataset from mouse embryonic stem cells, we identified three nucleosome types: short-fragment nucleosomes, canonical nucleosomes, and di-nucleosomes. The short-fragment nucleosomes can be divided further into two subtypes based on their chromatin accessibility. Interestingly, the subset of short-fragment nucleosomes in accessible regions exhibits high MNase sensitivity and displays distribution patterns around transcription start sites (TSSs) and CTCF peaks, similar to the previously reported “fragile nucleosomes”. These SEM-defined accessible short-fragment nucleosomes are found not just in promoters, but also in enhancers and other regulatory regions. Additional investigations reveal their co-localization with the chromatin remodelers Chd6, Chd8, and Ep400. In summary, SEM provides an effective platform for distinguishing various nucleosome subtypes, paving the way for future exploration of non-standard nucleosomes.
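The abstract describes identifying nucleosome subtypes from the distribution of protected fragment lengths using a Gaussian mixture model. The sketch below illustrates only that general idea with a plain one-dimensional Gaussian mixture fitted by expectation maximization. It is not SEM's implementation: SEM's model is hierarchical and also estimates nucleosome positions, which this sketch omits. The function names, the synthetic fragment lengths, and the initial means (100, 147, 300) are assumptions chosen for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_length_mixture(lengths, init_means, n_iter=100, min_sd=1.0):
    """Fit a 1D Gaussian mixture to fragment lengths with plain EM.

    Hypothetical helper: one component per assumed subtype.
    Returns (weights, means, sds).
    """
    k = len(init_means)
    means = list(init_means)
    sds = [20.0] * k            # assumed starting spread, in bp
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each fragment.
        resp = []
        for x in lengths:
            p = [weights[j] * normal_pdf(x, means[j], sds[j]) for j in range(k)]
            total = sum(p) or 1e-300   # guard against division by zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate each component from its weighted fragments.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue               # leave an empty component unchanged
            means[j] = sum(r[j] * x for r, x in zip(resp, lengths)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, lengths)) / nj
            sds[j] = max(math.sqrt(var), min_sd)
            weights[j] = nj / len(lengths)
    return weights, means, sds


def assign_subtype(length, weights, means, sds):
    """Index of the component with the highest weighted density for a length."""
    scores = [w * normal_pdf(length, m, s) for w, m, s in zip(weights, means, sds)]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Synthetic lengths (assumed values): short, canonical-like, di-nucleosome-like.
    rng = random.Random(0)
    lengths = ([rng.gauss(110, 10) for _ in range(300)]
               + [rng.gauss(150, 8) for _ in range(600)]
               + [rng.gauss(310, 15) for _ in range(100)])
    weights, means, sds = fit_length_mixture(lengths, init_means=[100, 147, 300])
    for w, m, s in zip(weights, means, sds):
        print(f"weight={w:.2f} mean={m:.1f} sd={s:.1f}")
```

In this simplified setting, each fragment is assigned to whichever component gives it the highest weighted density. The abstract indicates that SEM instead works at the level of nucleosome positions, using the fragment length distribution at each position.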
format Online
Article
Text
id pubmed-10614873
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10614873 2023-10-31 SEM: sized-based expectation maximization for characterizing nucleosome positions and subtypes Yang, Jianyu Yen, Kuangyu Mahony, Shaun bioRxiv Article Cold Spring Harbor Laboratory 2023-10-20 /pmc/articles/PMC10614873/ /pubmed/37904910 http://dx.doi.org/10.1101/2023.10.17.562727 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title SEM: sized-based expectation maximization for characterizing nucleosome positions and subtypes
topic Article