
Construction and validation of customized genomes for human and mouse ribosomal DNA mapping

Bibliographic Details
Main Authors: George, Subin S.; Pimkin, Maxim; Paralkar, Vikram R.
Format: Online Article Text
Language: English
Published: American Society for Biochemistry and Molecular Biology 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10245113/
https://www.ncbi.nlm.nih.gov/pubmed/37121547
http://dx.doi.org/10.1016/j.jbc.2023.104766
Description
Summary: rRNAs are transcribed from ribosomal DNA (rDNA) repeats, the most intensively transcribed loci in the genome. Due to their repetitive nature, there is a lack of genome assemblies suitable for rDNA mapping, creating a vacuum in our understanding of how the most abundant RNA in the cell is regulated. Our recent work revealed binding of numerous mammalian transcription and chromatin factors to rDNA. Several of these factors were known to play critical roles in development, tissue function, and malignancy, but their potential roles in rDNA regulation remained unexplored. This demonstrated the blind spot into which rDNA has fallen in genetic and epigenetic studies and highlighted an unmet need for public rDNA-optimized genome assemblies. Here, we customized five human and mouse assemblies, namely hg19 (GRCh37), hg38 (GRCh38), hs1 (T2T-CHM13), mm10 (GRCm38), and mm39 (GRCm39), to render them suitable for rDNA mapping. The standard builds of these genomes contain numerous fragmented or repetitive rDNA loci. We identified and masked all rDNA-like regions, added a single rDNA reference sequence of the appropriate species as a ∼45 kb chromosome designated “chromosome R,” and created annotation files to aid visualization of rDNA features in browser tracks. We validated these customized genomes for mapping of known rDNA-binding proteins and present a simple workflow for mapping chromatin immunoprecipitation-sequencing datasets. Customized genome assemblies, annotation files, positive and negative control tracks, and Snapgene files of standard rDNA reference sequences have been deposited to GitHub. These resources make rDNA mapping and visualization more readily accessible to a broad audience.
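
The customization step described in the summary (mask rDNA-like regions, then add one rDNA reference as "chromosome R") can be illustrated with a toy sketch. This is not the authors' pipeline, which the summary does not specify. The function name `mask_and_add_chr_r`, the sequence name `chrR`, the use of `N` as the masking character, and the BED-style 0-based half-open coordinates are all assumptions made for illustration, and the sequences are tiny placeholders rather than real genome data.

```python
def mask_and_add_chr_r(genome, mask_intervals, rdna_seq, rdna_name="chrR"):
    """Return a copy of `genome` with intervals masked and an rDNA contig added.

    genome:         dict mapping sequence name -> sequence string
    mask_intervals: iterable of (name, start, end), 0-based half-open (assumed)
    rdna_seq:       the single rDNA reference sequence to append
    rdna_name:      hypothetical name for the added contig
    """
    if rdna_name in genome:
        raise ValueError(f"{rdna_name} already exists in the genome")

    customized = {}
    for name, seq in genome.items():
        bases = list(seq)
        for chrom, start, end in mask_intervals:
            if chrom != name:
                continue
            # Clamp the interval to the sequence bounds before masking.
            start = max(0, start)
            end = min(len(bases), end)
            if end > start:
                # Assumption: masked bases are replaced with N (hard masking).
                bases[start:end] = ["N"] * (end - start)
        customized[name] = "".join(bases)

    # Add the single rDNA reference as its own contig.
    customized[rdna_name] = rdna_seq
    return customized


# Toy usage with placeholder sequences.
toy_genome = {"chr1": "ACGTACGTAC", "chr2": "TTTTGGGG"}
toy_masks = [("chr1", 2, 6), ("chr2", 6, 100)]
result = mask_and_add_chr_r(toy_genome, toy_masks, "GGCC")
```

Masking before adding the reference matters for the stated goal: reads derived from rDNA then have only one place to align, instead of being split across the fragmented rDNA-like loci in the standard build. A real genome would be processed with streaming FASTA tools rather than in-memory strings.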