Cargando…
Towards a comprehensive regulatory map of Mammalian Genomes
Genome mapping studies have generated a nearly complete collection of genes for the human genome, but we still lack an equivalently vetted inventory of human regulatory sequences. Cis-regulatory modules (CRMs) play important roles in controlling when, where, and how much a gene is expressed. We deve...
Autores principales: | , , , , , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
American Journal Experts
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10571623/ https://www.ncbi.nlm.nih.gov/pubmed/37841836 http://dx.doi.org/10.21203/rs.3.rs-3294408/v1 |
Sumario: | Genome mapping studies have generated a nearly complete collection of genes for the human genome, but we still lack an equivalently vetted inventory of human regulatory sequences. Cis-regulatory modules (CRMs) play important roles in controlling when, where, and how much a gene is expressed. We developed a training data-free CRM-prediction algorithm, the Mammalian Regulatory MOdule Detector (MrMOD) for accurate CRM prediction in mammalian genomes. MrMOD provides genome position-fixed CRM models similar to the fixed gene models for the mouse and human genomes using only genomic sequences as the inputs with one adjustable parameter – the significance p-value. Importantly, MrMOD predicts a comprehensive set of high-resolution CRMs in the mouse and human genomes including all types of regulatory modules not limited to any tissue, cell type, developmental stage, or condition. We computationally validated MrMOD predictions used a compendium of 21 orthogonal experimental data sets including thousands of experimentally defined CRMs and millions of putative regulatory elements derived from hundreds of different tissues, cell types, and stimulus conditions obtained from multiple databases. In ovo transgenic reporter assay demonstrates the power of our prediction in guiding experimental design. We analyzed CRMs located in the chromosome 17 using unsupervised machine learning and identified groups of CRMs with multiple lines of evidence supporting their functionality, linking CRMs with upstream binding transcription factors and downstream target genes. Our work provides a comprehensive base pair resolution annotation of the functional regulatory elements and non-functional regions in the mammalian genomes. |
---|