
Towards a comprehensive regulatory map of Mammalian Genomes

Genome mapping studies have generated a nearly complete collection of genes for the human genome, but we still lack an equivalently vetted inventory of human regulatory sequences. Cis-regulatory modules (CRMs) play important roles in controlling when, where, and how much a gene is expressed. We deve...


Bibliographic Details
Main authors: Gonçalves, Tássia Mangetti, Stewart, Casey L, Baxley, Samantha D, Xu, Jason, Li, Daofeng, Gabel, Harrison W, Wang, Ting, Avraham, Oshri, Zhao, Guoyan
Format: Online Article Text
Language: English
Published: American Journal Experts 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10571623/
https://www.ncbi.nlm.nih.gov/pubmed/37841836
http://dx.doi.org/10.21203/rs.3.rs-3294408/v1
author Gonçalves, Tássia Mangetti
Stewart, Casey L
Baxley, Samantha D
Xu, Jason
Li, Daofeng
Gabel, Harrison W
Wang, Ting
Avraham, Oshri
Zhao, Guoyan
collection PubMed
description Genome mapping studies have generated a nearly complete collection of genes for the human genome, but we still lack an equivalently vetted inventory of human regulatory sequences. Cis-regulatory modules (CRMs) play important roles in controlling when, where, and how much a gene is expressed. We developed a training data-free CRM-prediction algorithm, the Mammalian Regulatory MOdule Detector (MrMOD), for accurate CRM prediction in mammalian genomes. MrMOD provides genome position-fixed CRM models, similar to the fixed gene models, for the mouse and human genomes using only genomic sequences as inputs, with one adjustable parameter: the significance p-value. Importantly, MrMOD predicts a comprehensive set of high-resolution CRMs in the mouse and human genomes, including all types of regulatory modules and not limited to any tissue, cell type, developmental stage, or condition. We computationally validated MrMOD predictions using a compendium of 21 orthogonal experimental data sets, including thousands of experimentally defined CRMs and millions of putative regulatory elements derived from hundreds of different tissues, cell types, and stimulus conditions obtained from multiple databases. An in ovo transgenic reporter assay demonstrates the power of our predictions in guiding experimental design. We analyzed CRMs located on chromosome 17 using unsupervised machine learning and identified groups of CRMs with multiple lines of evidence supporting their functionality, linking CRMs with upstream binding transcription factors and downstream target genes. Our work provides a comprehensive base-pair-resolution annotation of the functional regulatory elements and non-functional regions in mammalian genomes.
format Online
Article
Text
id pubmed-10571623
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Journal Experts
record_format MEDLINE/PubMed
spelling pubmed-10571623 2023-10-14 Towards a comprehensive regulatory map of Mammalian Genomes Gonçalves, Tássia Mangetti Stewart, Casey L Baxley, Samantha D Xu, Jason Li, Daofeng Gabel, Harrison W Wang, Ting Avraham, Oshri Zhao, Guoyan Res Sq Article
American Journal Experts 2023-09-28 /pmc/articles/PMC10571623/ /pubmed/37841836 http://dx.doi.org/10.21203/rs.3.rs-3294408/v1 Text en License: This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title Towards a comprehensive regulatory map of Mammalian Genomes
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10571623/
https://www.ncbi.nlm.nih.gov/pubmed/37841836
http://dx.doi.org/10.21203/rs.3.rs-3294408/v1