Inferring mammalian tissue-specific regulatory conservation by predicting tissue-specific differences in open chromatin
BACKGROUND: Evolutionary conservation is an invaluable tool for inferring functional significance in the genome, including regions that are crucial across many species and those that have undergone convergent evolution. Computational methods to test for sequence conservation are dominated by algorit...
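Main authors: | Kaplow, Irene M., Schäffer, Daniel E., Wirthlin, Morgan E., Lawler, Alyssa J., Brown, Ashley R., Kleyman, Michael, Pfenning, Andreas R. |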
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8996547/ https://www.ncbi.nlm.nih.gov/pubmed/35410163 http://dx.doi.org/10.1186/s12864-022-08450-7 |
_version_ | 1784684511053742080 |
---|---|
author | Kaplow, Irene M. Schäffer, Daniel E. Wirthlin, Morgan E. Lawler, Alyssa J. Brown, Ashley R. Kleyman, Michael Pfenning, Andreas R. |
author_sort | Kaplow, Irene M. |
collection | PubMed |
description | BACKGROUND: Evolutionary conservation is an invaluable tool for inferring functional significance in the genome, including regions that are crucial across many species and those that have undergone convergent evolution. Computational methods to test for sequence conservation are dominated by algorithms that examine the ability of one or more nucleotides to align across large evolutionary distances. While these nucleotide alignment-based approaches have proven powerful for protein-coding genes and some non-coding elements, they fail to capture conservation of many enhancers, distal regulatory elements that control spatial and temporal patterns of gene expression. The function of enhancers is governed by a complex, often tissue- and cell type-specific code that links combinations of transcription factor binding sites and other regulation-related sequence patterns to regulatory activity. Thus, function of orthologous enhancer regions can be conserved across large evolutionary distances, even when nucleotide turnover is high. RESULTS: We present a new machine learning-based approach for evaluating enhancer conservation that leverages the combinatorial sequence code of enhancer activity rather than relying on the alignment of individual nucleotides. We first train a convolutional neural network model that can predict tissue-specific open chromatin, a proxy for enhancer activity, across mammals. Next, we apply that model to distinguish instances where the genome sequence would predict conserved function versus a loss of regulatory activity in that tissue. We present criteria for systematically evaluating model performance for this task and use them to demonstrate that our models accurately predict tissue-specific conservation and divergence in open chromatin between primate and rodent species, vastly out-performing leading nucleotide alignment-based approaches. 
We then apply our models to predict open chromatin at orthologs of brain and liver open chromatin regions across hundreds of mammals and find that brain enhancers associated with neuron activity have a stronger tendency than the general population to have predicted lineage-specific open chromatin. CONCLUSION: The framework presented here provides a mechanism to annotate tissue-specific regulatory function across hundreds of genomes and to study enhancer evolution using predicted regulatory differences rather than nucleotide-level conservation measurements. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-022-08450-7. |
format | Online Article Text |
id | pubmed-8996547 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8996547 2022-04-12 Inferring mammalian tissue-specific regulatory conservation by predicting tissue-specific differences in open chromatin Kaplow, Irene M. Schäffer, Daniel E. Wirthlin, Morgan E. Lawler, Alyssa J. Brown, Ashley R. Kleyman, Michael Pfenning, Andreas R. BMC Genomics Research BACKGROUND: Evolutionary conservation is an invaluable tool for inferring functional significance in the genome, including regions that are crucial across many species and those that have undergone convergent evolution. Computational methods to test for sequence conservation are dominated by algorithms that examine the ability of one or more nucleotides to align across large evolutionary distances. While these nucleotide alignment-based approaches have proven powerful for protein-coding genes and some non-coding elements, they fail to capture conservation of many enhancers, distal regulatory elements that control spatial and temporal patterns of gene expression. The function of enhancers is governed by a complex, often tissue- and cell type-specific code that links combinations of transcription factor binding sites and other regulation-related sequence patterns to regulatory activity. Thus, function of orthologous enhancer regions can be conserved across large evolutionary distances, even when nucleotide turnover is high. RESULTS: We present a new machine learning-based approach for evaluating enhancer conservation that leverages the combinatorial sequence code of enhancer activity rather than relying on the alignment of individual nucleotides. We first train a convolutional neural network model that can predict tissue-specific open chromatin, a proxy for enhancer activity, across mammals. Next, we apply that model to distinguish instances where the genome sequence would predict conserved function versus a loss of regulatory activity in that tissue. 
We present criteria for systematically evaluating model performance for this task and use them to demonstrate that our models accurately predict tissue-specific conservation and divergence in open chromatin between primate and rodent species, vastly out-performing leading nucleotide alignment-based approaches. We then apply our models to predict open chromatin at orthologs of brain and liver open chromatin regions across hundreds of mammals and find that brain enhancers associated with neuron activity have a stronger tendency than the general population to have predicted lineage-specific open chromatin. CONCLUSION: The framework presented here provides a mechanism to annotate tissue-specific regulatory function across hundreds of genomes and to study enhancer evolution using predicted regulatory differences rather than nucleotide-level conservation measurements. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-022-08450-7. BioMed Central 2022-04-11 /pmc/articles/PMC8996547/ /pubmed/35410163 http://dx.doi.org/10.1186/s12864-022-08450-7 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. 
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Kaplow, Irene M. Schäffer, Daniel E. Wirthlin, Morgan E. Lawler, Alyssa J. Brown, Ashley R. Kleyman, Michael Pfenning, Andreas R. Inferring mammalian tissue-specific regulatory conservation by predicting tissue-specific differences in open chromatin |
title | Inferring mammalian tissue-specific regulatory conservation by predicting tissue-specific differences in open chromatin |
title_sort | inferring mammalian tissue-specific regulatory conservation by predicting tissue-specific differences in open chromatin |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8996547/ https://www.ncbi.nlm.nih.gov/pubmed/35410163 http://dx.doi.org/10.1186/s12864-022-08450-7 |