Prediction of gene regulatory enhancers across species reveals evolutionarily conserved sequence properties
Genomic regions with gene regulatory enhancer activity turn over rapidly across mammals. In contrast, gene expression patterns and transcription factor binding preferences are largely conserved between mammalian species. Based on this conservation, we hypothesized that enhancers active in different m...
Main authors: | Chen, Ling; Fish, Alexandra E.; Capra, John A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6191148/ https://www.ncbi.nlm.nih.gov/pubmed/30286077 http://dx.doi.org/10.1371/journal.pcbi.1006484 |
_version_ | 1783363672920293376 |
---|---|
author | Chen, Ling Fish, Alexandra E. Capra, John A. |
author_facet | Chen, Ling Fish, Alexandra E. Capra, John A. |
author_sort | Chen, Ling |
collection | PubMed |
description | Genomic regions with gene regulatory enhancer activity turn over rapidly across mammals. In contrast, gene expression patterns and transcription factor binding preferences are largely conserved between mammalian species. Based on this conservation, we hypothesized that enhancers active in different mammals would exhibit conserved sequence patterns in spite of their different genomic locations. To investigate this hypothesis, we evaluated the extent to which sequence patterns that are predictive of enhancers in one species are predictive of enhancers in other mammalian species by training and testing two types of machine learning models. We trained support vector machine (SVM) and convolutional neural network (CNN) classifiers to distinguish enhancers defined by histone marks from the genomic background based on DNA sequence patterns in human, macaque, mouse, dog, cow, and opossum. The classifiers accurately identified many adult liver, developing limb, and developing brain enhancers, and the CNNs outperformed the SVMs. Furthermore, classifiers trained in one species and tested in another performed nearly as well as classifiers trained and tested on the same species. We observed similar cross-species conservation when applying the models to human and mouse enhancers validated in transgenic assays. This indicates that many short sequence patterns predictive of enhancers are largely conserved. The sequence patterns most predictive of enhancers in each species matched the binding motifs for a common set of TFs enriched for expression in relevant tissues, supporting the biological relevance of the learned features. Thus, despite the rapid change of active enhancer locations between mammals, cross-species enhancer prediction is often possible. Our results suggest that short sequence patterns encoding enhancer activity have been maintained across more than 180 million years of mammalian evolution. |
format | Online Article Text |
id | pubmed-6191148 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-6191148 2018-10-25 Prediction of gene regulatory enhancers across species reveals evolutionarily conserved sequence properties Chen, Ling Fish, Alexandra E. Capra, John A. PLoS Comput Biol Research Article Genomic regions with gene regulatory enhancer activity turn over rapidly across mammals. In contrast, gene expression patterns and transcription factor binding preferences are largely conserved between mammalian species. Based on this conservation, we hypothesized that enhancers active in different mammals would exhibit conserved sequence patterns in spite of their different genomic locations. To investigate this hypothesis, we evaluated the extent to which sequence patterns that are predictive of enhancers in one species are predictive of enhancers in other mammalian species by training and testing two types of machine learning models. We trained support vector machine (SVM) and convolutional neural network (CNN) classifiers to distinguish enhancers defined by histone marks from the genomic background based on DNA sequence patterns in human, macaque, mouse, dog, cow, and opossum. The classifiers accurately identified many adult liver, developing limb, and developing brain enhancers, and the CNNs outperformed the SVMs. Furthermore, classifiers trained in one species and tested in another performed nearly as well as classifiers trained and tested on the same species. We observed similar cross-species conservation when applying the models to human and mouse enhancers validated in transgenic assays. This indicates that many short sequence patterns predictive of enhancers are largely conserved. The sequence patterns most predictive of enhancers in each species matched the binding motifs for a common set of TFs enriched for expression in relevant tissues, supporting the biological relevance of the learned features. Thus, despite the rapid change of active enhancer locations between mammals, cross-species enhancer prediction is often possible. Our results suggest that short sequence patterns encoding enhancer activity have been maintained across more than 180 million years of mammalian evolution. Public Library of Science 2018-10-04 /pmc/articles/PMC6191148/ /pubmed/30286077 http://dx.doi.org/10.1371/journal.pcbi.1006484 Text en © 2018 Chen et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Chen, Ling Fish, Alexandra E. Capra, John A. Prediction of gene regulatory enhancers across species reveals evolutionarily conserved sequence properties |
title | Prediction of gene regulatory enhancers across species reveals evolutionarily conserved sequence properties |
title_full | Prediction of gene regulatory enhancers across species reveals evolutionarily conserved sequence properties |
title_fullStr | Prediction of gene regulatory enhancers across species reveals evolutionarily conserved sequence properties |
title_full_unstemmed | Prediction of gene regulatory enhancers across species reveals evolutionarily conserved sequence properties |
title_short | Prediction of gene regulatory enhancers across species reveals evolutionarily conserved sequence properties |
title_sort | prediction of gene regulatory enhancers across species reveals evolutionarily conserved sequence properties |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6191148/ https://www.ncbi.nlm.nih.gov/pubmed/30286077 http://dx.doi.org/10.1371/journal.pcbi.1006484 |
work_keys_str_mv | AT chenling predictionofgeneregulatoryenhancersacrossspeciesrevealsevolutionarilyconservedsequenceproperties AT fishalexandrae predictionofgeneregulatoryenhancersacrossspeciesrevealsevolutionarilyconservedsequenceproperties AT caprajohna predictionofgeneregulatoryenhancersacrossspeciesrevealsevolutionarilyconservedsequenceproperties |
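
Illustrative note: the description above outlines a train-in-one-species, test-in-another evaluation of sequence-based enhancer classifiers. The sketch below shows that evaluation pattern in miniature. It is not the authors' method: a naive per-k-mer log-odds scorer stands in for the SVM and CNN classifiers named in the record, and every function name, parameter, and toy sequence here is a hypothetical choice made for illustration only.

```python
from collections import Counter
from math import log

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_spectrum(seq, k):
    """Count canonical k-mers, skipping windows that contain non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    return counts


def train_log_odds(positives, negatives, k, pseudocount=1.0):
    """Fit per-k-mer log-odds weights (enhancer vs. background) on training sequences."""
    pos, neg = Counter(), Counter()
    for s in positives:
        pos.update(kmer_spectrum(s, k))
    for s in negatives:
        neg.update(kmer_spectrum(s, k))
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + pseudocount * len(vocab)
    neg_total = sum(neg.values()) + pseudocount * len(vocab)
    return {
        m: log((pos[m] + pseudocount) / pos_total)
        - log((neg[m] + pseudocount) / neg_total)
        for m in vocab
    }


def score(seq, weights, k):
    """Sum k-mer weights over a sequence; k-mers unseen in training contribute 0."""
    return sum(weights.get(m, 0.0) * c for m, c in kmer_spectrum(seq, k).items())


def auroc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ranked
    (positive, negative) pairs, with ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up toy sequences, NOT real enhancer data.
    species_a_pos, species_a_neg = ["ACACACGT", "CACACAGT"], ["GGGGCGGG", "GGCGGGGG"]
    species_b_seqs = ["TTACACAC", "GGGGGGCG"]
    species_b_labels = [1, 0]

    weights = train_log_odds(species_a_pos, species_a_neg, k=2)  # train on "species A"
    scores = [score(s, weights, k=2) for s in species_b_seqs]     # test on "species B"
    print("cross-species AUROC:", auroc(scores, species_b_labels))
```

Comparing the AUROC obtained this way with the AUROC from held-out sequences of the training species is the kind of within-species versus cross-species comparison the description reports; real analyses would need genome-scale data, matched background regions, and a proper classifier.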