Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data

Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority...

Bibliographic Details
Main authors: Nguyen, Quan H, Tellam, Ross L, Naval-Sanchez, Marina, Porto-Neto, Laercio R, Barendse, William, Reverter, Antonio, Hayes, Benjamin, Kijas, James, Dalrymple, Brian P
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5838836/
https://www.ncbi.nlm.nih.gov/pubmed/29618048
http://dx.doi.org/10.1093/gigascience/gix136
_version_ 1783304312591482880
author Nguyen, Quan H
Tellam, Ross L
Naval-Sanchez, Marina
Porto-Neto, Laercio R
Barendse, William
Reverter, Antonio
Hayes, Benjamin
Kijas, James
Dalrymple, Brian P
author_facet Nguyen, Quan H
Tellam, Ross L
Naval-Sanchez, Marina
Porto-Neto, Laercio R
Barendse, William
Reverter, Antonio
Hayes, Benjamin
Kijas, James
Dalrymple, Brian P
author_sort Nguyen, Quan H
collection PubMed
description Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets.
format Online
Article
Text
id pubmed-5838836
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-58388362018-03-28 Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data Nguyen, Quan H Tellam, Ross L Naval-Sanchez, Marina Porto-Neto, Laercio R Barendse, William Reverter, Antonio Hayes, Benjamin Kijas, James Dalrymple, Brian P Gigascience Research Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets. Oxford University Press 2018-02-16 /pmc/articles/PMC5838836/ /pubmed/29618048 http://dx.doi.org/10.1093/gigascience/gix136 Text en © The Author(s) 2018. 
Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Nguyen, Quan H
Tellam, Ross L
Naval-Sanchez, Marina
Porto-Neto, Laercio R
Barendse, William
Reverter, Antonio
Hayes, Benjamin
Kijas, James
Dalrymple, Brian P
Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data
title Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data
title_full Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data
title_fullStr Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data
title_full_unstemmed Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data
title_short Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data
title_sort mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5838836/
https://www.ncbi.nlm.nih.gov/pubmed/29618048
http://dx.doi.org/10.1093/gigascience/gix136