
Variance explained by whole genome sequence variants in coding and regulatory genome annotations for six dairy traits

BACKGROUND: There are an exceedingly large number of sequence variants discovered through whole genome sequencing in most populations, including cattle. Deciphering which of these affect complex traits is a major challenge. In this study we hypothesize that variants in some functional classes, such...


Bibliographic Details
Main Authors: Koufariotis, Lambros T., Chen, Yi-Ping Phoebe, Stothard, Paul, Hayes, Ben J.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5885354/
https://www.ncbi.nlm.nih.gov/pubmed/29618315
http://dx.doi.org/10.1186/s12864-018-4617-x
author Koufariotis, Lambros T.
Chen, Yi-Ping Phoebe
Stothard, Paul
Hayes, Ben J.
collection PubMed
description BACKGROUND: An exceedingly large number of sequence variants have been discovered through whole genome sequencing in most populations, including cattle. Deciphering which of these affect complex traits is a major challenge. In this study we hypothesize that variants in some functional classes, such as splice site regions, coding regions, DNA methylated regions and long noncoding RNA, will explain more variance in complex traits than others. Two variance component approaches were used to test this hypothesis: the first determines whether variants in a functional class capture a greater proportion of the variance than expected by chance; the second uses the proportion of variance explained when variants in all annotations are fitted simultaneously.
RESULTS: Our data set consisted of 28.3 million imputed whole genome sequence variants in 16,581 dairy cattle with records for 6 complex trait phenotypes, including production and fertility. We found that sequence variants in splice site regions and synonymous classes captured the greatest proportion of the variance, explaining up to 50% of the variance across all traits. We also found that sequence variants in target sites for DNA methylation (genomic regions that are found to be highly methylated in bovine placentas) captured a significant proportion of the variance. Per sequence variant, splice site variants explain the highest proportion of variance in this study. The proportion of variance captured by the missense predicted deleterious (from SIFT) and missense tolerated classes was relatively small.
CONCLUSION: The results demonstrate that using functional annotations to filter whole genome sequence variants into more informative subsets could be useful for prioritizing the variants that are more likely to be associated with complex traits. In addition to variants found in splice sites and protein coding genes, regulatory variants and those found in DNA methylated regions explained considerable variation in milk production and fertility traits. In our analysis synonymous variants captured a significant proportion of the variance, which raises the possibility that synonymous mutations have some effect, or more likely that these variants are mis-annotated, or alternatively that the results reflect imperfect imputation of the actual causative variants.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-4617-x) contains supplementary material, which is available to authorized users.
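As an illustration of the "per sequence variant" comparison mentioned in the abstract, the following is a minimal sketch only, not the authors' method. It assumes a simple proportional baseline (a class holding x% of the variants is expected to capture x% of the variance), whereas the study itself uses variance component models. The function names and the example class size and variance share are hypothetical; only the total of 28.3 million variants comes from the record.

```python
def per_variant_share(variance_share: float, n_variants: int) -> float:
    """Average proportion of variance attributed to one variant in a class."""
    if n_variants <= 0:
        raise ValueError("n_variants must be positive")
    return variance_share / n_variants


def enrichment(variance_share: float, n_variants: int, total_variants: int) -> float:
    """Ratio of a class's variance share to its share of all variants.

    Values above 1 mean the class captures more variance than its size
    alone would predict under the proportional baseline (an assumption
    made here for illustration).
    """
    if total_variants <= 0 or not 0 < n_variants <= total_variants:
        raise ValueError("need 0 < n_variants <= total_variants")
    return variance_share / (n_variants / total_variants)


if __name__ == "__main__":
    TOTAL_VARIANTS = 28_300_000   # from the abstract
    CLASS_VARIANTS = 10_000       # hypothetical class size
    CLASS_SHARE = 0.05            # hypothetical proportion of variance

    print(per_variant_share(CLASS_SHARE, CLASS_VARIANTS))
    print(enrichment(CLASS_SHARE, CLASS_VARIANTS, TOTAL_VARIANTS))
```

Dividing by class size is what lets a small class, such as splice site variants, rank highest per variant even when a larger class captures more variance in total.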
format Online
Article
Text
id pubmed-5885354
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5885354 2018-04-09 BMC Genomics Research Article BioMed Central 2018-04-05 /pmc/articles/PMC5885354/ /pubmed/29618315 http://dx.doi.org/10.1186/s12864-018-4617-x Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Variance explained by whole genome sequence variants in coding and regulatory genome annotations for six dairy traits
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5885354/
https://www.ncbi.nlm.nih.gov/pubmed/29618315
http://dx.doi.org/10.1186/s12864-018-4617-x