Deep sequencing of Danish Holstein dairy cattle for variant detection and insight into potential loss-of-function variants in protein coding genes

Bibliographic Details
Main authors: Das, Ashutosh, Panitz, Frank, Gregersen, Vivi Raundahl, Bendixen, Christian, Holm, Lars-Erik
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4673847/
https://www.ncbi.nlm.nih.gov/pubmed/26645365
http://dx.doi.org/10.1186/s12864-015-2249-y
author Das, Ashutosh
Panitz, Frank
Gregersen, Vivi Raundahl
Bendixen, Christian
Holm, Lars-Erik
collection PubMed
description BACKGROUND: Over the last few years, continuous development of high-throughput sequencing platforms and sequence analysis tools has facilitated reliable identification and characterization of genetic variants in many cattle breeds. Deep sequencing of entire genomes within a cattle breed that has not been thoroughly investigated is expected to reveal functional variants underlying phenotypic differences. Here, we sequenced Danish Holstein cattle to high coverage to detect and characterize single nucleotide polymorphisms (SNPs), insertions/deletions (indels), and loss-of-function (LoF) variants in protein-coding genes, in order to provide a comprehensive resource for subsequent detection of causal variants for recessive traits.
RESULTS: We sequenced four genetically unrelated Danish Holstein cows to a mean coverage of 27X using an Illumina HiSeq 2000. Multi-sample SNP calling identified 10,796,794 SNPs and 1,295,036 indels, of which 482,835 (4.5 %) SNPs and 231,359 (17.9 %) indels were novel. A comparison between sequencing-derived SNPs and genotypes from the BovineHD BeadChip revealed a concordance rate of 99.6–99.8 % for homozygous SNPs and 93.3–96.5 % for heterozygous SNPs. Annotation identified 74,886 SNPs and 1,937 indels affecting coding sequences, of which 2,145 were LoF mutations. The frequency of LoF variants differed greatly across the genome; a hot spot with a strikingly high density was observed in a 6 Mb region on BTA18. LoF-affected genes were enriched for functional categories related to olfactory reception and underrepresented for genes related to key cellular constituents and cellular and biological process regulation. Filtering with sequence-derived genotype data for 288 Holstein animals from the 1000 Bull Genomes Project, removing variants for which homozygous individuals were observed, retained 345 of the LoF variants as putatively deleterious. A substantial number of the putatively deleterious LoF variants had a minor allele frequency >0.05 in the 1000 Bull Genomes data set.
CONCLUSIONS: Deep sequencing of Danish Holstein genomes enabled us to identify 12.1 million variants. An investigation into LoF variants identified a set of variants predicted to disrupt protein-coding genes. This catalog of variants will be a resource for future studies to understand variation underlying important phenotypes, particularly recessively inherited lethal phenotypes.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-2249-y) contains supplementary material, which is available to authorized users.
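The filtering step described in the abstract (discarding LoF variants for which homozygous animals are observed in a reference panel, then examining minor allele frequency) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 0/1/2 genotype coding, the reading of "homozygous" as homozygous for the LoF allele, and all toy genotypes below are assumptions made for illustration only.

```python
from typing import Dict, List

# Assumed coding: number of LoF alleles carried by each animal (0, 1 or 2).
# Variant IDs and genotype values are invented toy data, not study results.
Genotypes = Dict[str, List[int]]


def retain_without_homozygotes(genotypes: Genotypes) -> List[str]:
    """Return IDs of variants for which no animal carries two LoF alleles."""
    return [vid for vid, calls in genotypes.items() if 2 not in calls]


def minor_allele_frequency(calls: List[int]) -> float:
    """Frequency of the rarer allele across diploid animals."""
    alt = sum(calls) / (2 * len(calls))
    return min(alt, 1 - alt)


toy_panel: Genotypes = {
    "lof_a": [0, 1, 0, 1],  # carriers only: retained
    "lof_b": [0, 2, 1, 0],  # one homozygous animal: removed
    "lof_c": [0, 0, 0, 0],  # LoF allele never observed: retained
}

retained = retain_without_homozygotes(toy_panel)
```

In this toy panel, a variant such as `lof_a` is retained even though its minor allele frequency is well above 0.05, which mirrors the situation the abstract reports for a substantial number of the putatively deleterious variants.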
format Online
Article
Text
id pubmed-4673847
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4673847 2015-12-10 BMC Genomics Research Article BioMed Central 2015-12-09 /pmc/articles/PMC4673847/ /pubmed/26645365 http://dx.doi.org/10.1186/s12864-015-2249-y Text en © Das et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Deep sequencing of Danish Holstein dairy cattle for variant detection and insight into potential loss-of-function variants in protein coding genes
topic Research Article