Deep sequencing of Danish Holstein dairy cattle for variant detection and insight into potential loss-of-function variants in protein coding genes
BACKGROUND: Over the last few years, continuous development of high-throughput sequencing platforms and sequence analysis tools has facilitated reliable identification and characterization of genetic variants in many cattle breeds. Deep sequencing of entire genomes within a cattle breed that has not...
Main authors: | Das, Ashutosh; Panitz, Frank; Gregersen, Vivi Raundahl; Bendixen, Christian; Holm, Lars-Erik |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4673847/ https://www.ncbi.nlm.nih.gov/pubmed/26645365 http://dx.doi.org/10.1186/s12864-015-2249-y |
_version_ | 1782404821239726080 |
---|---|
author | Das, Ashutosh Panitz, Frank Gregersen, Vivi Raundahl Bendixen, Christian Holm, Lars-Erik |
author_sort | Das, Ashutosh |
collection | PubMed |
description | BACKGROUND: Over the last few years, continuous development of high-throughput sequencing platforms and sequence analysis tools has facilitated reliable identification and characterization of genetic variants in many cattle breeds. Deep sequencing of entire genomes within a cattle breed that has not been thoroughly investigated can be expected to reveal functional variants underlying phenotypic differences. Here, we sequenced Danish Holstein cattle to high coverage to detect and characterize single nucleotide polymorphisms (SNPs), insertions/deletions (indels), and loss-of-function (LoF) variants in protein-coding genes, in order to provide a comprehensive resource for subsequent detection of causal variants for recessive traits. RESULTS: We sequenced four genetically unrelated Danish Holstein cows with a mean coverage of 27X using an Illumina HiSeq 2000. Multi-sample SNP calling identified 10,796,794 SNPs and 1,295,036 indels, of which 482,835 (4.5 %) SNPs and 231,359 (17.9 %) indels were novel. A comparison between sequencing-derived SNPs and genotyping from the BovineHD BeadChip revealed a concordance rate of 99.6–99.8 % for homozygous SNPs and 93.3–96.5 % for heterozygous SNPs. Annotation of the variants identified 74,886 SNPs and 1937 indels affecting coding sequences, with 2145 being LoF mutations. The frequency of LoF variants differed greatly across the genome; a hot spot with a strikingly high density was observed in a 6 Mb region on BTA18. LoF-affected genes were enriched for functional categories related to olfactory reception and underrepresented for genes related to key cellular constituents and cellular and biological process regulation. Filtering with sequence-derived genotype data for 288 Holstein animals from the 1000 Bull Genomes Project, removing variants with homozygous individuals, retained 345 of the LoF variants as putatively deleterious. A substantial number of the putatively deleterious LoF variants had a minor allele frequency >0.05 in the 1000 Bull Genomes data set. CONCLUSIONS: Deep sequencing of Danish Holstein genomes enabled us to identify 12.1 million variants. An investigation into LoF variants discovered a set of variants predicted to disrupt protein-coding genes. This catalog of variants will be a resource for future studies to understand variation underlying important phenotypes, particularly recessively inherited lethal phenotypes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-2249-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4673847 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4673847 2015-12-10 Deep sequencing of Danish Holstein dairy cattle for variant detection and insight into potential loss-of-function variants in protein coding genes Das, Ashutosh Panitz, Frank Gregersen, Vivi Raundahl Bendixen, Christian Holm, Lars-Erik BMC Genomics Research Article BioMed Central 2015-12-09 /pmc/articles/PMC4673847/ /pubmed/26645365 http://dx.doi.org/10.1186/s12864-015-2249-y Text en © Das et al. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Deep sequencing of Danish Holstein dairy cattle for variant detection and insight into potential loss-of-function variants in protein coding genes |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4673847/ https://www.ncbi.nlm.nih.gov/pubmed/26645365 http://dx.doi.org/10.1186/s12864-015-2249-y |
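
The percentages and the rounded total quoted in the abstract can be checked against its raw counts. The following is a minimal sketch that uses only the figures stated in the description field; the variable names and the `percent` helper are illustrative and do not come from the article.

```python
# Counts quoted in the abstract (description field of this record).
snps_total = 10_796_794
indels_total = 1_295_036
snps_novel = 482_835
indels_novel = 231_359


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of novel variants among all called variants of each type.
novel_snp_pct = percent(snps_novel, snps_total)
novel_indel_pct = percent(indels_novel, indels_total)

# Total number of variants (SNPs plus indels), in millions.
total_variants_millions = round((snps_total + indels_total) / 1e6, 1)

print(f"Novel SNPs:     {novel_snp_pct} %")
print(f"Novel indels:   {novel_indel_pct} %")
print(f"Total variants: {total_variants_millions} million")
```

Under these inputs the computed values agree with the 4.5 %, 17.9 %, and 12.1 million reported in the abstract.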