Detection and validation of structural variations in bovine whole-genome sequence data
Main Authors: | Chen, Long; Chamberlain, Amanda J.; Reich, Coralie M.; Daetwyler, Hans D.; Hayes, Ben J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5267451/ https://www.ncbi.nlm.nih.gov/pubmed/28122487 http://dx.doi.org/10.1186/s12711-017-0286-5 |
_version_ | 1782500642890186752 |
---|---|
author | Chen, Long Chamberlain, Amanda J. Reich, Coralie M. Daetwyler, Hans D. Hayes, Ben J. |
author_facet | Chen, Long Chamberlain, Amanda J. Reich, Coralie M. Daetwyler, Hans D. Hayes, Ben J. |
author_sort | Chen, Long |
collection | PubMed |
description | BACKGROUND: Several examples of structural variation (SV) affecting phenotypic traits have been reported in cattle. Currently the identification of SV from whole-genome sequence data (WGS) suffers from a high false positive rate. Our aim was to construct a high quality set of SV calls in cattle using WGS data. First, we tested two SV detection programs, Breakdancer and Pindel, and the overlap of these methods, on simulated sequence data to determine their precision and sensitivity. We then identified population SV from WGS of 252 Holstein and 64 Jersey bulls based on the overlapping calls from the two programs. In addition, we validated an overlapped SV set in 28 twice-sequenced Holstein individuals, and in another two validated sets (one for each breed) that were transmitted from sire to son. We also tested whether highly conserved gene sets across eukaryotes and recently expanded gene families in bovine were depleted and enriched, respectively, for SV. RESULTS: In empirical WGS data, 17,518 SV covering 27.36 Mb were found in the Holstein population and 4285 SV covering 8.74 Mb in the Jersey population, of which 4.62 Mb of SV overlapped between Holsteins and Jerseys. A total of 11,534 candidate SV covering 5.64 Mb were validated in the 28 twice-sequenced individuals, while 3.49 and 0.67 Mb of SV were validated from Holstein and Jersey sire-son transmission, respectively. Only eight of 237 core eukaryotic genes had at least a 50-bp overlap with an SV from our validated sets, suggesting that conserved genes are depleted for SV (p < 0.05). In addition, we observed that recently expanded gene families were significantly more associated with SV than other genes. Long interspersed nuclear elements-1 were enriched for deletions when compared to the rest of the genome (p = 0.0035). CONCLUSIONS: We reported SV from 252 Holstein and 64 Jersey individuals. A considerable proportion of Jersey population SV (53.5%) were also found in Holstein. In contrast, about 76.90% sire-son transmission validated SV were present in Jerseys and Holsteins. The enrichment of SV in expanding gene families suggests that SV can be a source of genetic variation for evolution. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12711-017-0286-5) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5267451 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5267451 2017-02-01 Detection and validation of structural variations in bovine whole-genome sequence data Chen, Long Chamberlain, Amanda J. Reich, Coralie M. Daetwyler, Hans D. Hayes, Ben J. Genet Sel Evol Research Article BioMed Central 2017-01-25 /pmc/articles/PMC5267451/ /pubmed/28122487 http://dx.doi.org/10.1186/s12711-017-0286-5 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Detection and validation of structural variations in bovine whole-genome sequence data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5267451/ https://www.ncbi.nlm.nih.gov/pubmed/28122487 http://dx.doi.org/10.1186/s12711-017-0286-5 |
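The description above relies on three interval operations: keeping only SV calls that overlap between two detection programs, scoring a call set against simulated truth for precision and sensitivity, and testing whether a gene shares at least 50 bp with an SV. The following is a minimal sketch of those operations, not the authors' pipeline. The record does not state how overlap between Breakdancer and Pindel calls was defined, so the `min_bp` threshold, the half-open coordinate convention, the function names, and all coordinates are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), half-open coordinates.
Interval = Tuple[str, int, int]


def overlap_bp(a: Interval, b: Interval) -> int:
    """Base pairs shared by two intervals (0 if on different chromosomes)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def consensus_calls(calls_a: List[Interval], calls_b: List[Interval],
                    min_bp: int = 1) -> List[Interval]:
    """Keep calls from calls_a that overlap some call in calls_b by >= min_bp."""
    return [a for a in calls_a
            if any(overlap_bp(a, b) >= min_bp for b in calls_b)]


def precision_sensitivity(calls: List[Interval], truth: List[Interval],
                          min_bp: int = 1) -> Tuple[float, float]:
    """Precision = matched calls / all calls; sensitivity = recovered truth / all truth."""
    matched_calls = sum(
        1 for c in calls if any(overlap_bp(c, t) >= min_bp for t in truth))
    recovered_truth = sum(
        1 for t in truth if any(overlap_bp(t, c) >= min_bp for c in calls))
    precision = matched_calls / len(calls) if calls else 0.0
    sensitivity = recovered_truth / len(truth) if truth else 0.0
    return precision, sensitivity


# Made-up coordinates, for illustration only.
caller_a = [("chr1", 100, 500), ("chr1", 1000, 1200), ("chr2", 50, 300)]
caller_b = [("chr1", 150, 520), ("chr2", 400, 600)]
simulated_truth = [("chr1", 120, 480), ("chr2", 50, 300)]

consensus = consensus_calls(caller_a, caller_b)
print(consensus)  # [('chr1', 100, 500)]
print(precision_sensitivity(consensus, simulated_truth))  # (1.0, 0.5)

# Gene-level test with the 50-bp threshold mentioned in the description.
gene = ("chr1", 460, 900)
print(any(overlap_bp(gene, sv) >= 50 for sv in consensus))  # False (only 40 bp shared)
```

In this toy case, intersecting the two call sets keeps the one call that matches the simulated truth, so precision is 1.0, but it drops a true call seen by only one caller, so sensitivity is 0.5. This is the kind of trade-off that the simulation step in the description is meant to quantify.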