Exploring the optimal strategy of imputation from SNP array to whole-genome sequencing data in farm animals
Genotype imputation from BeadChip to whole-genome sequencing (WGS) data is a cost-effective method of obtaining genotypes of WGS variants. Beagle, one of the most popular imputation software programs, has been widely used for genotype inference in humans and non-human species. Few studies have sys...
Main authors: | Jiang, Yifan; Song, Hailiang; Gao, Hongding; Zhang, Qin; Ding, Xiangdong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9459117/ https://www.ncbi.nlm.nih.gov/pubmed/36092888 http://dx.doi.org/10.3389/fgene.2022.963654 |
_version_ | 1784786433782841344 |
---|---|
author | Jiang, Yifan Song, Hailiang Gao, Hongding Zhang, Qin Ding, Xiangdong |
author_sort | Jiang, Yifan |
collection | PubMed |
description | Genotype imputation from BeadChip to whole-genome sequencing (WGS) data is a cost-effective method of obtaining genotypes of WGS variants. Beagle, one of the most popular imputation software programs, has been widely used for genotype inference in humans and non-human species. Few studies have systematically and comprehensively compared the performance of Beagle versions and parameter settings in farm animals. Here, we investigated the imputation performance of three representative versions of Beagle (Beagle 4.1, Beagle 5.0, and Beagle 5.4) and of the effective population size (Ne) parameter setting in three species (cattle, pig, and chicken). Six scenarios were investigated to explore the impact of certain key factors on imputation performance. The results showed that the default Ne (1,000,000) is not suitable for livestock and poultry when the reference panel is small or the target panel is a low-density array, with 2.47%–10.45% drops in accuracy. Beagle 5 significantly reduced the computation time (4.66- to 13.24-fold) without a loss of accuracy. In addition, using a large combined reference panel or a high-density chip provides greater imputation accuracy, especially for low minor allele frequency (MAF) variants. Finally, a highly significant correlation between the measures of imputation accuracy can be obtained with an MAF equal to or greater than 0.05. |
format | Online Article Text |
id | pubmed-9459117 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9459117 2022-09-10 Exploring the optimal strategy of imputation from SNP array to whole-genome sequencing data in farm animals Jiang, Yifan Song, Hailiang Gao, Hongding Zhang, Qin Ding, Xiangdong Front Genet Genetics Genotype imputation from BeadChip to whole-genome sequencing (WGS) data is a cost-effective method of obtaining genotypes of WGS variants. Beagle, one of the most popular imputation software programs, has been widely used for genotype inference in humans and non-human species. Few studies have systematically and comprehensively compared the performance of Beagle versions and parameter settings in farm animals. Here, we investigated the imputation performance of three representative versions of Beagle (Beagle 4.1, Beagle 5.0, and Beagle 5.4) and of the effective population size (Ne) parameter setting in three species (cattle, pig, and chicken). Six scenarios were investigated to explore the impact of certain key factors on imputation performance. The results showed that the default Ne (1,000,000) is not suitable for livestock and poultry when the reference panel is small or the target panel is a low-density array, with 2.47%–10.45% drops in accuracy. Beagle 5 significantly reduced the computation time (4.66- to 13.24-fold) without a loss of accuracy. In addition, using a large combined reference panel or a high-density chip provides greater imputation accuracy, especially for low minor allele frequency (MAF) variants. Finally, a highly significant correlation between the measures of imputation accuracy can be obtained with an MAF equal to or greater than 0.05. Frontiers Media S.A. 2022-08-26 /pmc/articles/PMC9459117/ /pubmed/36092888 http://dx.doi.org/10.3389/fgene.2022.963654 Text en Copyright © 2022 Jiang, Song, Gao, Zhang and Ding. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Exploring the optimal strategy of imputation from SNP array to whole-genome sequencing data in farm animals |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9459117/ https://www.ncbi.nlm.nih.gov/pubmed/36092888 http://dx.doi.org/10.3389/fgene.2022.963654 |
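
The description in this record refers to "measures of imputation accuracy" and to an MAF threshold of 0.05 without naming the measures. As a minimal sketch only, the Python below assumes two measures that are commonly used for this purpose: genotype concordance and the Pearson correlation between true and imputed genotypes coded as 0/1/2 allele counts. The function names and the toy genotype vectors are hypothetical illustrations, not taken from the article.

```python
from math import sqrt


def minor_allele_frequency(genotypes):
    """MAF for one variant; genotypes are 0/1/2 copies of the alternate allele."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)


def concordance(true_gt, imputed_gt):
    """Fraction of individuals whose imputed genotype equals the true genotype."""
    matches = sum(t == i for t, i in zip(true_gt, imputed_gt))
    return matches / len(true_gt)


def pearson_r(x, y):
    """Pearson correlation between two equally long genotype vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Undefined when either vector has no variation (monomorphic variant).
        return float("nan")
    return sxy / sqrt(sxx * syy)


# Toy data (hypothetical): one variant, eight individuals, one imputation error.
true_gt = [0, 1, 2, 0, 1, 0, 0, 1]
imputed_gt = [0, 1, 2, 0, 0, 0, 0, 1]

maf = minor_allele_frequency(true_gt)
print(f"MAF = {maf:.4f}, passes MAF >= 0.05 filter: {maf >= 0.05}")
print(f"concordance = {concordance(true_gt, imputed_gt):.3f}")
print(f"correlation = {pearson_r(true_gt, imputed_gt):.3f}")
```

Concordance tends to look high for rare variants simply because most individuals are homozygous for the major allele, whereas correlation is more sensitive to errors at low MAF. That difference is one plausible reason the two kinds of measure would agree more closely once low-MAF variants are excluded.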