Comparison of Genotype Imputation for SNP Array and Low-Coverage Whole-Genome Sequencing Data
Genotype imputation is the term used to describe the process of inferring unobserved genotypes in a sample of individuals. It is a key step prior to a genome-wide association study (GWAS) or genomic prediction. The imputation accuracy will directly influence the results from subsequent analyses. In...
Main authors: | Deng, Tianyu; Zhang, Pengfei; Garrick, Dorian; Gao, Huijiang; Wang, Lixian; Zhao, Fuping |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Frontiers Media S.A.
2022
|
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8762119/ https://www.ncbi.nlm.nih.gov/pubmed/35046990 http://dx.doi.org/10.3389/fgene.2021.704118 |
_version_ | 1784633689948291072 |
---|---|
author | Deng, Tianyu Zhang, Pengfei Garrick, Dorian Gao, Huijiang Wang, Lixian Zhao, Fuping |
author_facet | Deng, Tianyu Zhang, Pengfei Garrick, Dorian Gao, Huijiang Wang, Lixian Zhao, Fuping |
author_sort | Deng, Tianyu |
collection | PubMed |
description | Genotype imputation is the term used to describe the process of inferring unobserved genotypes in a sample of individuals. It is a key step prior to a genome-wide association study (GWAS) or genomic prediction. The imputation accuracy will directly influence the results from subsequent analyses. In this simulation-based study, we investigate the accuracy of genotype imputation in relation to some factors characterizing SNP chip or low-coverage whole-genome sequencing (LCWGS) data. The factors included the imputation reference population size, the proportion of target markers /SNP density, the genetic relationship (distance) between the target population and the reference population, and the imputation method. Simulations of genotypes were based on coalescence theory accounting for the demographic history of pigs. A population of simulated founders diverged to produce four separate but related populations of descendants. The genomic data of 20,000 individuals were simulated for a 10-Mb chromosome fragment. Our results showed that the proportion of target markers or SNP density was the most critical factor affecting imputation accuracy under all imputation situations. Compared with Minimac4, Beagle5.1 reproduced higher-accuracy imputed data in most cases, more notably when imputing from the LCWGS data. Compared with SNP chip data, LCWGS provided more accurate genotype imputation. Our findings provided a relatively comprehensive insight into the accuracy of genotype imputation in a realistic population of domestic animals. |
format | Online Article Text |
id | pubmed-8762119 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-87621192022-01-18 Comparison of Genotype Imputation for SNP Array and Low-Coverage Whole-Genome Sequencing Data Deng, Tianyu Zhang, Pengfei Garrick, Dorian Gao, Huijiang Wang, Lixian Zhao, Fuping Front Genet Genetics Genotype imputation is the term used to describe the process of inferring unobserved genotypes in a sample of individuals. It is a key step prior to a genome-wide association study (GWAS) or genomic prediction. The imputation accuracy will directly influence the results from subsequent analyses. In this simulation-based study, we investigate the accuracy of genotype imputation in relation to some factors characterizing SNP chip or low-coverage whole-genome sequencing (LCWGS) data. The factors included the imputation reference population size, the proportion of target markers /SNP density, the genetic relationship (distance) between the target population and the reference population, and the imputation method. Simulations of genotypes were based on coalescence theory accounting for the demographic history of pigs. A population of simulated founders diverged to produce four separate but related populations of descendants. The genomic data of 20,000 individuals were simulated for a 10-Mb chromosome fragment. Our results showed that the proportion of target markers or SNP density was the most critical factor affecting imputation accuracy under all imputation situations. Compared with Minimac4, Beagle5.1 reproduced higher-accuracy imputed data in most cases, more notably when imputing from the LCWGS data. Compared with SNP chip data, LCWGS provided more accurate genotype imputation. Our findings provided a relatively comprehensive insight into the accuracy of genotype imputation in a realistic population of domestic animals. Frontiers Media S.A. 2022-01-03 /pmc/articles/PMC8762119/ /pubmed/35046990 http://dx.doi.org/10.3389/fgene.2021.704118 Text en Copyright © 2022 Deng, Zhang, Garrick, Gao, Wang and Zhao. 
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Deng, Tianyu Zhang, Pengfei Garrick, Dorian Gao, Huijiang Wang, Lixian Zhao, Fuping Comparison of Genotype Imputation for SNP Array and Low-Coverage Whole-Genome Sequencing Data |
title | Comparison of Genotype Imputation for SNP Array and Low-Coverage Whole-Genome Sequencing Data |
title_sort | comparison of genotype imputation for snp array and low-coverage whole-genome sequencing data |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8762119/ https://www.ncbi.nlm.nih.gov/pubmed/35046990 http://dx.doi.org/10.3389/fgene.2021.704118 |