Imputation from SNP chip to sequence: a case study in a Chinese indigenous chicken population

BACKGROUND: Genome-wide association studies and genomic predictions are thought to be optimized by using whole-genome sequence (WGS) data. However, sequencing thousands of individuals of interest is expensive. Imputation from SNP panels to WGS data is an attractive and less expensive approach to obt...

Bibliographic Details
Main Authors:	Ye, Shaopan; Yuan, Xiaolong; Lin, Xiran; Gao, Ning; Luo, Yuanyu; Chen, Zanmou; Li, Jiaqi; Zhang, Xiquan; Zhang, Zhe
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2018
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5861640/ https://www.ncbi.nlm.nih.gov/pubmed/29581880 http://dx.doi.org/10.1186/s40104-018-0241-5

author	Ye, Shaopan; Yuan, Xiaolong; Lin, Xiran; Gao, Ning; Luo, Yuanyu; Chen, Zanmou; Li, Jiaqi; Zhang, Xiquan; Zhang, Zhe
author_sort	Ye, Shaopan
collection	PubMed
description	BACKGROUND: Genome-wide association studies and genomic predictions are thought to be optimized by using whole-genome sequence (WGS) data. However, sequencing thousands of individuals of interest is expensive. Imputation from SNP panels to WGS data is an attractive and less expensive approach to obtain WGS data. The aims of this study were to investigate the accuracy of imputation and to provide insight into the design and execution of genotype imputation. RESULTS: We genotyped 450 chickens with a 600 K SNP array and sequenced 24 key individuals by whole-genome re-sequencing. Accuracy of imputation from putative 60 K and 600 K array data to WGS data was 0.620 and 0.812 for Beagle, and 0.810 and 0.914 for FImpute, respectively. By increasing the sequencing cost from 24X to 144X, the imputation accuracy increased from 0.525 to 0.698 for Beagle and from 0.654 to 0.823 for FImpute. With fixed sequence depth (12X), increasing the number of sequenced animals from 1 to 24 improved accuracy from 0.421 to 0.897 for FImpute and from 0.396 to 0.777 for Beagle. Using optimally selected key individuals resulted in a higher imputation accuracy compared with using randomly selected individuals as a reference population for re-sequencing. With fixed reference population size (24), imputation accuracy increased from 0.654 to 0.875 for FImpute and from 0.512 to 0.762 for Beagle as the sequencing depth increased from 1X to 12X. With a given total cost of genotyping, accuracy increased with the size of the reference population for FImpute, but this pattern did not hold for Beagle, which showed the highest accuracy at six-fold coverage for the scenarios used in this study. CONCLUSIONS: We comprehensively investigated the impacts of several key factors on genotype imputation. Generally, increasing sequencing cost gave a higher imputation accuracy. With a fixed sequencing cost, however, optimal imputation enhances the performance of WGP and GWAS. An optimal imputation strategy should comprehensively take into consideration the size of the reference population, the imputation algorithm, marker density, the population structure of the target population, and the method used to select key individuals. This work sheds additional light on how to design and execute genotype imputation for livestock populations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40104-018-0241-5) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5861640
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5861640 2018-03-26 Imputation from SNP chip to sequence: a case study in a Chinese indigenous chicken population Ye, Shaopan; Yuan, Xiaolong; Lin, Xiran; Gao, Ning; Luo, Yuanyu; Chen, Zanmou; Li, Jiaqi; Zhang, Xiquan; Zhang, Zhe J Anim Sci Biotechnol Research BioMed Central 2018-03-21 /pmc/articles/PMC5861640/ /pubmed/29581880 http://dx.doi.org/10.1186/s40104-018-0241-5 Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Imputation from SNP chip to sequence: a case study in a Chinese indigenous chicken population
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5861640/ https://www.ncbi.nlm.nih.gov/pubmed/29581880 http://dx.doi.org/10.1186/s40104-018-0241-5
