
Comparisons of improved genomic predictions generated by different imputation methods for genotyping by sequencing data in livestock populations

BACKGROUND: Genotyping by sequencing (GBS) still has problems with missing genotypes. Imputation is important for using GBS for genomic predictions, especially at low depths, because of the large number of missing genotypes. Minor allele frequency (MAF) is widely used as a marker data editing criterion...


Bibliographic Details
Main Authors:	Wang, Xiao; Su, Guosheng; Hao, Dan; Lund, Mogens Sandø; Kadarmideen, Haja N.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2020
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6947967/ https://www.ncbi.nlm.nih.gov/pubmed/31921417 http://dx.doi.org/10.1186/s40104-019-0407-9

collection	PubMed
description	BACKGROUND: Genotyping by sequencing (GBS) still has problems with missing genotypes. Imputation is important for using GBS for genomic predictions, especially at low depths, because of the large number of missing genotypes. Minor allele frequency (MAF) is widely used as a marker data editing criterion for genomic predictions. In this study, three imputation methods (Beagle, IMPUTE2 and FImpute software) combined with four MAF editing criteria were investigated with regard to imputation accuracy of missing genotypes and accuracy of genomic predictions, based on simulated data of a livestock population.
RESULTS: Four MAF criteria (no MAF limit, MAF ≥ 0.001, MAF ≥ 0.01 and MAF ≥ 0.03) were used for editing marker data before imputation. Beagle, IMPUTE2 and FImpute software were applied to impute the original GBS. Additionally, IMPUTE2 imputed the expected genotype dosage after genotype correction (GcIM). The reliability of genomic predictions was calculated using GBS and imputed GBS data. Imputation accuracies were the same for the three imputation methods, except at sequencing read depth (depth) = 2, where FImpute had a slightly lower imputation accuracy than Beagle and IMPUTE2. GcIM was the best of all the imputations at depth = 4, 5 and 10, but the worst at depth = 2. For genomic prediction, retaining more SNPs with no MAF limit resulted in higher reliability. As the depth increased to 10, the prediction reliabilities approached those obtained using true genotypes at the GBS loci. Beagle and IMPUTE2 had the largest increases in prediction reliability, 5 percentage points, and FImpute gained 3 percentage points at depth = 2. The best prediction was observed at depth = 4, 5 and 10 using GcIM, but the worst prediction was also observed using GcIM at depth = 2.
CONCLUSIONS: The current study showed that imputation accuracies were relatively low for GBS with low depths and high for GBS with high depths. Imputation resulted in larger gains in the reliability of genomic predictions for GBS with lower depths. These results suggest applying IMPUTE2 to corrected GBS data (GcIM) to improve genomic predictions at higher depths, and that FImpute software could be a good alternative for routine imputation.
format	Online Article Text
id	pubmed-6947967
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6947967 2020-01-09 Wang, Xiao Su, Guosheng Hao, Dan Lund, Mogens Sandø Kadarmideen, Haja N. J Anim Sci Biotechnol Research BioMed Central 2020-01-07 /pmc/articles/PMC6947967/ /pubmed/31921417 http://dx.doi.org/10.1186/s40104-019-0407-9 Text en © The Author(s). 2020 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

