
A new approach for efficient genotype imputation using information from relatives

BACKGROUND: Genotype imputation can help reduce genotyping costs, particularly for implementation of genomic selection. In applications entailing large populations, recovering the genotypes of untyped loci using information from reference individuals that were genotyped with a higher-density panel is...


Bibliographic Details
Main Authors:	Sargolzaei, Mehdi; Chesnais, Jacques P; Schenkel, Flavio S
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2014
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4076979/ https://www.ncbi.nlm.nih.gov/pubmed/24935670 http://dx.doi.org/10.1186/1471-2164-15-478

author	Sargolzaei, Mehdi Chesnais, Jacques P Schenkel, Flavio S
author_sort	Sargolzaei, Mehdi
collection	PubMed
description	BACKGROUND: Genotype imputation can help reduce genotyping costs, particularly for implementation of genomic selection. In applications entailing large populations, recovering the genotypes of untyped loci using information from reference individuals that were genotyped with a higher-density panel is computationally challenging. Popular imputation methods are based on the hidden Markov model and are computationally constrained by an intensive sampling process. A fast, deterministic approach, which makes use of both family and population information, is presented here. All individuals are related and, therefore, share haplotypes, which may differ in length and frequency depending on their relationships. The method starts with family imputation if pedigree information is available, and then exploits close relationships by searching for long haplotype matches in the reference group using overlapping sliding windows. The search continues with a smaller window size in each successive chromosome sweep in order to capture more distant relationships. RESULTS: The proposed method gave imputation accuracy higher than or similar to that of Beagle and Impute2 in cattle data sets when all available information was used. When close relatives of target individuals were present in the reference group, the method was more accurate than the other two methods even when the pedigree was not used. Rare variants were also imputed with higher accuracy. Finally, computing requirements were considerably lower than those of Beagle and Impute2. The presented method took 28 minutes to impute from 6 k to 50 k genotypes for 2,000 individuals with a reference size of 64,429 individuals. CONCLUSIONS: The proposed method efficiently makes use of information from close and distant relatives for accurate genotype imputation. In addition to its high imputation accuracy, the method is fast owing to its deterministic nature and can therefore easily be used in large data sets where the use of other methods is impractical.
format	Online Article Text
id	pubmed-4076979
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4076979 2014-07-03 A new approach for efficient genotype imputation using information from relatives Sargolzaei, Mehdi Chesnais, Jacques P Schenkel, Flavio S BMC Genomics Methodology Article BioMed Central 2014-06-17 /pmc/articles/PMC4076979/ /pubmed/24935670 http://dx.doi.org/10.1186/1471-2164-15-478 Text en © Sargolzaei et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
title	A new approach for efficient genotype imputation using information from relatives
title_sort	new approach for efficient genotype imputation using information from relatives
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4076979/ https://www.ncbi.nlm.nih.gov/pubmed/24935670 http://dx.doi.org/10.1186/1471-2164-15-478
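
Illustrative sketch (not part of the catalog record and not the authors' implementation): the abstract describes searching the reference group for haplotype matches in overlapping sliding windows, with a smaller window in each successive sweep. The toy Python example below shows one way that idea could work for a single, already phased haplotype. Everything specific in it is an assumption made for illustration: the names `MISSING`, `window_starts` and `impute_by_window_matching`, the 0/1 allele coding, the 50% window overlap, the window sizes, and the rule that a locus is filled only when all matching reference haplotypes agree. Family (pedigree) imputation, phasing and genotyping errors are not modelled.

```python
from typing import List, Sequence

MISSING = -1  # assumed marker for an untyped allele (not taken from the article)


def window_starts(n_loci: int, size: int, step: int) -> List[int]:
    """Start indices of overlapping windows that together cover all loci."""
    last = max(n_loci - size, 0)
    starts = list(range(0, last + 1, step))
    if starts[-1] != last:  # make sure the end of the chromosome is covered
        starts.append(last)
    return starts


def impute_by_window_matching(
    target: Sequence[int],
    reference: Sequence[Sequence[int]],
    window_sizes: Sequence[int] = (8, 4, 2),  # hypothetical sizes, long to short
    min_known: int = 2,                       # hypothetical evidence threshold
) -> List[int]:
    """Fill untyped alleles of one haplotype from matching reference haplotypes.

    Assumes the reference haplotypes are fully typed and as long as the target.
    """
    result = list(target)
    n_loci = len(result)
    for size in window_sizes:          # one chromosome sweep per window size
        step = max(1, size // 2)       # assumed 50% overlap between windows
        for start in window_starts(n_loci, size, step):
            end = min(n_loci, start + size)
            known = [i for i in range(start, end) if result[i] != MISSING]
            missing = [i for i in range(start, end) if result[i] == MISSING]
            if len(known) < min_known or not missing:
                continue
            # Reference haplotypes identical to the target at every known locus.
            matches = [h for h in reference
                       if all(h[i] == result[i] for i in known)]
            for i in missing:
                alleles = {h[i] for h in matches}
                if len(alleles) == 1:  # fill only when the matches agree
                    result[i] = alleles.pop()
    return result


if __name__ == "__main__":
    reference = [
        [0, 1, 1, 0, 1, 0, 0, 1],
        [1, 0, 1, 1, 0, 0, 1, 0],
        [0, 1, 0, 0, 1, 1, 0, 1],
    ]
    # Target typed only at loci 0, 2, 5 and 7 (a sparse panel).
    print(impute_by_window_matching([1, -1, 1, -1, -1, 1, -1, 1], reference))
```

In this toy input no single reference haplotype matches the target across the full eight-locus window, so the first sweep fills nothing. The sweep with windows of four loci then matches the left half to the second reference haplotype and the right half to the third, which mirrors the abstract's point that shorter windows capture more distant relationships.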

