PRIMAL: Fast and Accurate Pedigree-based Imputation from Sequence Data in a Founder Population

Founder populations and large pedigrees offer many well-known advantages for genetic mapping studies, including cost-efficient study designs. Here, we describe PRIMAL (PedigRee IMputation ALgorithm), a fast and accurate pedigree-based phasing and imputation algorithm for founder populations. PRIMAL...

Bibliographic Details
Main authors: Livne, Oren E., Han, Lide, Alkorta-Aranburu, Gorka, Wentworth-Sheilds, William, Abney, Mark, Ober, Carole, Nicolae, Dan L.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4348507/
https://www.ncbi.nlm.nih.gov/pubmed/25735005
http://dx.doi.org/10.1371/journal.pcbi.1004139
author Livne, Oren E.
Han, Lide
Alkorta-Aranburu, Gorka
Wentworth-Sheilds, William
Abney, Mark
Ober, Carole
Nicolae, Dan L.
collection PubMed
description Founder populations and large pedigrees offer many well-known advantages for genetic mapping studies, including cost-efficient study designs. Here, we describe PRIMAL (PedigRee IMputation ALgorithm), a fast and accurate pedigree-based phasing and imputation algorithm for founder populations. PRIMAL incorporates both existing and original ideas, such as a novel indexing strategy of Identity-By-Descent (IBD) segments based on clique graphs. We were able to impute the genomes of 1,317 South Dakota Hutterites, who had genome-wide genotypes for ~300,000 common single nucleotide variants (SNVs), from 98 whole genome sequences. Using a combination of pedigree-based and LD-based imputation, we were able to assign 87% of genotypes with >99% accuracy over the full range of allele frequencies. Using the IBD cliques we were also able to infer the parental origin of 83% of alleles, and genotypes of deceased recent ancestors for whom no genotype information was available. This imputed data set will enable us to better study the relative contribution of rare and common variants on human phenotypes, as well as parental origin effect of disease risk alleles in >1,000 individuals at minimal cost.
format Online
Article
Text
id pubmed-4348507
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4348507 2015-03-06 PLoS Comput Biol Research Article Public Library of Science 2015-03-03 /pmc/articles/PMC4348507/ /pubmed/25735005 http://dx.doi.org/10.1371/journal.pcbi.1004139 Text en © 2015 Livne et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title PRIMAL: Fast and Accurate Pedigree-based Imputation from Sequence Data in a Founder Population
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4348507/
https://www.ncbi.nlm.nih.gov/pubmed/25735005
http://dx.doi.org/10.1371/journal.pcbi.1004139