
A heuristic method for fast and accurate phasing and imputation of single-nucleotide polymorphism data in bi-parental plant populations

Key message: New fast and accurate method for phasing and imputation of SNP chip genotypes within diploid bi-parental plant populations. Abstract: This paper presents a new heuristic method for phasing and imputation of genomic data in diploid plant species. Our method, called AlphaPlantImp...


Bibliographic Details
Main authors: Gonen, Serap; Wimmer, Valentin; Gaynor, R. Chris; Byrne, Ed; Gorjanc, Gregor; Hickey, John M.
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2018
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6208939/
https://www.ncbi.nlm.nih.gov/pubmed/30078163
http://dx.doi.org/10.1007/s00122-018-3156-9
collection PubMed
description Key message: New fast and accurate method for phasing and imputation of SNP chip genotypes within diploid bi-parental plant populations. Abstract: This paper presents a new heuristic method for phasing and imputation of genomic data in diploid plant species. Our method, called AlphaPlantImpute, explicitly leverages features of plant breeding programmes to maximise the accuracy of imputation. These features are a small number of parents, which can be inbred and usually have high-density genomic data, and few recombinations separating parents from focal individuals genotyped at low density (i.e. the descendants that are the imputation targets). AlphaPlantImpute works in roughly three steps. First, it identifies informative low-density genotype markers in the parents. Second, it tracks the inheritance of parental alleles and haplotypes to focal individuals at the informative markers. Finally, it uses this low-density information as anchor points to impute focal individuals to high density. We tested the imputation accuracy of AlphaPlantImpute in simulated bi-parental populations across different scenarios and compared it with that of the existing software PlantImpute. In general, the imputation accuracy of AlphaPlantImpute was better than or equal to that of PlantImpute, and its computational time and memory requirements were far smaller. For example, the accuracy of imputation was 0.96 for a scenario where both parents were inbred and genotyped at 25,000 markers per chromosome and a focal F2 individual was genotyped at 50 markers per chromosome. This scenario required at most 0.08 GB of memory and took 37 s to complete.
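The three steps named in the abstract can be illustrated with a toy sketch. This is not the AlphaPlantImpute implementation, whose details the abstract does not give. It assumes two fully inbred parents with genotypes coded 0 or 2, focal genotypes coded 0, 1 or 2, no genotyping errors, and a simple rule that leaves a marker unimputed when its two flanking anchors disagree. All function names and the MISSING code are hypothetical.

```python
import bisect

MISSING = -1  # hypothetical code for an unknown or unimputed genotype


def informative_markers(parent_a, parent_b, ld_positions):
    """Step 1: keep low-density markers where the inbred parents are
    homozygous for different alleles (one parent is 0, the other is 2)."""
    return [p for p in ld_positions if {parent_a[p], parent_b[p]} == {0, 2}]


def parent_b_dosage(parent_b, focal_ld, anchors):
    """Step 2: at each informative marker, count how many of the focal
    individual's two haplotypes were inherited from parent B."""
    dosage = {}
    for p in anchors:
        g = focal_ld.get(p, MISSING)
        if g == MISSING:
            continue
        # If parent B carries allele dosage 2, the focal dosage counts B
        # haplotypes directly; otherwise it counts A haplotypes.
        dosage[p] = g if parent_b[p] == 2 else 2 - g
    return dosage


def impute(parent_a, parent_b, dosage):
    """Step 3: use the anchors to fill in every high-density marker.
    A marker is imputed only when its nearest anchors on both sides agree
    on parental origin (an assumption of this sketch)."""
    n = len(parent_a)
    imputed = [MISSING] * n
    anchors = sorted(dosage)
    if not anchors:
        return imputed
    for i in range(n):
        lo = bisect.bisect_right(anchors, i) - 1  # last anchor at or before i
        hi = bisect.bisect_left(anchors, i)       # first anchor at or after i
        left = anchors[max(lo, 0)]
        right = anchors[min(hi, len(anchors) - 1)]
        if dosage[left] != dosage[right]:
            continue  # a recombination lies between the anchors
        if parent_a[i] not in (0, 2) or parent_b[i] not in (0, 2):
            continue  # parents must be homozygous at this marker
        d = dosage[left]
        imputed[i] = ((2 - d) * parent_a[i] + d * parent_b[i]) // 2
    return imputed


if __name__ == "__main__":
    # Ten high-density markers; the focal individual is genotyped at four.
    parent_a = [0, 0, 2, 0, 2, 0, 0, 2, 0, 2]
    parent_b = [2, 0, 0, 2, 0, 2, 2, 0, 2, 2]
    focal_ld = {0: 0, 1: 0, 4: 2, 8: 1}
    anchors = informative_markers(parent_a, parent_b, sorted(focal_ld))
    dosage = parent_b_dosage(parent_b, focal_ld, anchors)
    print(impute(parent_a, parent_b, dosage))
```

In this sketch, markers between two anchors with different parental origin stay missing rather than being guessed, which keeps the example short; a real method would need a rule for placing the recombination.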
format Online
Article
Text
id pubmed-6208939
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Springer Berlin Heidelberg
record_format MEDLINE/PubMed
spelling pubmed-6208939 2018-11-09 Theor Appl Genet Original Article
Springer Berlin Heidelberg 2018-08-04 2018 /pmc/articles/PMC6208939/ /pubmed/30078163 http://dx.doi.org/10.1007/s00122-018-3156-9 Text en © The Author(s) 2018. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
topic Original Article