Sequence imputation from low density single nucleotide polymorphism panel in a black poplar breeding population

BACKGROUND: Genomic selection accuracy increases with the use of high SNP (single nucleotide polymorphism) coverage. However, such gains in coverage come at high costs, preventing their prompt operational implementation by breeders. Low density panels imputed to higher densities offer a cheaper alte...

Bibliographic Details
Main authors: Pégard, Marie, Rogier, Odile, Bérard, Aurélie, Faivre-Rampant, Patricia, Paslier, Marie-Christine Le, Bastien, Catherine, Jorge, Véronique, Sánchez, Leopoldo
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6471894/
https://www.ncbi.nlm.nih.gov/pubmed/30999856
http://dx.doi.org/10.1186/s12864-019-5660-y
author Pégard, Marie
Rogier, Odile
Bérard, Aurélie
Faivre-Rampant, Patricia
Paslier, Marie-Christine Le
Bastien, Catherine
Jorge, Véronique
Sánchez, Leopoldo
collection PubMed
description BACKGROUND: Genomic selection accuracy increases with the use of high SNP (single nucleotide polymorphism) coverage. However, such gains in coverage come at high costs, preventing their prompt operational implementation by breeders. Low density panels imputed to higher densities offer a cheaper alternative during the first stages of genomic resources development. Our study is the first to explore the imputation in a tree species: black poplar. About 1000 purebred Populus nigra trees from a breeding population were selected and genotyped with a 12K custom Infinium Bead-Chip. Forty-three of those individuals corresponding to nodal trees in the pedigree were fully sequenced (reference), while the remaining majority (target) was imputed from 8K to 1.4 million SNPs using FImpute. Each SNP and individual was evaluated for imputation errors by leave-one-out cross validation in the training sample of 43 sequenced trees. Summary statistics such as Hardy-Weinberg Equilibrium exact test p-value, quality of sequencing, depth of sequencing per site and per individual, minor allele frequency, marker density ratio and SNP information redundancy were calculated. Principal component and Boruta analyses were used on all these parameters to rank the factors affecting the quality of imputation. Additionally, we characterized the impact of the relatedness between the reference population and the target population. RESULTS: During the imputation process, we used 7540 SNPs from the chip to impute 1,438,827 SNPs from sequences. At the individual level, imputation accuracy was high with a proportion of SNPs correctly imputed between 0.84 and 0.99. The variation in accuracies was mostly due to differences in relatedness between individuals. At the SNP level, the imputation quality depended on genotyped SNP density and on the original minor allele frequency. The imputation did not appear to result in an increase of linkage disequilibrium. The genotype densification brought a better distribution of markers along the genome, and we did not detect any substantial bias in annotation categories. CONCLUSIONS: This study shows that it is possible to impute low-density marker panels to whole genome sequence with good accuracy under certain conditions that could be common to many breeding populations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-019-5660-y) contains supplementary material, which is available to authorized users.
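The abstract reports per-individual accuracy as the proportion of SNPs correctly imputed, estimated by leave-one-out cross validation over the sequenced reference trees. The Python sketch below illustrates only that bookkeeping. It is not FImpute and not the authors' code: the 0/1/2 genotype coding, the function names, and the `majority_imputer` placeholder (which fills each hidden SNP with the most common genotype in the remaining reference individuals) are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Set

# Assumed coding: 0, 1 or 2 copies of the alternate allele at each SNP.
Genotypes = List[int]
Imputer = Callable[[List[Genotypes], Dict[int, int], int], Genotypes]


def concordance(true: Sequence[int], imputed: Sequence[int]) -> float:
    """Proportion of SNPs whose imputed genotype equals the true genotype."""
    if len(true) != len(imputed) or len(true) == 0:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    return sum(t == i for t, i in zip(true, imputed)) / len(true)


def minor_allele_frequency(column: Sequence[int]) -> float:
    """Minor allele frequency of one SNP across diploid individuals."""
    alt_freq = sum(column) / (2 * len(column))
    return min(alt_freq, 1.0 - alt_freq)


def majority_imputer(reference: List[Genotypes],
                     observed: Dict[int, int],
                     n_snps: int) -> Genotypes:
    """Hypothetical stand-in for a real imputation tool.

    Keeps genotypes observed on the chip and fills every other SNP with
    the most common genotype in the reference (ties go to the smallest code).
    """
    imputed = []
    for j in range(n_snps):
        if j in observed:
            imputed.append(observed[j])
        else:
            counts = Counter(individual[j] for individual in reference)
            imputed.append(max(sorted(counts), key=counts.__getitem__))
    return imputed


def leave_one_out(panel: List[Genotypes],
                  chip_idx: Set[int],
                  imputer: Imputer) -> List[float]:
    """Per-individual concordance on the SNPs that were hidden from the imputer."""
    accuracies = []
    for k, held_out in enumerate(panel):
        reference = panel[:k] + panel[k + 1:]          # everyone except individual k
        observed = {j: held_out[j] for j in chip_idx}  # only chip SNPs stay visible
        imputed = imputer(reference, observed, len(held_out))
        hidden = [j for j in range(len(held_out)) if j not in chip_idx]
        accuracies.append(concordance([held_out[j] for j in hidden],
                                      [imputed[j] for j in hidden]))
    return accuracies


if __name__ == "__main__":
    # Toy panel: 4 individuals x 4 SNPs, with SNP 0 standing in for the chip.
    toy_panel = [[0, 1, 2, 0], [0, 1, 2, 0], [0, 1, 2, 1], [2, 0, 2, 0]]
    print(leave_one_out(toy_panel, {0}, majority_imputer))
```

Scoring only the hidden SNPs keeps the chip genotypes, which are known rather than imputed, from inflating the accuracy. A per-SNP version of the same measure would apply `concordance` down columns instead of across rows.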
format Online
Article
Text
id pubmed-6471894
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6471894 2019-04-24 BMC Genomics Research Article BioMed Central 2019-04-18 /pmc/articles/PMC6471894/ /pubmed/30999856 http://dx.doi.org/10.1186/s12864-019-5660-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Sequence imputation from low density single nucleotide polymorphism panel in a black poplar breeding population
topic Research Article