Phasing quality assessment in a brown layer population through family- and population-based software

Bibliographic Details
Main Authors: Frioni, N., Cavero, D., Simianer, H., Erbe, M.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6636125/
https://www.ncbi.nlm.nih.gov/pubmed/31311514
http://dx.doi.org/10.1186/s12863-019-0759-3
_version_ 1783436009468329984
author Frioni, N.
Cavero, D.
Simianer, H.
Erbe, M.
author_facet Frioni, N.
Cavero, D.
Simianer, H.
Erbe, M.
author_sort Frioni, N.
collection PubMed
description BACKGROUND: Haplotype data contain more information than genotype data and enable applications such as imputing low frequency variants, inferring points of recombination, detecting recurrent mutations, mapping linkage disequilibrium (LD), studying selection signatures, and estimating IBD probabilities. In addition, haplotype structure is used to assess genetic diversity and expected accuracy in genomic selection programs. Nevertheless, the quality and efficiency of phasing have rarely been the subject of thorough study and have mainly been assessed as a by-product of imputation quality studies. Moreover, phasing studies based on data of a poultry population are non-existent. The aim of this study was to evaluate the phasing quality of FImpute and Beagle, two of the most widely used phasing programs. RESULTS: We simulated ten replicated samples of a layer population comprising 888 individuals from a real 580 k SNP dataset and a pedigree of 12 generations. Chromosomes analyzed were 1, 7 and 20. We measured the percentage of SNPs that were phased equally between true and phased haplotypes (Eqp), the proportion of individuals completely correctly phased, the number of incorrectly phased SNPs or Breakpoints (Bkp), and the length of inverted haplotype segments. Results were obtained for three groups of individuals: those with no parents or offspring genotyped in the dataset, those with only one parent genotyped, and those with both parents genotyped. The phasing was performed with Beagle (v3.3 and v4.1) and FImpute v2.2 (with and without pedigree). Eqp values ranged from 88 to 100%, with the best results from haplotypes phased with Beagle v4.1 and FImpute with pedigree information and at least one parent genotyped. FImpute haplotypes showed a higher number of Bkp than Beagle. As a consequence, switched haplotype segments were longer for Beagle than for FImpute. CONCLUSION: We concluded that, for the dataset used in this study, Beagle v4.1 or FImpute with pedigree information and at least one parent genotyped in the data set were the best alternatives for obtaining high quality phased haplotypes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12863-019-0759-3) contains supplementary material, which is available to authorized users.
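The phasing metrics named in the abstract (Eqp, Bkp, and the length of inverted segments) can be illustrated with a minimal Python sketch for a single individual. This is not the authors' implementation, and the abstract does not give exact definitions, so the following are assumptions: only heterozygous SNPs are compared, the majority orientation is taken as the reference because haplotype labels are arbitrary, Bkp is counted as the number of orientation changes between consecutive heterozygous SNPs, and segment length is counted in SNPs rather than base pairs. The function name `phasing_metrics` and its arguments are hypothetical.

```python
from itertools import groupby


def phasing_metrics(true_a, true_b, phased_a, phased_b):
    """Compare a phased haplotype pair with the true pair for one individual.

    Haplotypes are equal-length sequences of 0/1 alleles. The phased pair is
    assumed to carry the same genotypes as the true pair (no genotype errors),
    which is why phased_b is never inspected directly.
    """
    # Orientation at each heterozygous SNP: 0 if phased_a matches true_a,
    # 1 if the two alleles are swapped between the haplotypes.
    orient = [
        int(pa != ta)
        for ta, tb, pa in zip(true_a, true_b, phased_a)
        if ta != tb
    ]
    if not orient:
        # No heterozygous SNPs, so there is nothing to phase.
        return {"eqp": 100.0, "bkp": 0, "inverted_lengths": []}

    # Haplotype labels are arbitrary, so use the majority orientation
    # as the reference (assumption of this sketch).
    swapped = sum(orient)
    ref = 1 if swapped > len(orient) - swapped else 0

    # Eqp: percentage of heterozygous SNPs phased equally to the truth.
    eqp = 100.0 * sum(1 for o in orient if o == ref) / len(orient)

    # Bkp: number of orientation changes between consecutive het SNPs.
    bkp = sum(1 for x, y in zip(orient, orient[1:]) if x != y)

    # Lengths (in het SNPs) of runs phased opposite to the reference.
    inverted_lengths = [
        len(list(run)) for key, run in groupby(orient) if key != ref
    ]
    return {"eqp": eqp, "bkp": bkp, "inverted_lengths": inverted_lengths}


# Toy example: six heterozygous SNPs, the third and fourth are swapped.
true_a = [0, 1, 0, 1, 1, 0, 0, 1]
true_b = [1, 0, 1, 0, 1, 1, 0, 0]
phased_a = [0, 1, 1, 0, 1, 0, 0, 1]
phased_b = [1, 0, 0, 1, 1, 1, 0, 0]
metrics = phasing_metrics(true_a, true_b, phased_a, phased_b)
```

In the toy example, one inverted segment of two SNPs produces two breakpoints. Under these assumed definitions, a phasing with many breakpoints tends to have short inverted segments, while one with few breakpoints can still leave long segments switched.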
format Online
Article
Text
id pubmed-6636125
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6636125 2019-07-25 Phasing quality assessment in a brown layer population through family- and population-based software Frioni, N. Cavero, D. Simianer, H. Erbe, M. BMC Genet Research Article BioMed Central 2019-07-17 /pmc/articles/PMC6636125/ /pubmed/31311514 http://dx.doi.org/10.1186/s12863-019-0759-3 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Frioni, N.
Cavero, D.
Simianer, H.
Erbe, M.
Phasing quality assessment in a brown layer population through family- and population-based software
title Phasing quality assessment in a brown layer population through family- and population-based software
title_full Phasing quality assessment in a brown layer population through family- and population-based software
title_fullStr Phasing quality assessment in a brown layer population through family- and population-based software
title_full_unstemmed Phasing quality assessment in a brown layer population through family- and population-based software
title_short Phasing quality assessment in a brown layer population through family- and population-based software
title_sort phasing quality assessment in a brown layer population through family- and population-based software
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6636125/
https://www.ncbi.nlm.nih.gov/pubmed/31311514
http://dx.doi.org/10.1186/s12863-019-0759-3
work_keys_str_mv AT frionin phasingqualityassessmentinabrownlayerpopulationthroughfamilyandpopulationbasedsoftware
AT caverod phasingqualityassessmentinabrownlayerpopulationthroughfamilyandpopulationbasedsoftware
AT simianerh phasingqualityassessmentinabrownlayerpopulationthroughfamilyandpopulationbasedsoftware
AT erbem phasingqualityassessmentinabrownlayerpopulationthroughfamilyandpopulationbasedsoftware