Phasing quality assessment in a brown layer population through family- and population-based software
BACKGROUND: Haplotype data contains more information than genotype data and provides possibilities such as imputing low frequency variants, inferring points of recombination, detecting recurrent mutations, mapping linkage disequilibrium (LD), studying selection signatures, estimating IBD probabiliti...
Main authors: | Frioni, N., Cavero, D., Simianer, H., Erbe, M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6636125/ https://www.ncbi.nlm.nih.gov/pubmed/31311514 http://dx.doi.org/10.1186/s12863-019-0759-3 |
_version_ | 1783436009468329984 |
---|---|
author | Frioni, N. Cavero, D. Simianer, H. Erbe, M. |
author_facet | Frioni, N. Cavero, D. Simianer, H. Erbe, M. |
author_sort | Frioni, N. |
collection | PubMed |
description | BACKGROUND: Haplotype data contain more information than genotype data and make it possible to impute low-frequency variants, infer points of recombination, detect recurrent mutations, map linkage disequilibrium (LD), study selection signatures, estimate IBD probabilities, etc. In addition, haplotype structure is used to assess genetic diversity and expected accuracy in genomic selection programs. Nevertheless, the quality and efficiency of phasing have rarely been the subject of thorough study and were assessed mainly as a by-product of imputation quality studies. Moreover, phasing studies based on data from a poultry population are non-existent. The aim of this study was to evaluate the phasing quality of FImpute and Beagle, two of the most widely used phasing programs. RESULTS: We simulated ten replicated samples of a layer population comprising 888 individuals from a real 580 k SNP dataset and a pedigree of 12 generations. Chromosomes 1, 7 and 20 were analyzed. We measured the percentage of SNPs that were phased equally between true and phased haplotypes (Eqp), the proportion of individuals phased completely correctly, the number of incorrectly phased SNPs or breakpoints (Bkp), and the length of inverted haplotype segments. Results were obtained for three groups of individuals: those with no parents or offspring genotyped in the dataset, those with only one parent genotyped, and those with both parents genotyped. Phasing was performed with Beagle (v3.3 and v4.1) and FImpute v2.2 (with and without pedigree). Eqp values ranged from 88 to 100%, with the best results for haplotypes phased with Beagle v4.1 and with FImpute using pedigree information and at least one parent genotyped. FImpute haplotypes showed a higher number of Bkp than Beagle haplotypes. As a consequence, switched haplotype segments were longer for Beagle than for FImpute. CONCLUSION: We concluded that, for the dataset used in this study, Beagle v4.1 or FImpute with pedigree information and at least one parent genotyped in the dataset were the best alternatives for obtaining high-quality phased haplotypes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12863-019-0759-3) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6636125 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6636125 2019-07-25 Phasing quality assessment in a brown layer population through family- and population-based software Frioni, N. Cavero, D. Simianer, H. Erbe, M. BMC Genet Research Article BioMed Central 2019-07-17 /pmc/articles/PMC6636125/ /pubmed/31311514 http://dx.doi.org/10.1186/s12863-019-0759-3 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Frioni, N. Cavero, D. Simianer, H. Erbe, M. Phasing quality assessment in a brown layer population through family- and population-based software |
title | Phasing quality assessment in a brown layer population through family- and population-based software |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6636125/ https://www.ncbi.nlm.nih.gov/pubmed/31311514 http://dx.doi.org/10.1186/s12863-019-0759-3 |
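
The abstract names three phasing-quality measures: Eqp, breakpoints (Bkp) and the length of inverted haplotype segments. The record does not give their exact formulas, so the following is only a minimal sketch of one plausible reading, for a single individual. The function name `phasing_metrics`, the 0/1 allele coding, the choice of the majority orientation as the "correct" one, and the measurement of segment length in heterozygous SNPs are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def phasing_metrics(
    true_h1: Sequence[int],
    true_h2: Sequence[int],
    phased_h1: Sequence[int],
) -> Tuple[float, int, List[int]]:
    """Return (Eqp in percent, breakpoint count, inverted segment lengths).

    Hypothetical helper: alleles are coded 0/1 and the phased genotype is
    assumed to be consistent with the true genotype at every SNP.
    """
    n_snps = len(true_h1)
    # Only heterozygous sites carry phase information.
    het = [i for i in range(n_snps) if true_h1[i] != true_h2[i]]
    if not het:
        return 100.0, 0, []

    # True where the phased haplotype carries the allele of true_h1.
    same = [phased_h1[i] == true_h1[i] for i in het]

    # Haplotype labels are arbitrary, so the majority orientation is
    # treated as correct and the minority as inverted (an assumption).
    majority = sum(same) * 2 >= len(same)
    inverted = [s != majority for s in same]

    # Eqp: share of all SNPs whose phase agrees with the true haplotypes.
    eqp = 100.0 * (n_snps - sum(inverted)) / n_snps

    # Breakpoint: orientation changes between neighbouring het sites.
    bkp = sum(1 for a, b in zip(same, same[1:]) if a != b)

    # Length of each run of inverted sites, counted in heterozygous SNPs.
    segments: List[int] = []
    run = 0
    for flag in inverted:
        if flag:
            run += 1
        elif run:
            segments.append(run)
            run = 0
    if run:
        segments.append(run)

    return eqp, bkp, segments


if __name__ == "__main__":
    true_h1 = [0, 1, 0, 1, 1, 0, 0, 1]
    true_h2 = [1, 0, 0, 0, 1, 1, 1, 0]
    # Follows true_h1 at first, then switches to true_h2 for the last two SNPs.
    phased = [0, 1, 0, 1, 1, 0, 1, 0]
    print(phasing_metrics(true_h1, true_h2, phased))
```

Under this reading, a single switch produces one breakpoint and one inverted segment, which is consistent with the abstract's observation that fewer breakpoints (Beagle) go together with longer switched segments.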