
Assessment of the performance of hidden Markov models for imputation in animal breeding

BACKGROUND: In this paper, we review the performance of various hidden Markov model-based imputation methods in animal breeding populations. Traditionally, pedigree and heuristic-based imputation methods have been used for imputation in large animal populations due to their computational efficiency,...


Bibliographic Details
Main authors: Whalen, Andrew; Gorjanc, Gregor; Ros-Freixedes, Roger; Hickey, John M.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6142395/
https://www.ncbi.nlm.nih.gov/pubmed/30223768
http://dx.doi.org/10.1186/s12711-018-0416-8
_version_ 1783355851347591168
author Whalen, Andrew
Gorjanc, Gregor
Ros-Freixedes, Roger
Hickey, John M.
collection PubMed
description BACKGROUND: In this paper, we review the performance of various hidden Markov model-based imputation methods in animal breeding populations. Traditionally, pedigree and heuristic-based imputation methods have been used for imputation in large animal populations due to their computational efficiency, scalability, and accuracy. Recent advances in the area of human genetics have increased the ability of probabilistic hidden Markov model methods to perform accurate phasing and imputation in large populations. These advances may make these methods suitable for routine use in large animal populations, particularly in populations where pedigree information is not readily available. METHODS: To test the performance of hidden Markov model-based imputation, we evaluated the accuracy and computational cost of several methods in a series of simulated populations and a real animal population without using a pedigree. First, we tested single-step (diploid) imputation, which performs both phasing and imputation. Second, we tested pre-phasing followed by haploid imputation. Overall, we used four available diploid imputation methods (fastPHASE, Beagle v4.0, IMPUTE2, and MaCH), three phasing methods (SHAPEIT2, HAPI-UR, and Eagle2), and three haploid imputation methods (IMPUTE2, Beagle v4.1, and Minimac3). RESULTS: We found that performing pre-phasing and haploid imputation was faster and more accurate than diploid imputation. In particular, among all the methods tested, pre-phasing with Eagle2 or HAPI-UR and imputing with Minimac3 or IMPUTE2 gave the highest accuracies with both simulated and real data. CONCLUSIONS: The results of this study suggest that hidden Markov model-based imputation algorithms are an accurate and computationally feasible approach for performing imputation without a pedigree when pre-phasing and haploid imputation are used. Of the algorithms tested, the combination of Eagle2 and Minimac3 gave the highest accuracy across the simulated and real datasets.
format Online
Article
Text
id pubmed-6142395
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6142395 2018-09-20 Assessment of the performance of hidden Markov models for imputation in animal breeding Whalen, Andrew Gorjanc, Gregor Ros-Freixedes, Roger Hickey, John M. Genet Sel Evol Research Article BioMed Central 2018-09-17 /pmc/articles/PMC6142395/ /pubmed/30223768 http://dx.doi.org/10.1186/s12711-018-0416-8 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Assessment of the performance of hidden Markov models for imputation in animal breeding
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6142395/
https://www.ncbi.nlm.nih.gov/pubmed/30223768
http://dx.doi.org/10.1186/s12711-018-0416-8