Study of the optimum haplotype length to build genomic relationship matrices


Bibliographic Details
Main Authors:	Ferdosi, Mohammad H., Henshall, John, Tier, Bruce
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5043651/ https://www.ncbi.nlm.nih.gov/pubmed/27687320 http://dx.doi.org/10.1186/s12711-016-0253-6

collection	PubMed
description	BACKGROUND: As genomic data becomes more abundant, genomic prediction is more routinely used to estimate breeding values. In genomic prediction, the relationship matrix ($\mathbf{A}$), which is traditionally used in genetic evaluations, is replaced by the genomic relationship matrix ($\mathbf{G}$). This paper considers alternative ways of building relationship matrices, using either single markers or haplotypes of different lengths. We compared the prediction accuracies and log-likelihoods obtained with these alternative relationship matrices and with the traditional $\mathbf{A}$ matrix, for real and simulated data. METHODS: For real data, we built relationship matrices using 50k genotype data for a population of Brahman cattle to analyze three traits: scrotal circumference (SC), age at puberty (AGECL) and weight at first corpus luteum (WTCL). Haplotypes were phased with hsphase and imputed with BEAGLE. The relationship matrices were built using three methods based on haplotypes of different lengths. The log-likelihood was used to define the optimum haplotype length for each trait and each haplotype-based relationship matrix. RESULTS: Based on simulated data, we showed that the inverse of the $\mathbf{G}$ matrix and the inverse of the haplotype relationship matrices for methods using phased haplotypes of one single nucleotide polymorphism (SNP) provided coefficients of determination ($R^2$) close to 1, although the estimated genetic variances differed across methods. Using real data, relationship matrices built from haplotype segments of multiple SNPs provided better results than the $\mathbf{G}$ matrix based on one-SNP haplotypes. However, the optimal haplotype length to achieve the highest log-likelihood depended on the method used and the trait. The optimal haplotype length (7 to 8 SNPs) was similar for SC and AGECL. One of the haplotype-based methods achieved the largest increase in log-likelihood for SC, i.e. from −1330 when using $\mathbf{G}$ to −1325 when using haplotypes with eight SNPs. CONCLUSIONS: Building the relationship matrix by using haplotypes that comprise multiple SNPs will increase the accuracy of estimated breeding values. However, the optimum haplotype length that shows the correct relationship among individuals for each trait can be derived from the data.
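The abstract contrasts a single-marker genomic relationship matrix with matrices built from haplotypes of several SNPs, but it does not spell out the three haplotype-based methods. The following Python sketch is therefore only an illustration under stated assumptions, not the authors' procedure: `marker_based_g` uses a commonly used marker-based formula (centred allele counts scaled by 2Σp(1−p)), and `haplotype_block_matrix` counts identical haplotypes in non-overlapping blocks of `block_len` SNPs. Both function names, the block scheme and the toy data are hypothetical.

```python
import numpy as np


def marker_based_g(genotypes):
    """Single-marker relationship matrix from 0/1/2 allele counts.

    Assumption: the widely used centred-and-scaled form; the abstract
    does not state which formula the authors used.
    """
    m = np.asarray(genotypes, dtype=float)        # shape (n_individuals, n_snps)
    p = m.mean(axis=0) / 2.0                      # allele frequency per SNP
    z = m - 2.0 * p                               # centre each SNP column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def haplotype_block_matrix(haplotypes, block_len):
    """Hypothetical haplotype-based relationship matrix.

    haplotypes: phased 0/1 alleles, shape (n_individuals, 2, n_snps).
    For each non-overlapping block of block_len SNPs, count how many of
    the four haplotype pairs between two individuals are identical over
    the whole block, then average over blocks and divide by 2.
    """
    h = np.asarray(haplotypes)
    n, _, n_snps = h.shape
    n_blocks = n_snps // block_len
    if n_blocks == 0:
        raise ValueError("block_len is longer than the number of SNPs")
    k = np.zeros((n, n))
    for b in range(n_blocks):
        block = h[:, :, b * block_len:(b + 1) * block_len]
        # same[i, a, j, c] is True when haplotype a of individual i equals
        # haplotype c of individual j at every SNP in the block.
        same = (block[:, :, None, None, :] == block[None, None, :, :, :]).all(axis=-1)
        k += same.sum(axis=(1, 3))
    return k / (2.0 * n_blocks)


# Toy phased data (made up for illustration): 3 individuals, 4 SNPs.
toy_haplotypes = np.array([
    [[0, 0, 1, 1], [0, 0, 1, 1]],
    [[0, 0, 1, 1], [1, 1, 0, 0]],
    [[1, 1, 0, 0], [1, 1, 0, 1]],
])
toy_genotypes = toy_haplotypes.sum(axis=1)        # unphased 0/1/2 counts

g_single = marker_based_g(toy_genotypes)
k_two_snp = haplotype_block_matrix(toy_haplotypes, block_len=2)
```

With this scaling, the diagonal of the haplotype matrix is 1 for an individual whose two haplotypes differ in every block and 2 for one that is homozygous in every block. Setting `block_len=1` reduces it to a one-SNP haplotype matrix, and the haplotype length could then be varied and compared by log-likelihood, as the abstract describes.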
format	Online Article Text
id	pubmed-5043651
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5043651 2016-10-05 Genet Sel Evol Research Article BioMed Central 2016-09-29 /pmc/articles/PMC5043651/ /pubmed/27687320 http://dx.doi.org/10.1186/s12711-016-0253-6 Text en © The Author(s) 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
