Estimating linkage disequilibrium from genotypes under Hardy-Weinberg equilibrium

BACKGROUND: Measures of linkage disequilibrium (LD) play a key role in a wide range of applications from disease association to demographic history estimation. The true population LD cannot be measured directly and instead can only be inferred from genetic samples, which are unavoidably subject to m...

Bibliographic Details
Main authors: Hui, Tin-Yu J., Burt, Austin
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7045472/
https://www.ncbi.nlm.nih.gov/pubmed/32102657
http://dx.doi.org/10.1186/s12863-020-0818-9
author Hui, Tin-Yu J.
Burt, Austin
collection PubMed
description BACKGROUND: Measures of linkage disequilibrium (LD) play a key role in a wide range of applications, from disease association to demographic history estimation. The true population LD cannot be measured directly and can only be inferred from genetic samples, which are unavoidably subject to measurement error. Previous results on r² (a measure of LD), such as its bias due to finite sample size and its variance, were derived for the special case in which the true population-wise LD is zero. These results generally do not hold for non-zero [Formula: see text] values, which are more common in real genetic data. RESULTS: This work generalises the estimation of r² to all levels of LD, for both phased and unphased data. First, we provide new formulae for the effect of finite sample size on the observed r² values. Second, we find a new empirical formula for the variance of the observed r², equal to 2E[r²](1 − E[r²])/n, where n is the diploid sample size. Third, we propose a new routine, Constrained ML, a likelihood-based method to directly estimate haplotype frequencies and r² from diploid genotypes under Hardy-Weinberg Equilibrium. While serving the same purpose as the pre-existing Expectation-Maximisation algorithm, the new routine can have better convergence and is simpler to use. A new likelihood-ratio test is also introduced to test for the absence of a particular haplotype. Extensive simulations are run to support these findings. CONCLUSION: Most inferences on LD will benefit from our new findings, from point and interval estimation to hypothesis testing. Genetic analyses utilising r² information will become more accurate as a result.
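As an illustration only, the empirical variance formula quoted in the description, 2E[r²](1 − E[r²])/n, can be evaluated with a few lines of Python. This is a minimal sketch and not the authors' software: the function names `r2_variance` and `r2_standard_error` are hypothetical, and any value passed in for E[r²] is an assumption made by the caller.

```python
import math


def r2_variance(expected_r2: float, n: int) -> float:
    """Variance of the observed r^2 using the empirical formula quoted
    in the abstract: 2 * E[r^2] * (1 - E[r^2]) / n.

    expected_r2 -- assumed value of E[r^2], between 0 and 1
    n           -- diploid sample size (number of individuals)
    """
    if not 0.0 <= expected_r2 <= 1.0:
        raise ValueError("expected_r2 must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return 2.0 * expected_r2 * (1.0 - expected_r2) / n


def r2_standard_error(expected_r2: float, n: int) -> float:
    """Square root of the variance above (hypothetical helper)."""
    return math.sqrt(r2_variance(expected_r2, n))


if __name__ == "__main__":
    # Example: E[r^2] = 0.5 with 100 diploid individuals.
    print(r2_variance(0.5, 100))
    print(r2_standard_error(0.5, 100))
```

For a fixed n the expression is largest at E[r²] = 0.5 and falls to zero at E[r²] = 0 and E[r²] = 1, which follows directly from its algebraic form.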
format Online
Article
Text
id pubmed-7045472
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7045472 2020-03-03 BMC Genet Research Article BioMed Central 2020-02-26 /pmc/articles/PMC7045472/ /pubmed/32102657 http://dx.doi.org/10.1186/s12863-020-0818-9 Text en © The Author(s) 2020. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Estimating linkage disequilibrium from genotypes under Hardy-Weinberg equilibrium
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7045472/
https://www.ncbi.nlm.nih.gov/pubmed/32102657
http://dx.doi.org/10.1186/s12863-020-0818-9