Cargando…

Are Molecular Haplotypes Worth the Time and Expense? A Cost-Effective Method for Applying Molecular Haplotypes

Because current molecular haplotyping methods are expensive and not amenable to automation, many researchers rely on statistical methods to infer haplotype pairs from multilocus genotypes, and subsequently treat these inferred haplotype pairs as observations. These procedures are prone to haplotype...

Descripción completa

Detalles Bibliográficos
Autores principales: Levenstien, Mark A, Ott, Jürg, Gordon, Derek
Formato: Texto
Lenguaje:English
Publicado: Public Library of Science 2006
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1550282/
https://www.ncbi.nlm.nih.gov/pubmed/16933998
http://dx.doi.org/10.1371/journal.pgen.0020127
Descripción
Sumario:Because current molecular haplotyping methods are expensive and not amenable to automation, many researchers rely on statistical methods to infer haplotype pairs from multilocus genotypes, and subsequently treat these inferred haplotype pairs as observations. These procedures are prone to haplotype misclassification. We examine the effect of these misclassification errors on the false-positive rate and power for two association tests. These tests include the standard likelihood ratio test (LRT(std)) and a likelihood ratio test that employs a double-sampling approach to allow for the misclassification inherent in the haplotype inference procedure (LRT(ae)). We aim to determine the cost–benefit relationship of increasing the proportion of individuals with molecular haplotype measurements in addition to genotypes to raise the power gain of the LRT(ae) over the LRT(std). This analysis should provide a guideline for determining the minimum number of molecular haplotypes required for desired power. Our simulations under the null hypothesis of equal haplotype frequencies in cases and controls indicate that (1) for each statistic, permutation methods maintain the correct type I error; (2) specific multilocus genotypes that are misclassified as the incorrect haplotype pair are consistently misclassified throughout each entire dataset; and (3) our simulations under the alternative hypothesis showed a significant power gain for the LRT(ae) over the LRT(std) for a subset of the parameter settings. Permutation methods should be used exclusively to determine significance for each statistic. For fixed cost, the power gain of the LRT(ae) over the LRT(std) varied depending on the relative costs of genotyping, molecular haplotyping, and phenotyping. The LRT(ae) showed the greatest benefit over the LRT(std) when the cost of phenotyping was very high relative to the cost of genotyping. This situation is likely to occur in a replication study as opposed to a whole-genome association study.