Automatic block-wise genotype-phenotype association detection based on hidden Markov model

Bibliographic Details
Main authors: Du, Jin; Wang, Chaojie; Wang, Lijun; Mao, Shanjun; Zhu, Bencong; Li, Zheng; Fan, Xiaodan
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10082540/
https://www.ncbi.nlm.nih.gov/pubmed/37029361
http://dx.doi.org/10.1186/s12859-023-05265-5
Description
Summary: BACKGROUND: For detecting genotype-phenotype association from case–control single nucleotide polymorphism (SNP) data, one class of methods relies on testing each genomic variant site individually. However, this approach ignores the tendency for associated variant sites to be spatially clustered instead of uniformly distributed along the genome. Therefore, a more recent class of methods looks for blocks of influential variant sites. Unfortunately, existing methods of this kind either assume prior knowledge of the blocks or rely on ad hoc moving windows. A principled method is needed to automatically detect genomic variant blocks that are associated with the phenotype. RESULTS: In this paper, we introduce an automatic block-wise Genome-Wide Association Study (GWAS) method based on a hidden Markov model. Using case–control SNP data as input, our method detects the number of blocks associated with the phenotype and the locations of the blocks. Correspondingly, the minor allele of each variant site is classified as having negative influence, no influence, or positive influence on the phenotype. We evaluated our method using both datasets simulated from our model and datasets from a block model different from ours, and compared its performance with that of other methods. These included both simple methods based on Fisher's exact test, applied site by site, and more complex methods built into the recent Zoom-Focus Algorithm. Across all simulations, our method consistently outperformed the comparison methods. CONCLUSIONS: Given its demonstrated better performance, we expect that our algorithm for detecting influential variant sites may help find more accurate signals across a wide range of case–control GWAS.
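
The three-way classification described in the summary (negative, no, or positive influence per site, with associated sites forming contiguous blocks) can be illustrated with a toy hidden Markov model decoded by the Viterbi algorithm. The sketch below is not the authors' model. The per-site score (for example, a case-minus-control difference in minor allele frequency), the Gaussian emissions, the parameter values `effect`, `sd`, and `stay`, and the function names `viterbi_states` and `extract_blocks` are all assumptions made for illustration.

```python
import math

# Hidden states: -1 = negative influence, 0 = no influence, +1 = positive influence.
STATES = (-1, 0, 1)


def log_gauss(x, mean, sd):
    """Log density of a normal distribution (assumed emission model)."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_states(scores, effect=0.2, sd=0.1, stay=0.9):
    """Most likely state per site under a toy 3-state HMM.

    scores: hypothetical per-site association scores, one per variant site.
    effect, sd, stay: illustrative values, not estimated from any data.
    """
    if not scores:
        return []
    n = len(STATES)
    switch = (1.0 - stay) / (n - 1)
    log_trans = [[math.log(stay if i == j else switch) for j in range(n)]
                 for i in range(n)]
    means = [s * effect for s in STATES]

    # Uniform initial distribution over the three states.
    best = [math.log(1.0 / n) + log_gauss(scores[0], m, sd) for m in means]
    back = []
    for x in scores[1:]:
        pointers, new_best = [], []
        for j in range(n):
            i_best = max(range(n), key=lambda i: best[i] + log_trans[i][j])
            pointers.append(i_best)
            new_best.append(best[i_best] + log_trans[i_best][j]
                            + log_gauss(x, means[j], sd))
        back.append(pointers)
        best = new_best

    # Trace back from the best final state.
    k = max(range(n), key=lambda i: best[i])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return [STATES[k] for k in path]


def extract_blocks(path):
    """Return (start, end, state) for each maximal run of a non-zero state.

    Indices are 0-based and end is inclusive.
    """
    blocks, start = [], None
    for idx, s in enumerate(path + [0]):  # trailing 0 closes an open block
        if start is not None and s != path[start]:
            blocks.append((start, idx - 1, path[start]))
            start = None
        if start is None and s != 0 and idx < len(path):
            start = idx
    return blocks


# Synthetic scores: one positive block and one negative block.
demo_scores = [0.0] * 3 + [0.2] * 5 + [0.0] * 3 + [-0.2] * 4 + [0.0] * 3
demo_path = viterbi_states(demo_scores)
demo_blocks = extract_blocks(demo_path)
```

Because the transition matrix favors staying in the same state, isolated noisy sites tend to be absorbed into their surroundings, while runs of consistent signal are reported as blocks without fixing a window size in advance. The number and locations of blocks then fall out of the decoded path rather than being specified beforehand.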