
A multi-threaded approach to genotype pattern mining for detecting digenic disease genes

To locate disease-causing DNA variants on the human gene map, the customary approach has been to carry out a genome-wide association study for one variant after another by testing for genotype frequency differences between individuals affected and unaffected with disease. So-called digenic traits ar...


Bibliographic Details
Main Authors: Zhang, Qingrun; Bhatia, Muskan; Park, Taesung; Ott, Jurg
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10483394/
https://www.ncbi.nlm.nih.gov/pubmed/37693313
http://dx.doi.org/10.3389/fgene.2023.1222517
author Zhang, Qingrun
Bhatia, Muskan
Park, Taesung
Ott, Jurg
collection PubMed
description To locate disease-causing DNA variants on the human gene map, the customary approach has been to carry out a genome-wide association study for one variant after another by testing for genotype frequency differences between individuals affected and unaffected with disease. So-called digenic traits are due to the combined effects of two variants, often on different chromosomes, while individual variants may have little or no effect on disease. Machine learning approaches have been developed to find variant pairs underlying digenic traits. However, many of these methods have large memory requirements so that only small datasets can be analyzed. The increasing availability of desktop computers with large numbers of processors and suitable programming to distribute the workload evenly over all processors in a machine make a new and relatively straightforward approach possible, that is, to evaluate all existing variant and genotype pairs for disease association. We present a prototype of such a method with two components, Vpairs and Gpairs, and demonstrate its advantages over existing implementations of such well-known algorithms as Apriori and FP-growth. We apply these methods to published case-control datasets on age-related macular degeneration and Parkinson disease and construct an ROC curve for a large set of genotype patterns.
format Online
Article
Text
id pubmed-10483394
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10483394 2023-09-08 Front Genet Genetics Frontiers Media S.A. 2023-08-24 Copyright © 2023 Zhang, Bhatia, Park and Ott. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A multi-threaded approach to genotype pattern mining for detecting digenic disease genes
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10483394/
https://www.ncbi.nlm.nih.gov/pubmed/37693313
http://dx.doi.org/10.3389/fgene.2023.1222517