A multi-threaded approach to genotype pattern mining for detecting digenic disease genes
To locate disease-causing DNA variants on the human gene map, the customary approach has been to carry out a genome-wide association study for one variant after another by testing for genotype frequency differences between individuals affected and unaffected with disease. So-called digenic traits ar...
Main authors: | Zhang, Qingrun; Bhatia, Muskan; Park, Taesung; Ott, Jurg |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10483394/ https://www.ncbi.nlm.nih.gov/pubmed/37693313 http://dx.doi.org/10.3389/fgene.2023.1222517 |
_version_ | 1785102371551969280 |
---|---|
author | Zhang, Qingrun Bhatia, Muskan Park, Taesung Ott, Jurg |
author_sort | Zhang, Qingrun |
collection | PubMed |
description | To locate disease-causing DNA variants on the human gene map, the customary approach has been to carry out a genome-wide association study for one variant after another by testing for genotype frequency differences between individuals affected and unaffected with disease. So-called digenic traits are due to the combined effects of two variants, often on different chromosomes, while individual variants may have little or no effect on disease. Machine learning approaches have been developed to find variant pairs underlying digenic traits. However, many of these methods have large memory requirements so that only small datasets can be analyzed. The increasing availability of desktop computers with large numbers of processors and suitable programming to distribute the workload evenly over all processors in a machine make a new and relatively straightforward approach possible, that is, to evaluate all existing variant and genotype pairs for disease association. We present a prototype of such a method with two components, Vpairs and Gpairs, and demonstrate its advantages over existing implementations of such well-known algorithms as Apriori and FP-growth. We apply these methods to published case-control datasets on age-related macular degeneration and Parkinson disease and construct an ROC curve for a large set of genotype patterns. |
format | Online Article Text |
id | pubmed-10483394 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10483394 2023-09-08 A multi-threaded approach to genotype pattern mining for detecting digenic disease genes Zhang, Qingrun Bhatia, Muskan Park, Taesung Ott, Jurg Front Genet Genetics Frontiers Media S.A. 2023-08-24 /pmc/articles/PMC10483394/ /pubmed/37693313 http://dx.doi.org/10.3389/fgene.2023.1222517 Text en Copyright © 2023 Zhang, Bhatia, Park and Ott. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | A multi-threaded approach to genotype pattern mining for detecting digenic disease genes |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10483394/ https://www.ncbi.nlm.nih.gov/pubmed/37693313 http://dx.doi.org/10.3389/fgene.2023.1222517 |