A Fast Multi-Locus Ridge Regression Algorithm for High-Dimensional Genome-Wide Association Studies
The mixed linear model (MLM) has been widely used in genome-wide association study (GWAS) to dissect quantitative traits in human, animal, and plant genetics. Most methodologies consider all single nucleotide polymorphism (SNP) effects as random effects under the MLM framework, which fail to detect...
Main Authors: | Zhang, Jin; Chen, Min; Wen, Yangjun; Zhang, Yin; Lu, Yunan; Wang, Shengmeng; Chen, Juncong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8041068/ https://www.ncbi.nlm.nih.gov/pubmed/33854527 http://dx.doi.org/10.3389/fgene.2021.649196 |
_version_ | 1783677873988567040 |
---|---|
author | Zhang, Jin Chen, Min Wen, Yangjun Zhang, Yin Lu, Yunan Wang, Shengmeng Chen, Juncong |
author_facet | Zhang, Jin Chen, Min Wen, Yangjun Zhang, Yin Lu, Yunan Wang, Shengmeng Chen, Juncong |
author_sort | Zhang, Jin |
collection | PubMed |
description | The mixed linear model (MLM) has been widely used in genome-wide association study (GWAS) to dissect quantitative traits in human, animal, and plant genetics. Most methodologies consider all single nucleotide polymorphism (SNP) effects as random effects under the MLM framework, which fail to detect the joint minor effect of multiple genetic markers on a trait. Therefore, polygenes with minor effects remain largely unexplored in today’s big data era. In this study, we developed a new algorithm under the MLM framework, which is called the fast multi-locus ridge regression (FastRR) algorithm. The FastRR algorithm first whitens the covariance matrix of the polygenic matrix K and environmental noise, then selects potentially related SNPs among large scale markers, which have a high correlation with the target trait, and finally analyzes the subset variables using a multi-locus deshrinking ridge regression for true quantitative trait nucleotide (QTN) detection. Results from the analyses of both simulated and real data show that the FastRR algorithm is more powerful for both large and small QTN detection, more accurate in QTN effect estimation, and has more stable results under various polygenic backgrounds. Moreover, compared with existing methods, the FastRR algorithm has the advantage of high computing speed. In conclusion, the FastRR algorithm provides an alternative algorithm for multi-locus GWAS in high dimensional genomic datasets. |
format | Online Article Text |
id | pubmed-8041068 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8041068 2021-04-13 A Fast Multi-Locus Ridge Regression Algorithm for High-Dimensional Genome-Wide Association Studies Zhang, Jin Chen, Min Wen, Yangjun Zhang, Yin Lu, Yunan Wang, Shengmeng Chen, Juncong Front Genet Genetics The mixed linear model (MLM) has been widely used in genome-wide association study (GWAS) to dissect quantitative traits in human, animal, and plant genetics. Most methodologies consider all single nucleotide polymorphism (SNP) effects as random effects under the MLM framework, which fail to detect the joint minor effect of multiple genetic markers on a trait. Therefore, polygenes with minor effects remain largely unexplored in today’s big data era. In this study, we developed a new algorithm under the MLM framework, which is called the fast multi-locus ridge regression (FastRR) algorithm. The FastRR algorithm first whitens the covariance matrix of the polygenic matrix K and environmental noise, then selects potentially related SNPs among large scale markers, which have a high correlation with the target trait, and finally analyzes the subset variables using a multi-locus deshrinking ridge regression for true quantitative trait nucleotide (QTN) detection. Results from the analyses of both simulated and real data show that the FastRR algorithm is more powerful for both large and small QTN detection, more accurate in QTN effect estimation, and has more stable results under various polygenic backgrounds. Moreover, compared with existing methods, the FastRR algorithm has the advantage of high computing speed. In conclusion, the FastRR algorithm provides an alternative algorithm for multi-locus GWAS in high dimensional genomic datasets. Frontiers Media S.A. 2021-03-29 /pmc/articles/PMC8041068/ /pubmed/33854527 http://dx.doi.org/10.3389/fgene.2021.649196 Text en Copyright © 2021 Zhang, Chen, Wen, Zhang, Lu, Wang and Chen. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Zhang, Jin Chen, Min Wen, Yangjun Zhang, Yin Lu, Yunan Wang, Shengmeng Chen, Juncong A Fast Multi-Locus Ridge Regression Algorithm for High-Dimensional Genome-Wide Association Studies |
title | A Fast Multi-Locus Ridge Regression Algorithm for High-Dimensional Genome-Wide Association Studies |
title_full | A Fast Multi-Locus Ridge Regression Algorithm for High-Dimensional Genome-Wide Association Studies |
title_fullStr | A Fast Multi-Locus Ridge Regression Algorithm for High-Dimensional Genome-Wide Association Studies |
title_full_unstemmed | A Fast Multi-Locus Ridge Regression Algorithm for High-Dimensional Genome-Wide Association Studies |
title_short | A Fast Multi-Locus Ridge Regression Algorithm for High-Dimensional Genome-Wide Association Studies |
title_sort | fast multi-locus ridge regression algorithm for high-dimensional genome-wide association studies |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8041068/ https://www.ncbi.nlm.nih.gov/pubmed/33854527 http://dx.doi.org/10.3389/fgene.2021.649196 |
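
Illustrative note: the description field names three steps (whitening the covariance of the polygenic matrix K and the environmental noise, screening SNPs by correlation with the trait, and fitting a multi-locus ridge regression on the selected subset). The following is a minimal NumPy sketch of that general pipeline and not the authors' FastRR implementation. All function names, the fixed heritability-like weight `h2`, the screening size `top_k`, and the penalty `lam` are assumptions made for illustration, and plain ridge regression stands in for the deshrinking ridge regression named in the record, whose details are not given here.

```python
import numpy as np


def whiten(y, X, K, h2):
    """Rotate y and X so that the assumed covariance V = h2*K + (1-h2)*I becomes I.

    h2 is a fixed, user-supplied weight here (an assumption of this sketch).
    """
    n = y.shape[0]
    V = h2 * K + (1.0 - h2) * np.eye(n)
    s, U = np.linalg.eigh(V)            # V = U diag(s) U^T
    W = (U / np.sqrt(s)).T              # W = diag(s^-1/2) U^T, so W V W^T = I
    one = W @ np.ones(n)                # whitened intercept column
    yw, Xw = W @ y, W @ X
    # Project the whitened intercept out of the trait and of every marker.
    yw = yw - one * (one @ yw) / (one @ one)
    Xw = Xw - np.outer(one, one @ Xw) / (one @ one)
    return yw, Xw


def screen(yw, Xw, top_k):
    """Return indices of the top_k markers by absolute correlation with the trait."""
    norms = np.linalg.norm(Xw, axis=0) * np.linalg.norm(yw)
    r = np.abs(Xw.T @ yw) / np.maximum(norms, 1e-12)  # guard constant markers
    return np.argsort(r)[::-1][:top_k]


def ridge(yw, Xs, lam):
    """Plain ridge estimate (Xs^T Xs + lam*I)^-1 Xs^T yw; no deshrinking step."""
    p = Xs.shape[1]
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(p), Xs.T @ yw)


def fastrr_sketch(y, X, K, h2=0.5, top_k=20, lam=1.0):
    """Whiten, screen, then fit all selected markers jointly."""
    yw, Xw = whiten(y, X, K, h2)
    idx = screen(yw, Xw, top_k)
    beta = ridge(yw, Xw[:, idx], lam)
    return idx, beta
```

In this sketch `y` is a length-n trait vector, `X` an n-by-p genotype matrix, and `K` an n-by-n kinship matrix; `fastrr_sketch(y, X, K)` returns the screened marker indices and their jointly estimated effects. In practice the variance weight and the penalty would be estimated from the data rather than fixed by hand.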