SLEMM: million-scale genomic predictions with window-based SNP weighting
MOTIVATION: The amount of genomic data is increasing exponentially. Using many genotyped and phenotyped individuals for genomic prediction is appealing yet challenging. RESULTS: We present SLEMM (short for Stochastic-Lanczos-Expedited Mixed Models), a new software tool, to address the computational...
Main Authors: | Cheng, Jian; Maltecca, Christian; VanRaden, Paul M; O'Connell, Jeffrey R; Ma, Li; Jiang, Jicai |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Original Paper |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10039786/ https://www.ncbi.nlm.nih.gov/pubmed/36897019 http://dx.doi.org/10.1093/bioinformatics/btad127 |
_version_ | 1784912342875635712 |
---|---|
author | Cheng, Jian Maltecca, Christian VanRaden, Paul M O'Connell, Jeffrey R Ma, Li Jiang, Jicai |
author_facet | Cheng, Jian Maltecca, Christian VanRaden, Paul M O'Connell, Jeffrey R Ma, Li Jiang, Jicai |
author_sort | Cheng, Jian |
collection | PubMed |
description | MOTIVATION: The amount of genomic data is increasing exponentially. Using many genotyped and phenotyped individuals for genomic prediction is appealing yet challenging. RESULTS: We present SLEMM (short for Stochastic-Lanczos-Expedited Mixed Models), a new software tool, to address the computational challenge. SLEMM builds on an efficient implementation of the stochastic Lanczos algorithm for REML in a framework of mixed models. We further implement SNP weighting in SLEMM to improve its predictions. Extensive analyses on seven public datasets, covering 19 polygenic traits in three plant and three livestock species, showed that SLEMM with SNP weighting had overall the best predictive ability among a variety of genomic prediction methods including GCTA’s empirical BLUP, BayesR, KAML, and LDAK’s BOLT and BayesR models. We also compared the methods using nine dairy traits of ∼300k genotyped cows. All had overall similar prediction accuracies, except that KAML failed to process the data. Additional simulation analyses on up to 3 million individuals and 1 million SNPs showed that SLEMM was advantageous over counterparts as for computational performance. Overall, SLEMM can do million-scale genomic predictions with an accuracy comparable to BayesR. AVAILABILITY AND IMPLEMENTATION: The software is available at https://github.com/jiang18/slemm. |
format | Online Article Text |
id | pubmed-10039786 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10039786 2023-03-26 Bioinformatics Original Paper Oxford University Press 2023-03-10 /pmc/articles/PMC10039786/ /pubmed/36897019 http://dx.doi.org/10.1093/bioinformatics/btad127 Text en © The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Cheng, Jian Maltecca, Christian VanRaden, Paul M O'Connell, Jeffrey R Ma, Li Jiang, Jicai SLEMM: million-scale genomic predictions with window-based SNP weighting |
title | SLEMM: million-scale genomic predictions with window-based SNP weighting |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10039786/ https://www.ncbi.nlm.nih.gov/pubmed/36897019 http://dx.doi.org/10.1093/bioinformatics/btad127 |