
Stochastic Lanczos estimation of genomic variance components for linear mixed-effects models

BACKGROUND: Linear mixed-effects models (LMM) are a leading method in conducting genome-wide association studies (GWAS) but require residual maximum likelihood (REML) estimation of variance components, which is computationally demanding. Previous work has reduced the computational burden of variance...

Bibliographic Details
Main authors: Border, Richard; Becker, Stephen
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6668092/
https://www.ncbi.nlm.nih.gov/pubmed/31362713
http://dx.doi.org/10.1186/s12859-019-2978-z
_version_ 1783440154499743744
author Border, Richard
Becker, Stephen
collection PubMed
description BACKGROUND: Linear mixed-effects models (LMM) are a leading method in conducting genome-wide association studies (GWAS) but require residual maximum likelihood (REML) estimation of variance components, which is computationally demanding. Previous work has reduced the computational burden of variance component estimation by replacing direct matrix operations with iterative and stochastic methods and by employing loose tolerances to limit the number of iterations in the REML optimization procedure. Here, we introduce two novel algorithms, stochastic Lanczos derivative-free REML (SLDF_REML) and Lanczos first-order Monte Carlo REML (L_FOMC_REML), that exploit problem structure via the principle of Krylov subspace shift-invariance to speed computation beyond existing methods. Both novel algorithms only require a single round of computation involving iterative matrix operations, after which their respective objectives can be repeatedly evaluated using vector operations. Further, in contrast to existing stochastic methods, SLDF_REML can exploit precomputed genomic relatedness matrices (GRMs), when available, to further speed computation. RESULTS: Results of numerical experiments are congruent with theory and demonstrate that interpreted-language implementations of both algorithms match or exceed existing compiled-language software packages in speed, accuracy, and flexibility. CONCLUSIONS: Both the SLDF_REML and L_FOMC_REML algorithms outperform existing methods for REML estimation of variance components for LMM and are suitable for incorporation into existing GWAS LMM software implementations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2978-z) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6668092
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6668092 2019-08-05 Stochastic Lanczos estimation of genomic variance components for linear mixed-effects models Border, Richard Becker, Stephen BMC Bioinformatics Methodology Article
BioMed Central 2019-07-30 /pmc/articles/PMC6668092/ /pubmed/31362713 http://dx.doi.org/10.1186/s12859-019-2978-z Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Stochastic Lanczos estimation of genomic variance components for linear mixed-effects models
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6668092/
https://www.ncbi.nlm.nih.gov/pubmed/31362713
http://dx.doi.org/10.1186/s12859-019-2978-z