
Fast methods for training Gaussian processes on large datasets

Gaussian process regression (GPR) is a non-parametric Bayesian technique for interpolating or fitting data. The main barrier to further uptake of this powerful tool rests in the computational costs associated with the matrices which arise when dealing with large datasets. Here, we derive some simple...


Bibliographic Details
Main authors: Moore, C. J., Chua, A. J. K., Berry, C. P. L., Gair, J. R.
Format: Online Article Text
Language: English
Published: The Royal Society Publishing 2016
Subjects: Mathematics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4892455/
https://www.ncbi.nlm.nih.gov/pubmed/27293793
http://dx.doi.org/10.1098/rsos.160125
collection PubMed
description Gaussian process regression (GPR) is a non-parametric Bayesian technique for interpolating or fitting data. The main barrier to further uptake of this powerful tool rests in the computational costs associated with the matrices which arise when dealing with large datasets. Here, we derive some simple results which we have found useful for speeding up the learning stage in the GPR algorithm, and especially for performing Bayesian model comparison between different covariance functions. We apply our techniques to both synthetic and real data and quantify the speed-up relative to using nested sampling to numerically evaluate model evidences.
format Online
Article
Text
id pubmed-4892455
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher The Royal Society Publishing
record_format MEDLINE/PubMed
spelling pubmed-4892455 2016-06-10 Fast methods for training Gaussian processes on large datasets. Moore, C. J.; Chua, A. J. K.; Berry, C. P. L.; Gair, J. R. R Soc Open Sci. Mathematics. The Royal Society Publishing 2016-05-11. /pmc/articles/PMC4892455/ /pubmed/27293793 http://dx.doi.org/10.1098/rsos.160125 Text en. © 2016 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
topic Mathematics