GPflux: A Library for Deep Gaussian Processes

Date: July 26, 2021
Author: Vincent Dutordoir

GPflux, the second library to be open-sourced by Secondmind Labs, is a research toolbox dedicated to deep Gaussian processes. Deep Gaussian processes (DGPs) are a hierarchical extension of Gaussian processes (GPs), created by stacking multiple GPs on top of each other.

DGPs have large potential in many applications, but to date there are no actively maintained open-source libraries that support research on DGPs or the reliable deployment of these models. In contrast, deep neural networks have seen a remarkable rise in high-quality libraries and, in turn, the deep learning community has greatly improved the capabilities of these models.

GPflux was created to fill this need for DGPs. In particular, the goal of GPflux is twofold: 1) enable fundamental research in DGP models and approximate inference therein, and 2) provide reliable, high-quality building blocks for training and deploying state-of-the-art DGPs. For the latter, GPflux relies on the powerful deep learning abstractions provided by Keras, while the former is supported by GPflow, another open-source toolbox maintained in large part by researchers at Secondmind Labs.

Essentially, GPflux is designed as a deep learning library where functionality is packed into layers, and layers can be stacked on top of each other to form a hierarchical (i.e. deep) model. The aim is for the library to be as easy to use as any modern neural network toolbox.

What are Deep Gaussian Processes?

At Secondmind, GPs are a defining part of our modelling solutions. Their versatile usage and strong generalisation capabilities in difficult data regimes make them prime candidates for many of the problems tackled at Secondmind.

However, the performance of a GP is largely determined by the suitability of the chosen kernel, and properties of the dataset such as periodicity, stationarity, wiggliness, and dynamic range should be considered when designing a suitable kernel. This makes kernel design a very hard task, even for GP experts.
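To make the kernel's role concrete, here is a minimal NumPy sketch (not GPflux code) of the squared exponential kernel. Its single lengthscale fixes one notion of wiggliness for the entire input domain, which is exactly the kind of assumption that makes kernel choice so critical:

```python
import numpy as np


def squared_exponential(X1, X2, variance=1.0, lengthscale=0.2):
    """k(x, x') = variance * exp(-||x - x'||^2 / (2 * lengthscale^2)).

    One global lengthscale encodes a single, fixed degree of
    "wiggliness" across the whole input domain.
    """
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-sq_dists / (2 * lengthscale ** 2))


X = np.linspace(0, 1, 5).reshape(-1, 1)
K = squared_exponential(X, X)
# Nearby inputs are highly correlated; distant ones are nearly independent.
```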

Deep Gaussian processes alleviate this problem by stacking multiple GP models on top of each other, where the output of one GP is fed as the input to the next. Typically, each GP in a DGP model has a simple kernel, like the squared exponential or Matérn, but the hierarchical structure of the model makes it more flexible and therefore appropriate for a larger range of problems.
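The "output of one GP is fed as the input to the next" idea can be illustrated by drawing a sample from a two-layer prior. The following is a minimal NumPy sketch (not GPflux code), assuming a squared exponential kernel in both layers:

```python
import numpy as np

rng = np.random.default_rng(0)


def rbf(X1, X2, lengthscale=0.3):
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-d2 / (2 * lengthscale ** 2))


def sample_gp_prior(X, jitter=1e-6):
    """Draw one function sample f ~ GP(0, k) at the inputs X."""
    K = rbf(X, X) + jitter * np.eye(len(X))
    return np.linalg.cholesky(K) @ rng.standard_normal(len(X))


X = np.linspace(0, 1, 100)
f1 = sample_gp_prior(X)   # layer 1: a random warping of the input
f2 = sample_gp_prior(f1)  # layer 2: a GP evaluated at layer 1's output
# f2 is a draw from a two-layer DGP prior: it is typically non-stationary
# in X even though each individual layer's kernel is stationary.
```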

In general, DGPs allow for a broader solution space and let the data decide what kind of modelling assumptions to make. In technical jargon this is called feature learning, a defining property of hierarchical models such as the DGP. This can lead to drastic improvements in the overall performance of the system.

As an example, consider the simple one-dimensional dataset below. The data is depicted by the black crosses and the task is to infer the underlying function that generated it. On the left we see the prediction of a single-layer GP and on the right that of a two-layer deep GP. Both models show their predicted mean as a solid orange line and their predicted variance as an orange shaded area.

We notice how the single-layer GP model, using a standard squared exponential kernel, struggles with this task. This is because the data shows varying behaviour across the input domain, which the squared exponential kernel is not designed to capture. In contrast, the two-layer DGP, using two simple GPs each configured with a squared exponential kernel, models the data more accurately: the mean closely follows the data, while the variance grows and shrinks in the right areas. This simple toy example already shows the power of DGPs.

Single-layer GP with squared exponential kernel. The data is given by the black crosses; the model's mean and variance estimates are given by the orange line and shaded area, respectively.
Two-layer DGP with squared exponential kernels. The data is given by the black crosses; the model's mean and variance estimates are given by the orange line and shaded area, respectively.
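The single-layer predictions in plots like these follow the standard GP regression equations for the posterior mean and variance. A self-contained NumPy sketch (hypothetical toy data, not the dataset from the figures):

```python
import numpy as np


def rbf(X1, X2, lengthscale=0.2):
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-d2 / (2 * lengthscale ** 2))


rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 1, 30))              # training inputs
Y = np.sin(8 * X) + 0.1 * rng.standard_normal(30)
Xs = np.linspace(0, 1, 100)                     # test inputs

noise = 0.1 ** 2
K = rbf(X, X) + noise * np.eye(len(X))          # K(X, X) + noise * I
Ks = rbf(X, Xs)                                 # K(X, X*)
Kss = rbf(Xs, Xs)                               # K(X*, X*)

# Posterior mean: m(x*) = k(x*, X) [K + s^2 I]^{-1} y
# Posterior cov:  k(x*, x*) - k(x*, X) [K + s^2 I]^{-1} k(X, x*)
mean = Ks.T @ np.linalg.solve(K, Y)
cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
var = np.diag(cov)  # the shaded area in such plots is e.g. mean +/- 2*sqrt(var)
```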

Want to get involved?

GPflux is an open-source project. If you have relevant skills and are interested in contributing, please do so through our GitHub page, where you can find the complete source code. To get started, have a look at our documentation: we have multiple tutorials showing the basic functionality of the toolbox, a benchmark implementation, and a comprehensive API reference. For more information, see the accompanying paper, recently accepted at the International Conference on Probabilistic Programming.