Proposal: separate surrogate settings from x, y points #238

@platawiec

Motivation

Although the surrogate type defines how the surrogate is constructed, it cannot act as a generic algorithmic dispatch type when the user already has a set of points (x, y).

As a bonus, the proposed design tracks more closely with the DiffEq ecosystem.

Current State

Currently, each AbstractSurrogate includes both the surrogate algorithm and the x, y points the surrogate is generated over. The surrogate hyperparameters are therefore mixed into the struct definition with the x, y points, even though the points exist independently of the algorithm. Some surrogates are also passed an lb, ub pair that isn't necessarily used in the surrogate generation. Furthermore, the constructor for the surrogate does a lot of work in order to return the final type.
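
Schematically (a hedged illustration, not any specific struct in the package; field names are approximate), the current layout is something like:

# Roughly the current pattern: data, bounds, hyperparameters, and fitted
# coefficients all share one mutable struct, and the constructor computes
# the coefficients eagerly.
mutable struct SomeCurrentSurrogate{X, Y, L, U, P, C} <: AbstractSurrogate
    x::X       # sample points
    y::Y       # function values
    lb::L      # lower bounds (not always used in fitting)
    ub::U      # upper bounds
    p::P       # algorithm hyperparameters
    coeff::C   # fitted coefficients
end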

Proposal

One way to resolve this is to separate out the fitted surrogate from the surrogate algorithm used.

mutable struct FittedSurrogate{X, Y, S <: AbstractSurrogate, SS}
    x::X                 # sample points
    y::Y                 # function values at the sample points
    surrogate::S         # the surrogate algorithm (hyperparameters only)
    surrogate_state::SS  # result of fitting (coefficients, trained model, ...)
end

A new fit method can be defined which generates the surrogate based on the passed algorithm. Any internal parameters can be held in the surrogate_state field, whose type would remain unspecified.
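
A minimal sketch of what that could look like (the _init_state helper is a name invented here for illustration; each algorithm would implement it):

# Hypothetical generic entry point. `_init_state` does the algorithm-specific
# fitting work and returns the algorithm's internal state.
function fit(x, y, alg::AbstractSurrogate)
    state = _init_state(x, y, alg)
    return FittedSurrogate(x, y, alg, state)
end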

For a more concrete example, consider the NeuralSurrogate example from the documentation:

using Surrogates
using Flux
using Statistics

f = x -> x[1]^2 + x[2]^2
bounds = Float32[-1.0, -1.0], Float32[1.0, 1.0]
# Flux models are in single precision by default.
# Thus, single precision will also be used here for our training samples.

x_train = sample(100, bounds..., SobolSample())
y_train = f.(x_train)

# Perceptron with one hidden layer of 20 neurons.
model = Chain(Dense(2, 20, relu), Dense(20, 1))
loss(x, y) = Flux.mse(model(x), y)

# Training of the neural network
learning_rate = 0.1
optimizer = Descent(learning_rate)  # Simple gradient descent. See Flux documentation for other options.
n_epochs = 50
sgt_model = NeuralSurrogate(model=model, loss=loss, opt=optimizer, n_epochs=n_epochs)
sgt = fit(x_train, y_train, sgt_model)

# Testing the new model
x_test = sample(30, bounds..., SobolSample())
test_error = mean(abs2, (sgt(x)[1] - f(x) for x in x_test))

A linear surrogate would just be

struct LinearSurrogate <: AbstractSurrogate end

my_linear_surr_1D = fit(x, y, LinearSurrogate())  # given existing data x, y
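
To make the division of labor concrete, here is a hedged sketch of what a 1D least-squares fit method and the corresponding evaluation overload could look like under this proposal (both are assumptions of the design, not existing library code):

# Sketch only: fit a 1D linear model y ≈ a + b * x by least squares and
# store the two coefficients as the surrogate state.
function fit(x::AbstractVector, y::AbstractVector, alg::LinearSurrogate)
    X = hcat(ones(eltype(x), length(x)), x)  # design matrix with intercept column
    coeffs = X \ y                           # least-squares solve
    return FittedSurrogate(x, y, alg, coeffs)
end

# Evaluation only reads the stored state; the algorithm struct stays empty.
function (s::FittedSurrogate{<:Any, <:Any, <:LinearSurrogate})(val)
    a, b = s.surrogate_state
    return a + b * val
end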

Potential Issues

Here I propose a design where the surrogate state (i.e. the result of fitting) is stored alongside the surrogate in the FittedSurrogate type. Another option would be to store the surrogate state inside the algorithmic dispatch type itself, but then the types of the x, y points would have to be specified ahead of time. In the proposed design the surrogate state is kept separate, much like ODEIntegrator, where there is an alg and a separate cache for that alg.

As a pathological example, consider NeuralSurrogate: because the model structure is (currently) held as a hyperparameter of the surrogate, the internal model state would be mutated during fitting. I'm not sure how to solve this cleanly without a deepcopy of the model into the surrogate state type; a sketch of that workaround follows below.
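
For concreteness, a minimal sketch of the deepcopy workaround, assuming the constructor fields proposed above (model, opt, n_epochs) and the implicit-params Flux.train! API:

# Sketch only: deepcopy the model so fitting never mutates the model held as
# a hyperparameter on the algorithm struct.
function fit(x, y, alg::NeuralSurrogate)
    model = deepcopy(alg.model)  # trained copy; alg.model stays untouched
    data = [(collect(Float32, xi), Float32[yi]) for (xi, yi) in zip(x, y)]
    loss(xi, yi) = Flux.mse(model(xi), yi)
    for _ in 1:alg.n_epochs
        Flux.train!(loss, Flux.params(model), data, alg.opt)
    end
    return FittedSurrogate(x, y, alg, model)
end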

Interested to hear what you think!
