Motivation
Although the surrogate type defines how the surrogate is constructed, it cannot act as a generic algorithmic dispatch when the user already has a set of points (x, y).
As a bonus, the proposed design tracks more closely with the DiffEq ecosystem.
Current State
Currently, each AbstractSurrogate holds both the surrogate algorithm and the x, y points the surrogate is generated over. The surrogate hyperparameters are therefore mixed into the struct definition with the x, y points, even though the points exist independently of the algorithm. Some surrogates are also passed an lb, ub pair, which isn't necessarily used in the surrogate generation. Furthermore, the constructor for the surrogate does a lot of work in order to return the final type.
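For reference, a surrogate under the current design looks roughly like the following (a simplified, hypothetical sketch; the struct and field names are illustrative, not the exact source):
mutable struct SomeSurrogate{X, Y, L, U, C} <: AbstractSurrogate
    x::X       # training inputs (data)
    y::Y       # training outputs (data)
    lb::L      # lower bound (not always used in fitting)
    ub::U      # upper bound (not always used in fitting)
    coeff::C   # fitted coefficients (the result of construction)
    # ...plus the algorithm hyperparameters, all in the same struct
end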
Proposal
One way to resolve this is to separate out the fitted surrogate from the surrogate algorithm used.
mutable struct FittedSurrogate{X, Y, S <: AbstractSurrogate, SS}
    x::X                  # training inputs
    y::Y                  # training outputs
    surrogate::S          # the surrogate algorithm (hyperparameters only)
    surrogate_state::SS   # everything produced by fitting
end
A new fit method can be defined which generates the surrogate based on the passed algorithm. Any internal fitted parameters can be held in the surrogate_state field, whose type would remain unspecified.
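A minimal sketch of how that could look (the helpers _fit and _evaluate are hypothetical names, purely for illustration):
function fit(x, y, alg::AbstractSurrogate)
    state = _fit(alg, x, y)  # algorithm-specific work, returns the fitted state
    return FittedSurrogate(x, y, alg, state)
end

# Evaluation dispatches on the algorithm type and reads the fitted state:
(s::FittedSurrogate)(val) = _evaluate(s.surrogate, s.surrogate_state, val)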
Looking at a more concrete example, take the NeuralSurrogate example from the documentation:
using Surrogates
using Flux
using Statistics
f = x -> x[1]^2 + x[2]^2
bounds = Float32[-1.0, -1.0], Float32[1.0, 1.0]
# Flux models are in single precision by default.
# Thus, single precision will also be used here for our training samples.
x_train = sample(100, bounds..., SobolSample())
y_train = f.(x_train)
# Perceptron with one hidden layer of 20 neurons.
model = Chain(Dense(2, 20, relu), Dense(20, 1))
loss(x, y) = Flux.mse(model(x), y)
# Training of the neural network
learning_rate = 0.1
optimizer = Descent(learning_rate) # Simple gradient descent. See Flux documentation for other options.
n_epochs = 50
sgt_model = NeuralSurrogate(model=model, loss=loss, opt=optimizer, n_echos=n_epochs)
sgt = fit(x_train, y_train, sgt_model)
# Testing the new model
x_test = sample(30, bounds..., SobolSample())
test_error = mean(abs2, sgt(x)[1] - f(x) for x in x_test)
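Compared with the current API, the NeuralSurrogate constructor above no longer receives x, y (or lb, ub); it only bundles the hyperparameters, and all of the training work happens inside fit.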
A linear surrogate would then be as simple as:
struct LinearSurrogate <: AbstractSurrogate end
my_linear_surr_1D = fit(x, y, LinearSurrogate())
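To make the separation concrete, a hypothetical fit method for the 1D linear case might look like this (everything beyond LinearSurrogate and FittedSurrogate is illustrative):
function fit(x::AbstractVector, y::AbstractVector, alg::LinearSurrogate)
    X = hcat(ones(length(x)), x)   # design matrix with an intercept column
    coeffs = X \ y                 # ordinary least squares
    return FittedSurrogate(x, y, alg, coeffs)
end

# Evaluation reads the fitted coefficients from the state:
(s::FittedSurrogate{<:Any, <:Any, LinearSurrogate})(val) =
    s.surrogate_state[1] + s.surrogate_state[2] * val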
Potential Issues
Here I have proposed a design where the surrogate state (i.e. the result of fitting) is stored alongside the surrogate in the FittedSurrogate type. An alternative would be to store the surrogate state inside the algorithmic dispatch type itself, but then we would need to specify the types of the x, y points ahead of time. In the proposed design the surrogate state is kept separate, much like ODEIntegrator, where there is an alg and a separate cache for that alg.

As a pathological example, in the case of the NeuralSurrogate, the internal model state would be updated during the fitting process, because the model structure is (currently) held as a hyperparameter of the surrogate. I'm not sure how to cleanly solve this without a deepcopy of the model into the surrogate state type.
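A possible (if inelegant) workaround is exactly that deepcopy, sketched below: the user's model stays untouched in the algorithm struct, while the trained copy lives in the state (field names are illustrative, matching the constructor keywords above):
function fit(x, y, alg::NeuralSurrogate)
    model = deepcopy(alg.model)  # train a copy so the hyperparameter struct stays unmodified
    # ...run the training loop on `model` using alg.loss, alg.opt, alg.n_echos...
    return FittedSurrogate(x, y, alg, model)
end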
Interested to hear what you think!