Motivation
Although the surrogate type defines how the surrogate is constructed, it cannot act as a generic algorithmic dispatch when the user already has a set of points (x, y).
As a bonus, the proposed design tracks more closely with the DiffEq ecosystem.
Current State
Currently, each AbstractSurrogate holds both the surrogate algorithm and the x, y points the surrogate is generated over. The surrogate hyperparameters are therefore mixed into the struct definition with the x, y points, even though the points exist independently of the algorithm. Some surrogates are also passed an lb, ub pair, which isn't necessarily used in the surrogate generation. Furthermore, the constructor for the surrogate does a lot of work in order to return the final type.
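For reference, a surrogate under the current design looks roughly like the following (a simplified, hypothetical sketch; the struct and field names are illustrative, not the exact source):
mutable struct SomeSurrogate{X, Y, L, U, C} <: AbstractSurrogate
    x::X       # training inputs (data)
    y::Y       # training outputs (data)
    lb::L      # lower bound (not always used in fitting)
    ub::U      # upper bound (not always used in fitting)
    coeff::C   # fitted coefficients (the result of construction)
    # ...plus the algorithm hyperparameters, all in the same struct
end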
Proposal
One way to resolve this is to separate out the fitted surrogate from the surrogate algorithm used.
mutable struct FittedSurrogate{X, Y, S <: AbstractSurrogate, SS}
    x::X                  # training inputs
    y::Y                  # training outputs
    surrogate::S          # the surrogate algorithm (hyperparameters only)
    surrogate_state::SS   # everything produced by fitting
end
A new fit method can be defined which generates the surrogate based on the passed algorithm. Any internal fitted parameters can be held in the surrogate_state field, whose type would remain unspecified.
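A minimal sketch of how that could look (the helpers _fit and _evaluate are hypothetical names, purely for illustration):
function fit(x, y, alg::AbstractSurrogate)
    state = _fit(alg, x, y)  # algorithm-specific work, returns the fitted state
    return FittedSurrogate(x, y, alg, state)
end

# Evaluation dispatches on the algorithm type and reads the fitted state:
(s::FittedSurrogate)(val) = _evaluate(s.surrogate, s.surrogate_state, val)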
Looking at a more concrete example, take the NeuralSurrogate example from the documentation:
using Surrogates
using Flux
using Statistics
f = x -> x[1]^2 + x[2]^2
bounds = Float32[-1.0, -1.0], Float32[1.0, 1.0]
# Flux models are in single precision by default.
# Thus, single precision will also be used here for our training samples.
x_train = sample(100, bounds..., SobolSample())
y_train = f.(x_train)
# Perceptron with one hidden layer of 20 neurons.
model = Chain(Dense(2, 20, relu), Dense(20, 1))
loss(x, y) = Flux.mse(model(x), y)
# Training of the neural network
learning_rate = 0.1
optimizer = Descent(learning_rate) # Simple gradient descent. See Flux documentation for other options.
n_epochs = 50
sgt_model = NeuralSurrogate(model=model, loss=loss, opt=optimizer, n_echos=n_epochs)
sgt = fit(x_train, y_train, sgt_model)
# Testing the new model
x_test = sample(30, bounds..., SobolSample())
test_error = mean(abs2, sgt(x)[1] - f(x) for x in x_test)
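Compared with the current API, the NeuralSurrogate constructor above no longer receives x, y (or lb, ub); it only bundles the hyperparameters, and all of the training work happens inside fit.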
A linear surrogate would then be as simple as:
struct LinearSurrogate <: AbstractSurrogate end
my_linear_surr_1D = fit(x, y, LinearSurrogate())
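To make the separation concrete, a hypothetical fit method for the 1D linear case might look like this (everything beyond LinearSurrogate and FittedSurrogate is illustrative):
function fit(x::AbstractVector, y::AbstractVector, alg::LinearSurrogate)
    X = hcat(ones(length(x)), x)   # design matrix with an intercept column
    coeffs = X \ y                 # ordinary least squares
    return FittedSurrogate(x, y, alg, coeffs)
end

# Evaluation reads the fitted coefficients from the state:
(s::FittedSurrogate{<:Any, <:Any, LinearSurrogate})(val) =
    s.surrogate_state[1] + s.surrogate_state[2] * val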
Potential Issues
Here I have proposed a design where the surrogate state (i.e. the result of fitting) is stored alongside the surrogate in the FittedSurrogate type. An alternative would be to store the surrogate state inside the algorithmic dispatch type itself, but then we would need to specify the types of the x, y points ahead of time. In the proposed design the surrogate state is kept separate, much like ODEIntegrator, where there is an alg and a separate cache for that alg.

As a pathological example, in the case of the NeuralSurrogate, the internal model state would be updated during the fitting process, because the model structure is (currently) held as a hyperparameter of the surrogate. I'm not sure how to cleanly solve this without a deepcopy of the model into the surrogate state type.
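A possible (if inelegant) workaround is exactly that deepcopy, sketched below: the user's model stays untouched in the algorithm struct, while the trained copy lives in the state (field names are illustrative, matching the constructor keywords above):
function fit(x, y, alg::NeuralSurrogate)
    model = deepcopy(alg.model)  # train a copy so the hyperparameter struct stays unmodified
    # ...run the training loop on `model` using alg.loss, alg.opt, alg.n_echos...
    return FittedSurrogate(x, y, alg, model)
end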
Interested to hear what you think!