Skip to content

Commit 9182073

Browse files
Merge pull request #435 from dreycenfoiles/Ask-Tell
Parallel Surrogate Models Through Ask-tell Interface
2 parents 1ee0a5c + ea4eef2 commit 9182073

File tree

6 files changed

+442
-0
lines changed

6 files changed

+442
-0
lines changed

docs/pages.jl

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@ pages = ["index.md"
1616
"Gradient Enhanced Kriging" => "gek.md",
1717
"GEKPLS" => "gekpls.md",
1818
"MOE" => "moe.md",
19+
"Parallel Optimization" => "parallel.md"
1920
]
2021
"User guide" => [
2122
"Samples" => "samples.md",

docs/src/parallel.md

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,56 @@
1+
# Parallel Optimization
2+
3+
There are some situations where it can be beneficial to run multiple optimizations in parallel. For example, if your objective function is very expensive to evaluate, you may want to run multiple evaluations in parallel.
4+
5+
## Ask-Tell Interface
6+
7+
To enable parallel optimization, we make use of an Ask-Tell interface. The user will construct the initial surrogate model the same way as for non-parallel surrogate models, but instead of using `surrogate_optimize`, the user will use `potential_optimal_points`. This will return the coordinates of points that the optimizer has determined are most useful to evaluate next. How the user evaluates these points is up to them. The Ask-Tell interface requires more manual control than `surrogate_optimize`, but it allows for more flexibility. After the point has been evaluated, the user will *tell* the surrogate model the new points with the `add_point!` function.
8+
9+
## Virtual Points
10+
11+
To ensure that points of interest returned by `potential_optimal_points` are sufficiently far from each other, the function makes use of *virtual points*. They are used as follows:
12+
1. `potential_optimal_points` is told to return `n` points.
13+
2. The point with the highest merit function value is selected.
14+
3. This point is now treated as a virtual point and is assigned a temporary value that changes the landscape of the merit function. How the the temporary value is chosen depends on the strategy used. (see below)
15+
4. The point with the new highest merit is selected.
16+
5. The process is repeated until `n` points have been selected.
17+
18+
The following strategies are available for virtual point selection for all optimization algorithms:
19+
20+
- "Minimum Constant Liar (CLmin)":
21+
- The virtual point is assigned using the lowest known value of the merit function across all evaluated points.
22+
- "Mean Constant Liar (CLmean)":
23+
- The virtual point is assigned using the mean of the merit function across all evaluated points.
24+
- "Maximum Constant Liar (CLmax)":
25+
- The virtual point is assigned using the great known value of the merit function across all evaluated points.
26+
27+
For Kriging surrogates, specifically, the above and follow strategies are available:
28+
29+
- "Kriging Believer (KB)":
30+
- The virtual point is assigned using the mean of the Kriging surrogate at the virtual point.
31+
- "Kriging Believer Upper Bound (KBUB)":
32+
- The virtual point is assigned using 3$\sigma$ above the mean of the Kriging surrogate at the virtual point.
33+
- "Kriging Believer Lower Bound (KBLB)":
34+
- The virtual point is assigned using 3$\sigma$ below the mean of the Kriging surrogate at the virtual point.
35+
36+
37+
In general, CLmin and KBLB tend to favor exploitation while CLmax and KBUB tend to favor exploration. CLmean and KB tend to be a compromise between the two.
38+
39+
## Examples
40+
41+
```@example
42+
using Surrogates
43+
44+
lb = 0.0
45+
ub = 10.0
46+
f = x -> log(x) * exp(x)
47+
x = sample(5, lb, ub, SobolSample())
48+
y = f.(x)
49+
50+
my_k = Kriging(x, y, lb, ub)
51+
52+
for _ in 1:10
53+
new_x, eis = potential_optimal_points(EI(), lb, ub, my_k, SobolSample(), 3, CLmean!)
54+
add_point!(my_k, new_x, f.(new_x))
55+
end
56+
```

src/Optimization.jl

100644100755
Lines changed: 283 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,14 @@ using Distributions
33
using Zygote
44

55
abstract type SurrogateOptimizationAlgorithm end
6+
abstract type ParallelStrategy end
7+
8+
struct KrigingBeliever <: ParallelStrategy end
9+
struct KrigingBelieverUpperBound <: ParallelStrategy end
10+
struct KrigingBelieverLowerBound <: ParallelStrategy end
11+
struct MinimumConstantLiar <: ParallelStrategy end
12+
struct MaximumConstantLiar <: ParallelStrategy end
13+
struct MeanConstantLiar <: ParallelStrategy end
614

715
#single objective optimization
816
struct SRBF <: SurrogateOptimizationAlgorithm end
@@ -375,6 +383,214 @@ function surrogate_optimize(obj::Function, ::SRBF, lb::Number, ub::Number,
375383
end
376384
end
377385

386+
# Ask SRBF ND
387+
function potential_optimal_points(::SRBF, strategy, lb, ub, surr::AbstractSurrogate, sample_type::SamplingAlgorithm, n_parallel;
388+
num_new_samples = 500)
389+
390+
scale = 0.2
391+
w_range = [0.3, 0.5, 0.7, 0.95]
392+
w_cycle = Iterators.cycle(w_range)
393+
394+
w, state = iterate(w_cycle)
395+
396+
#Vector containing size in each direction
397+
box_size = lb - ub
398+
dtol = 1e-3 * norm(ub - lb)
399+
d = length(surr.x)
400+
incumbent_x = surr.x[argmin(surr.y)]
401+
402+
new_lb = incumbent_x .- 3 * scale * norm(incumbent_x .- lb)
403+
new_ub = incumbent_x .+ 3 * scale * norm(incumbent_x .- ub)
404+
405+
@inbounds for i in 1:length(new_lb)
406+
if new_lb[i] < lb[i]
407+
new_lb = collect(new_lb)
408+
new_lb[i] = lb[i]
409+
end
410+
if new_ub[i] > ub[i]
411+
new_ub = collect(new_ub)
412+
new_ub[i] = ub[i]
413+
end
414+
end
415+
416+
new_sample = sample(num_new_samples, new_lb, new_ub, sample_type)
417+
s = zeros(eltype(surr.x[1]), num_new_samples)
418+
for j in 1:num_new_samples
419+
s[j] = surr(new_sample[j])
420+
end
421+
s_max = maximum(s)
422+
s_min = minimum(s)
423+
424+
d_min = norm(box_size .+ 1)
425+
d_max = 0.0
426+
for r in 1:length(surr.x)
427+
for c in 1:num_new_samples
428+
distance_rc = norm(surr.x[r] .- new_sample[c])
429+
if distance_rc > d_max
430+
d_max = distance_rc
431+
end
432+
if distance_rc < d_min
433+
d_min = distance_rc
434+
end
435+
end
436+
end
437+
438+
tmp_surr = deepcopy(surr)
439+
440+
441+
new_addition = 0
442+
diff_x = zeros(eltype(surr.x[1]), d)
443+
444+
evaluation_of_merit_function = zeros(float(eltype(surr.x[1])), num_new_samples)
445+
proposed_points_x = Vector{typeof(surr.x[1])}(undef, n_parallel)
446+
merit_of_proposed_points = zeros(Float64, n_parallel)
447+
448+
while new_addition < n_parallel
449+
#find minimum
450+
451+
@inbounds for r in eachindex(evaluation_of_merit_function)
452+
evaluation_of_merit_function[r] = merit_function(new_sample[r], w, tmp_surr,
453+
s_max, s_min, d_max, d_min,
454+
box_size)
455+
end
456+
457+
min_index = argmin(evaluation_of_merit_function)
458+
new_min_x = new_sample[min_index]
459+
min_x_merit = evaluation_of_merit_function[min_index]
460+
461+
for l in 1:d
462+
diff_x[l] = norm(surr.x[l] .- new_min_x)
463+
end
464+
bit_x = diff_x .> dtol
465+
#new_min_x has to have some distance from krig.x
466+
if false in bit_x
467+
#The new_point is not actually that new, discard it!
468+
469+
deleteat!(evaluation_of_merit_function, min_index)
470+
deleteat!(new_sample, min_index)
471+
472+
if length(new_sample) == 0
473+
println("Out of sampling points")
474+
index = argmin(surr.y)
475+
return (surr.x[index], surr.y[index])
476+
end
477+
else
478+
new_addition += 1
479+
proposed_points_x[new_addition] = new_min_x
480+
merit_of_proposed_points[new_addition] = min_x_merit
481+
482+
# Update temporary surrogate using provided strategy
483+
calculate_liars(strategy, tmp_surr, surr, new_min_x)
484+
end
485+
486+
#4) Update w
487+
w, state = iterate(w_cycle, state)
488+
end
489+
490+
return (proposed_points_x, merit_of_proposed_points)
491+
end
492+
493+
# Ask SRBF 1D
494+
function potential_optimal_points(::SRBF, strategy, lb::Number, ub::Number, surr::AbstractSurrogate,
495+
sample_type::SamplingAlgorithm, n_parallel;
496+
num_new_samples = 500)
497+
scale = 0.2
498+
success = 0
499+
w_range = [0.3, 0.5, 0.7, 0.95]
500+
w_cycle = Iterators.cycle(w_range)
501+
502+
w, state = iterate(w_cycle)
503+
504+
box_size = lb - ub
505+
success = 0
506+
failures = 0
507+
dtol = 1e-3 * norm(ub - lb)
508+
num_of_iterations = 0
509+
510+
#1) Sample near incumbent (the 2 fraction is arbitrary here)
511+
incumbent_x = surr.x[argmin(surr.y)]
512+
513+
new_lb = incumbent_x - scale * norm(incumbent_x - lb)
514+
new_ub = incumbent_x + scale * norm(incumbent_x - ub)
515+
if new_lb < lb
516+
new_lb = lb
517+
end
518+
if new_ub > ub
519+
new_ub = ub
520+
end
521+
522+
new_sample = sample(num_new_samples, new_lb, new_ub, sample_type)
523+
524+
#2) Create merit function
525+
s = zeros(eltype(surr.x[1]), num_new_samples)
526+
for j in 1:num_new_samples
527+
s[j] = surr(new_sample[j])
528+
end
529+
s_max = maximum(s)
530+
s_min = minimum(s)
531+
532+
d_min = box_size + 1
533+
d_max = 0.0
534+
for r in 1:length(surr.x)
535+
for c in 1:num_new_samples
536+
distance_rc = norm(surr.x[r] - new_sample[c])
537+
if distance_rc > d_max
538+
d_max = distance_rc
539+
end
540+
if distance_rc < d_min
541+
d_min = distance_rc
542+
end
543+
end
544+
end
545+
546+
new_addition = 0
547+
proposed_points_x = zeros(eltype(new_sample[1]), n_parallel)
548+
merit_of_proposed_points = zeros(eltype(new_sample[1]), n_parallel)
549+
550+
# Temporary surrogate for virtual points
551+
tmp_surr = deepcopy(surr)
552+
553+
# Loop until we have n_parallel new points
554+
while new_addition < n_parallel
555+
556+
#3) Evaluate merit function at the sampled points in parallel
557+
evaluation_of_merit_function = merit_function.(new_sample, w, tmp_surr, s_max,
558+
s_min, d_max, d_min, box_size)
559+
560+
#find minimum
561+
min_index = argmin(evaluation_of_merit_function)
562+
new_min_x = new_sample[min_index]
563+
min_x_merit = evaluation_of_merit_function[min_index]
564+
565+
diff_x = abs.(tmp_surr.x .- new_min_x)
566+
bit_x = diff_x .> dtol
567+
#new_min_x has to have some distance from krig.x
568+
if false in bit_x
569+
#The new_point is not actually that new, discard it!
570+
deleteat!(evaluation_of_merit_function, min_index)
571+
deleteat!(new_sample, min_index)
572+
if length(new_sample) == 0
573+
println("Out of sampling points")
574+
return (proposed_points_x[1:new_addition],
575+
merit_of_proposed_points[1:new_addition])
576+
end
577+
else
578+
new_addition += 1
579+
proposed_points_x[new_addition] = new_min_x
580+
merit_of_proposed_points[new_addition] = min_x_merit
581+
582+
# Update temporary surrogate using provided strategy
583+
calculate_liars(strategy, tmp_surr, surr, new_min_x)
584+
end
585+
586+
#4) Update w
587+
w, state = iterate(w_cycle, state)
588+
end
589+
590+
return (proposed_points_x, merit_of_proposed_points)
591+
end
592+
593+
378594
"""
379595
This is an implementation of Lower Confidence Bound (LCB),
380596
a popular acquisition function in Bayesian optimization.
@@ -569,6 +785,73 @@ function surrogate_optimize(obj::Function, ::EI, lb::Number, ub::Number, krig,
569785
println("Completed maximum number of iterations")
570786
end
571787

788+
# Ask EI 1D & ND
789+
function potential_optimal_points(::EI, strategy, lb, ub, krig, sample_type::SamplingAlgorithm, n_parallel::Number;
790+
num_new_samples = 100)
791+
792+
lb = krig.lb
793+
ub = krig.ub
794+
795+
dtol = 1e-3 * norm(ub - lb)
796+
eps = 0.01
797+
798+
tmp_krig = deepcopy(krig) # Temporary copy of the kriging model to store virtual points
799+
800+
new_x_max = Vector{typeof(tmp_krig.x[1])}(undef, n_parallel) # New x point
801+
new_EI_max = zeros(eltype(tmp_krig.x[1]), n_parallel) # EI at new x point
802+
803+
for i in 1:n_parallel
804+
# Sample lots of points from the design space -- we will evaluate the EI function at these points
805+
new_sample = sample(num_new_samples, lb, ub, sample_type)
806+
807+
# Find the best point so far
808+
f_min = minimum(tmp_krig.y)
809+
810+
# Allocate some arrays
811+
evaluations = zeros(eltype(tmp_krig.x[1]), num_new_samples) # Holds EI function evaluations
812+
point_found = false # Whether we have found a new point to test
813+
while point_found == false
814+
# For each point in the sample set, evaluate the Expected Improvement function
815+
for j in eachindex(new_sample)
816+
std = std_error_at_point(tmp_krig, new_sample[j])
817+
u = tmp_krig(new_sample[j])
818+
if abs(std) > 1e-6
819+
z = (f_min - u - eps) / std
820+
else
821+
z = 0
822+
end
823+
# Evaluate EI at point new_sample[j]
824+
evaluations[j] = (f_min - u - eps) * cdf(Normal(), z) +
825+
std * pdf(Normal(), z)
826+
end
827+
# find the sample which maximizes the EI function
828+
index_max = argmax(evaluations)
829+
x_new = new_sample[index_max] # x point which maximized EI
830+
y_new = maximum(evaluations) # EI at the new point
831+
diff_x = [norm(prev_point .- x_new) for prev_point in tmp_krig.x]
832+
bit_x = [diff_x_point .> dtol for diff_x_point in diff_x]
833+
#new_min_x has to have some distance from tmp_krig.x
834+
if false in bit_x
835+
#The new_point is not actually that new, discard it!
836+
deleteat!(evaluations, index_max)
837+
deleteat!(new_sample, index_max)
838+
if length(new_sample) == 0
839+
println("Out of sampling points")
840+
index = argmin(tmp_krig.y)
841+
return (tmp_krig.x[index], tmp_krig.y[index])
842+
end
843+
else
844+
point_found = true
845+
new_x_max[i] = x_new
846+
new_EI_max[i] = y_new
847+
calculate_liars(strategy, tmp_krig, krig, x_new)
848+
end
849+
end
850+
end
851+
852+
return (new_x_max, new_EI_max)
853+
end
854+
572855
"""
573856
This is an implementation of Expected Improvement (EI),
574857
arguably the most popular acquisition function in Bayesian optimization.

src/Surrogates.jl

100644100755
Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,7 @@ include("VariableFidelity.jl")
1818
include("Earth.jl")
1919
include("GEK.jl")
2020
include("GEKPLS.jl")
21+
include("VirtualStrategy.jl")
2122

2223
current_surrogates = ["Kriging", "LinearSurrogate", "LobachevskySurrogate",
2324
"NeuralSurrogate",
@@ -88,6 +89,11 @@ export LobachevskyStructure, NeuralStructure, RandomForestStructure,
8889
export WendlandStructure
8990
export AbstractSurrogate, SamplingAlgorithm
9091
export Kriging, RadialBasis, add_point!, current_estimate, std_error_at_point
92+
# Parallelization Strategies
93+
export potential_optimal_points
94+
export MinimumConstantLiar, MaximumConstantLiar, MeanConstantLiar, KrigingBeliever,
95+
KrigingBelieverUpperBound, KrigingBelieverLowerBound
96+
9197
# radial basis functions
9298
export linearRadial, cubicRadial, multiquadricRadial, thinplateRadial
9399

0 commit comments

Comments
 (0)