Support for GPU codes? #1002

@renatobellotti

Hi,

I wonder whether Optim.jl supports efficient optimisation on the GPU. This is essential for me because each function evaluation is quite expensive and my design vector is large (length ~10^5), so it should stay on the GPU throughout the optimisation to avoid unnecessary host/device communication.

Here is a minimal example of a simple optimisation that does not work:

using Optim
using CUDA  # provides `cu` for moving arrays onto the GPU

function test(x)
    return sum(x .^ 2)
end

function ∇test!(gradient, x)
    gradient .= 2 .* x  # in-place broadcast, avoids allocating a temporary
    return gradient
end

# This works (CPU arrays):
result = optimize(test, ∇test!, [1., 2.])

# This does not (GPU arrays):
result = optimize(test, ∇test!, cu([1., 2.]))

Error message:

CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}

DivideError: integer division error

Stacktrace:
  [1] macro expansion
    @ ~/.julia/packages/CUDA/DfvRa/lib/cublas/libcublas.jl:231 [inlined]
  [2] macro expansion
    @ ~/.julia/packages/CUDA/DfvRa/src/pool.jl:232 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/CUDA/DfvRa/lib/cublas/error.jl:61 [inlined]
  [4] cublasSdot_v2(handle::Ptr{Nothing}, n::Int64, x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, incx::Int64, y::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, incy::Int64, result::Base.RefValue{Float32})
    @ CUDA.CUBLAS ~/.julia/packages/CUDA/DfvRa/lib/utils/call.jl:26
  [5] dot
    @ ~/.julia/packages/CUDA/DfvRa/lib/cublas/wrappers.jl:142 [inlined]
  [6] dot(x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, y::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ CUDA.CUBLAS ~/.julia/packages/CUDA/DfvRa/lib/cublas/linalg.jl:18
  [7] dot
    @ ~/.julia/packages/Optim/rpjtl/src/multivariate/precon.jl:20 [inlined]
  [8] perform_linesearch!(state::Optim.LBFGSState{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, Vector{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Vector{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, method::LBFGS{Nothing, LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Optim.var"#19#21"}, d::Optim.ManifoldObjective{OnceDifferentiable{Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}})
    @ Optim ~/.julia/packages/Optim/rpjtl/src/utilities/perform_linesearch.jl:43
  [9] update_state!(d::OnceDifferentiable{Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, state::Optim.LBFGSState{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, Vector{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Vector{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, method::LBFGS{Nothing, LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Optim.var"#19#21"})
    @ Optim ~/.julia/packages/Optim/rpjtl/src/multivariate/solvers/first_order/l_bfgs.jl:204
 [10] optimize(d::OnceDifferentiable{Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, initial_x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, method::LBFGS{Nothing, LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Optim.var"#19#21"}, options::Optim.Options{Float64, Nothing}, state::Optim.LBFGSState{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, Vector{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Vector{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}})
    @ Optim ~/.julia/packages/Optim/rpjtl/src/multivariate/optimize/optimize.jl:54
 [11] optimize(d::OnceDifferentiable{Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, initial_x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, method::LBFGS{Nothing, LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Optim.var"#19#21"}, options::Optim.Options{Float64, Nothing})
    @ Optim ~/.julia/packages/Optim/rpjtl/src/multivariate/optimize/optimize.jl:36
 [12] optimize(f::Function, g::Function, initial_x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}; inplace::Bool, autodiff::Symbol, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Optim ~/.julia/packages/Optim/rpjtl/src/multivariate/optimize/interface.jl:100
 [13] optimize(f::Function, g::Function, initial_x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Optim ~/.julia/packages/Optim/rpjtl/src/multivariate/optimize/interface.jl:94
 [14] top-level scope
    @ In[128]:1
 [15] eval
    @ ./boot.jl:373 [inlined]
 [16] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
    @ Base ./loading.jl:1196
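
From the trace, the failure happens inside the default HagerZhang line search, in a `dot` call on the `CuArray`s (reached via `precon.jl`). In case it helps narrow things down, here is an untested sketch of how one might swap in a different line search; `alphaguess` and `linesearch` are the standard LBFGS keyword arguments, but whether BackTracking actually avoids the failing code path is just my guess:

using Optim, LineSearches, CUDA

# Untested sketch: replace the default HagerZhang line search with
# BackTracking, since the failing dot() is reached from perform_linesearch!.
method = LBFGS(alphaguess = LineSearches.InitialStatic(),
               linesearch = LineSearches.BackTracking())
result = optimize(test, ∇test!, cu([1., 2.]), method)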
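
One more observation: `cu([1., 2.])` downcasts my `Float64` input to `Float32`, while the trace shows `LineSearches.InitialStatic{Float64}` mixed with `Float32` arrays. Constructing the array as a `Float64` `CuArray` directly would rule out a mixed-precision cause; again untested:

# Untested sketch: keep everything in Float64 on the device, since
# cu(...) converts to Float32 by default and the trace mixes Float64
# line-search parameters with Float32 arrays.
result = optimize(test, ∇test!, CuArray([1.0, 2.0]))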
