Make jacobian work with CuArrays #33

Closed · wants to merge 4 commits

Conversation

@tkf commented May 27, 2019

It looks like Tracker.jacobian does not support CuArrays at the moment:

julia> Tracker.jacobian(identity, gpu([0.0, 0.0]))
ERROR: GPU compilation of #23(CuArrays.CuKernelState, CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(+),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Bool,1},Tuple{Bool},Tuple{Int64}}}}) failed
KernelError: passing and using non-bitstype argument

Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(+),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Bool,1},Tuple{Bool},Tuple{Int64}}}}.
That type is not isbits, and such arguments are only allowed when they are unused by the kernel.  .args is of type Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Bool,1},Tuple{Bool},Tuple{Int64}}} which is not isbits.
    .2 is of type Base.Broadcast.Extruded{Array{Bool,1},Tuple{Bool},Tuple{Int64}} which is not isbits.
      .x is of type Array{Bool,1} which is not isbits.

This is because ȳ(i) always returns a plain Vector{Bool}, regardless of the type of y:

Tracker.jl/src/back.jl

Lines 174 to 178 in 593aba6

function jacobian(f, x::AbstractVector)
  y::AbstractVector, back = forward(f, x)
  ȳ(i) = [i == j for j = 1:length(y)]
  vcat([transpose(back(ȳ(i))[1]) for i = 1:length(y)]...)
end

A fix I found (implemented in this PR) is to create the seed array with similar(y, Bool), so that it is allocated on the same device as y:

julia> using Tracker: forward

julia> function jacobian(f, x::AbstractVector)
           y::AbstractVector, back = forward(f, x)
           function ȳ(i)
               δ = fill!(similar(y, Bool), false)
               δ[i] = true
               return δ
           end
           vcat([transpose(back(ȳ(i))[1]) for i = 1:length(y)]...)
       end
jacobian (generic function with 1 method)

julia> jacobian(identity, gpu([0.0, 0.0]))
2×2 CuArray{Float32,2}:
 1.0  0.0
 0.0  1.0

I didn't add any tests (sorry) because Tracker's test suite does not load CuArrays.

@MikeInnes (Member)

So the reason jacobian is written this way is to keep nested AD working, and this PR will likely break that (ideally we'd test it explicitly). I think you can get the best of both if you find a way to write this as a map.

You can add GPU tests in the cuda folder and they'll run on bors, or get loaded when you have CuArrays installed locally.

@tkf (Author) commented May 28, 2019

map didn't work, but broadcast did. However, I couldn't construct an example where nested AD fails with the fill! approach (see below). Also, I have a vague idea of how an in-place operation breaks AD, but I don't see why it matters in ȳ, because its output does not depend on the values of x (so it does not participate in the computation graph). Or maybe the functions I tried were too easy for the AD to handle?

julia> function jacobian_with_fill(f, x::AbstractVector)
         y::AbstractVector, back = forward(f, x)
         function ȳ(i)
           δ = fill!(similar(y, Bool), false)
           δ[i] = true
           return δ
         end
         vcat([transpose(back(ȳ(i))[1]) for i = 1:length(y)]...)
       end
jacobian_with_fill (generic function with 1 method)

julia> Tracker.gradient(x -> sum(jacobian_with_fill(y -> y .^ 2, x) .^ 2), [1.0, 2.0, 3.0])
([8.0, 16.0, 24.0] (tracked),)

julia> Tracker.gradient(x -> sum(jacobian_with_fill(y -> sum(y) .* y, x) .^ 2), [1.0, 2.0, 3.0])
([66.0, 72.0, 78.0] (tracked),)

julia> function jacobian_with_bc(f, x::AbstractVector)
         y::AbstractVector, back = forward(f, x)
         ȳ(i) = ((j, _) -> i == j).(1:length(x), x)
         vcat([transpose(back(ȳ(i))[1]) for i = 1:length(y)]...)
       end
jacobian_with_bc (generic function with 1 method)

julia> Tracker.gradient(x -> sum(jacobian_with_bc(y -> y .^ 2, x) .^ 2), [1.0, 2.0, 3.0])
([8.0, 16.0, 24.0] (tracked),)

julia> Tracker.gradient(x -> sum(jacobian_with_bc(y -> sum(y) .* y, x) .^ 2), [1.0, 2.0, 3.0])
([66.0, 72.0, 78.0] (tracked),)

You can add GPU tests in the cuda folder

I don't see a cuda directory here: https://github.com/FluxML/Tracker.jl/tree/v0.2.2/test. Maybe you are mixing it up with Flux's test suite?

@MikeInnes (Member)

Of course, I always forget we split Tracker out of Flux. In that case it'd still be nice to have a test on the Flux side once this is in.

You are probably right that this can be done with the similar approach; the main issue I was thinking of was similar returning a tracked array or array of tracked reals, which has been proposed at some point, but isn't actually the case yet (and could be avoided with a call to data(y)). That said, the broadcast version avoids a warning about scalar indexing and does the job with a single kernel, which is always good.
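(For illustration only, not code from this PR: a minimal sketch of the fill!-based seed with a data(y) guard, assuming Tracker.data is in scope; the name seed_fill is made up here.)

using Tracker: data

# Sketch: build the seed from the untracked storage of y, so even a
# hypothetical tracked `similar` could not leak tracking into the seed.
function seed_fill(y, i)
    δ = fill!(similar(data(y), Bool), false)  # allocated wherever y lives (CPU or GPU)
    δ[i] = true                               # a single scalar write per seed
    return δ
end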

@tkf (Author) commented May 30, 2019

Thanks for the clarification. The latest commit in this PR uses the broadcasting approach. So is it good to go?

In that case it'd still be nice to have a test on the Flux side once this is in.

See FluxML/Flux.jl#782

@tkf (Author) commented May 30, 2019

I just found a related problem: jacobian(softmax, ones(3)) doesn't work because NNlib.jl defines the derivative by ∇softmax(Δ, xs) = ∇softmax!(similar(Δ), Δ, xs). Would it be safe to convert ȳ's element type like this?

z = float(zero(eltype(data(y))))
ȳ(i) = ((j, _) -> i == j).(1:length(y), y) .+ z

Or should it be fixed in NNlib.jl?

Edit: implemented in 9c2c05e
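(For reference, a sketch of how the element-type conversion could be folded into the broadcast-based jacobian; an illustration only, not necessarily what 9c2c05e does, and the name jacobian_float_seed is made up.)

using Tracker: data, forward

# Sketch only: `.+ z` promotes the Bool seed to a floating-point element type,
# so backpropagators like NNlib's ∇softmax, which write into similar(Δ),
# get a float container instead of a Vector{Bool}.
function jacobian_float_seed(f, x::AbstractVector)
    y::AbstractVector, back = forward(f, x)
    z = float(zero(eltype(data(y))))
    ȳ(i) = ((j, _) -> i == j).(1:length(y), y) .+ z   # fuses into a single broadcast
    vcat([transpose(back(ȳ(i))[1]) for i = 1:length(y)]...)
end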

@tkf (Author) commented May 30, 2019

It looks like the tests in Flux failed because of allowscalar(false). I suppose scalar indexing is disabled in Flux's test suite because element-by-element indexing into a GPU array is expensive? Also, from the stack trace, it looks like the indexing into the GPU array happens because the broadcast hits the generic indexing-based implementation in Base.

If that is the case, wouldn't it be better to use the fill!-based approach, so that there are only N indexing operations rather than the N^2 that the broadcasting approach would incur? (N = length(y))
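(Rough accounting only, not a benchmark; the variable names below are made up.)

# For a call to jacobian with N = length(y), N seed vectors are built in total.
N = 1024
scalar_ops_broadcast_fallback = N * N  # N seeds × N elementwise scalar writes, if the
                                       # broadcast falls back to Base's indexing-based path
scalar_ops_fill_based         = N      # N seeds × 1 scalar write each (fill! is one kernel)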

@@ -173,7 +173,11 @@ Calculate the output jacobian `J = d/dx m(x)` such that each row `i` of `J` corr
 """
 function jacobian(f, x::AbstractVector)
   y::AbstractVector, back = forward(f, x)
-  ȳ(i) = [i == j for j = 1:length(y)]
+  function ȳ(i)
+    δ = fill!(float(similar(data(y))), false)
@ChrisRackauckas (Member) commented Aug 27, 2019
δ = false .* y .+ (1:length(y) .== i). To exploit sparsity you should eventually make this color_i and loop through the colors, but this is good for now.
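(A sketch of how that one-liner could slot into jacobian, for illustration only; the actual change landed later in #44, and the name jacobian_bcast_seed is made up.)

using Tracker: forward

# Illustration of the suggestion above: the whole seed is built by one fused
# broadcast, so it stays on the device and keeps y's element type.
function jacobian_bcast_seed(f, x::AbstractVector)
    y::AbstractVector, back = forward(f, x)
    ȳ(i) = false .* y .+ (1:length(y) .== i)   # zeros except a one at position i
    vcat([transpose(back(ȳ(i))[1]) for i = 1:length(y)]...)
end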

Member

I'll open a fresh PR for this.

@tkf (Author) commented Aug 27, 2019

Closing in favor of #44 (thank you @ChrisRackauckas!)

@tkf closed this Aug 27, 2019