-
-
Notifications
You must be signed in to change notification settings - Fork 609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SamePad() with even kernel dimensions does not work (only in CUDA) #1605
Comments
It appears to be related to specifying all four padding dimensions:
|
Isn't this just because CUDA (cuDNN) does not support asymetric padding. Even kernel size + SamePad => Asymetric padding. What happens is that there is an attempt to change the padding to be symmetric (which imo is the wrong thing to do), but when this happens the caller has already supplied an output which assumes the output size for the asymetric case. The right way imo, and what I think all other libs are doing is to manually pad the input so and at the same time decrease the padding parameter so that it is symmetric. I think there has been a little bit of controversy as to where this should be done which seems to be the reason why it has fallen between the chairs. For instance, if one just does it in NNlibCUDA\src\cudnn\conv.jl:41 the caller might be surprised that x (the input features) has changed size. One can of course just copy x and pad it but it is also not the best solution as in a vast majority of cases the caller does not care whether x is changed. julia> cc = Conv((1,1), 1=>1, pad=(1,0,1,0)) |> gpu
Conv((1, 1), 1=>1)
julia> cc(ones(Float32, 1,1,1,1) |> gpu)
┌ Warning: cuDNN does not support asymmetric padding; defaulting to symmetric choice
└ @ NNlibCUDA E:\Programs\julia\.julia\dev\NNlibCUDA\src\cudnn\cudnn.jl:10
┌ Warning: No valid algorithm found, probably bad params for convolution.
└ @ CUDA.CUDNN E:\Programs\julia\.julia\packages\CUDA\3VnCC\lib\cudnn\convolution.jl:222
ERROR: CUDNNError: CUDNN_STATUS_BAD_PARAM (code 3)
Stacktrace:
[1] throw_api_error(res::CUDA.CUDNN.cudnnStatus_t)
@ CUDA.CUDNN E:\Programs\julia\.julia\packages\CUDA\3VnCC\lib\cudnn\error.jl:22
[2] macro expansion
@ E:\Programs\julia\.julia\packages\CUDA\3VnCC\lib\cudnn\error.jl:39 [inlined]
[3] cudnnConvolutionForward(handle::Ptr{Nothing}, alpha::Base.RefValue{Float32}, xDesc::CUDA.CUDNN.cudnnTensorDescriptor, x::CUDA.CuArray{Float32, 4}, wDesc::CUDA.CUDNN.cudnnFilterDescriptor, w::CUDA.CuArray{Float32, 4}, convDesc::CUDA.CUDNN.cudnnConvolutionDescriptor, algo::CUDA.CUDNN.cudnnConvolutionFwdAlgo_t, workSpace::CUDA.CuArray{UInt8, 1}, workSpaceSizeInBytes::Int64, beta::Base.RefValue{Float32}, yDesc::CUDA.CUDNN.cudnnTensorDescriptor, y::CUDA.CuArray{Float32, 4})
@ CUDA.CUDNN E:\Programs\julia\.julia\packages\CUDA\3VnCC\lib\utils\call.jl:26
[4] macro expansion
@ E:\Programs\julia\.julia\packages\CUDA\3VnCC\lib\cudnn\convolution.jl:105 [inlined]
[5] macro expansion
@ E:\Programs\julia\.julia\packages\CUDA\3VnCC\lib\utils\call.jl:144 [inlined]
[6] cudnnConvolutionForwardAD(w::CUDA.CuArray{Float32, 4}, x::CUDA.CuArray{Float32, 4}, bias::Nothing, z::CUDA.CuArray{Float32, 4}; y::CUDA.CuArray{Float32, 4}, activation::CUDA.CUDNN.cudnnActivationMode_t, convDesc::CUDA.CUDNN.cudnnConvolutionDescriptor, wDesc::CUDA.CUDNN.cudnnFilterDescriptor, xDesc::CUDA.CUDNN.cudnnTensorDescriptor, yDesc::CUDA.CUDNN.cudnnTensorDescriptor, zDesc::CUDA.CUDNN.cudnnTensorDescriptor, biasDesc::Nothing, alpha::Base.RefValue{Float32}, beta::Base.RefValue{Float32}, dw::Base.RefValue{Any}, dx::Base.RefValue{Any}, dz::Base.RefValue{Any}, dbias::Base.RefValue{Any}, dready::Base.RefValue{Bool})
@ CUDA.CUDNN E:\Programs\julia\.julia\packages\CUDA\3VnCC\lib\cudnn\convolution.jl:103
[7] #cudnnConvolutionForwardWithDefaults#626
@ E:\Programs\julia\.julia\packages\CUDA\3VnCC\lib\cudnn\convolution.jl:96 [inlined]
[8] #cudnnConvolutionForward!#625
@ E:\Programs\julia\.julia\packages\CUDA\3VnCC\lib\cudnn\convolution.jl:53 [inlined]
[9] conv!(y::CUDA.CuArray{Float32, 4}, x::CUDA.CuArray{Float32, 4}, w::CUDA.CuArray{Float32, 4}, cdims::DenseConvDims{2, (1, 1), 1, 1, (1, 1), (1, 0, 1, 0), (1, 1), false}; alpha::Int64, beta::Int64, algo::Int64)
@ NNlibCUDA E:\Programs\julia\.julia\dev\NNlibCUDA\src\cudnn\conv.jl:41
[10] conv!
@ E:\Programs\julia\.julia\dev\NNlibCUDA\src\cudnn\conv.jl:34 [inlined]
[11] conv(x::CUDA.CuArray{Float32, 4}, w::CUDA.CuArray{Float32, 4}, cdims::DenseConvDims{2, (1, 1), 1, 1, (1, 1), (1, 0, 1, 0), (1, 1), false}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ NNlib E:\Programs\julia\.julia\packages\NNlib\ev8gq\src\conv.jl:91
[12] conv(x::CUDA.CuArray{Float32, 4}, w::CUDA.CuArray{Float32, 4}, cdims::DenseConvDims{2, (1, 1), 1, 1, (1, 1), (1, 0, 1, 0), (1, 1), false})
@ NNlib E:\Programs\julia\.julia\packages\NNlib\ev8gq\src\conv.jl:89
[13] (::Conv{2, 4, typeof(identity), CUDA.CuArray{Float32, 4}, CUDA.CuArray{Float32, 1}})(x::CUDA.CuArray{Float32, 4})
@ Flux E:\Programs\julia\.julia\packages\Flux\6o4DQ\src\layers\conv.jl:157
[14] top-level scope
@ REPL[13]:1 |
That should be it then @DrChainsaw. I remember seeing something like that some time ago (years ago already???). |
I'm pretty sure both Keras and PyTorch does the manual padding trick and the dilemma between copying or surprising mutation should exist there as well afaik. I did look into the tensorflow c++ code long ago but alas I don't remember what I found. Chances are I would have remembered it if it was something else than a copy though. |
To my knowledge, PyTorch doesn't have a concept of automatically calculated "same" padding and requires specifying the amount for each dimension manually. I had a skim through some of the C++ code mentioned, and AFAICT allocating a enlarged buffer is necessary to get uneven same padding working with cuDNN. |
Pytorch 1.9 now supports same padding and warns when the calculated padding is asymmetric: pytorch/pytorch#45667. Should we do the same? |
It's a bit niche, so perhaps leave it at the backlog in #1431 |
Tested on Flux v0.12.3
It works with CPU, but with CUDA the following happens:
MWE:
This works:
The text was updated successfully, but these errors were encountered: