cuDNN launch failure #96

Open
Tianyu97 opened this issue Apr 14, 2019 · 1 comment

Comments

@Tianyu97

When I run the single-person demo, it fails with the error shown below.
I am using CUDA 9.0.176 and cuDNN 7.4.2.24.

(tensorflow) bfs@zty2:~/pose-tensorflow-master$ TF_CUDNN_USE_AUTOTUNE=0 python3 demo/singleperson.py
2019-04-14 12:04:06.515875: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-04-14 12:04:08.638915: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.6325
pciBusID: 0000:04:00.0
totalMemory: 10.92GiB freeMemory: 10.32GiB
2019-04-14 12:04:08.638992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:04:00.0, compute capability: 6.1)
2019-04-14 12:04:14.960564: E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7402 (compatibility version 7400) but source was compiled with 7004 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2019-04-14 12:04:14.962004: W ./tensorflow/stream_executor/stream.h:1988] attempting to perform DNN operation using StreamExecutor without DNN support
Traceback (most recent call last):
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _do_call
return fn(*args)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1329, in _run_fn
status, run_metadata)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: cuDNN launch failure : input shape([1,3,518,280]) filter shape([7,7,3,64])
[[Node: resnet_v1_101/conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/Pad, resnet_v1_101/conv1/weights/read)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "demo/singleperson.py", line 26, in <module>
outputs_np = sess.run(outputs, feed_dict={inputs: image_batch})
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1128, in _run
feed_dict_tensor, options, run_metadata)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
options, run_metadata)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: cuDNN launch failure : input shape([1,3,518,280]) filter shape([7,7,3,64])
[[Node: resnet_v1_101/conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/Pad, resnet_v1_101/conv1/weights/read)]]

Caused by op 'resnet_v1_101/conv1/Conv2D', defined at:
File "demo/singleperson.py", line 17, in <module>
sess, inputs, outputs = predict.setup_pose_prediction(cfg)
File "demo/../nnet/predict.py", line 11, in setup_pose_prediction
outputs = pose_net(cfg).test(inputs)
File "demo/../nnet/pose_net.py", line 89, in test
heads = self.get_net(inputs)
File "demo/../nnet/pose_net.py", line 85, in get_net
net, end_points = self.extract_features(inputs)
File "demo/../nnet/pose_net.py", line 55, in extract_features
net, end_points = net_fun(im_centered, global_pool=False, output_stride=16, is_training=False)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 300, in resnet_v1_101
scope=scope)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 205, in resnet_v1
net = resnet_utils.conv2d_same(net, 64, 7, stride=2, scope='conv1')
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_utils.py", line 146, in conv2d_same
scope=scope)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
return func(*args, **current_args)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1057, in convolution
outputs = layer.apply(inputs)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 762, in apply
return self.call(inputs, *args, **kwargs)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 652, in call
outputs = self.call(inputs, *args, **kwargs)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/layers/convolutional.py", line 167, in call
outputs = self._convolution_op(inputs, self.kernel)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 838, in call
return self.conv_op(inp, filter)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 502, in call
return self.call(inp, filter)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 190, in call
name=self.name)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 639, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1625, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InternalError (see above for traceback): cuDNN launch failure : input shape([1,3,518,280]) filter shape([7,7,3,64])
[[Node: resnet_v1_101/conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/Pad, resnet_v1_101/conv1/weights/read)]]
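
The first error in the log above (cuda_dnn.cc:378) reports that the cuDNN library loaded at runtime is 7402, while the TensorFlow binary was compiled against 7004. As a rough sketch, those integers can be decoded into readable versions as follows. This assumes the cuDNN 7 encoding of major * 1000 + minor * 100 + patch; the function names `decode_cudnn_version` and `parse_cudnn_mismatch` are hypothetical helpers, not part of TensorFlow or cuDNN.

```python
import re

# The relevant part of the error message, copied from the log above.
LOG_LINE = (
    "Loaded runtime CuDNN library: 7402 (compatibility version 7400) "
    "but source was compiled with 7004 (compatibility version 7000)."
)


def decode_cudnn_version(code):
    """Split an integer such as 7402 into (major, minor, patch).

    Assumes the cuDNN 7 scheme: major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def parse_cudnn_mismatch(line):
    """Return (runtime_version, compiled_version) tuples, or None if the
    line does not look like the mismatch message."""
    match = re.search(
        r"Loaded runtime CuDNN library: (\d+).*?compiled with (\d+)", line
    )
    if match is None:
        return None
    runtime = decode_cudnn_version(int(match.group(1)))
    compiled = decode_cudnn_version(int(match.group(2)))
    return runtime, compiled


runtime, compiled = parse_cudnn_mismatch(LOG_LINE)
print("runtime cuDNN: ", ".".join(map(str, runtime)))
print("compiled cuDNN:", ".".join(map(str, compiled)))
```

Under that assumption, the runtime library decodes to 7.4.2 (consistent with the installed cuDNN 7.4.2.24) and the compiled-against version to 7.0.4. The "cuDNN launch failure" that follows in the log is likely a consequence of this mismatch, since the next warning says the DNN operation is being attempted without DNN support.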

@Tianyu97
Author

Could you please tell me which versions of CUDA, cuDNN, and TensorFlow you are using?
