When I used an NVIDIA 1080 Ti, I was able to compile gym_tensorflow.so and run the experiment. The environment was tensorflow-gpu 1.8.0 with CUDA 9.0. But after switching to a 2080 Ti, the experiment fails as follows:
2019-06-02 18:03:10.727082: E tensorflow/stream_executor/cuda/cuda_blas.cc:654] failed to run cuBLAS routine cublasSgemmBatched: CUBLAS_STATUS_EXECUTION_FAILED
2019-06-02 18:03:10.727109: E tensorflow/stream_executor/cuda/cuda_blas.cc:2413] Internal: failed BLAS call, see log for details
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Blas xGEMMBatched launch failed : a.shape=[1,441,256], b.shape=[64,256,16], idx.shape=[1], m=441, n=16, k=256, batch_size=1
[[Node: model/Model/conv1/IndexedBatchMatMul = IndexedBatchMatMul[T=DT_FLOAT, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model/Model/conv1/Reshape_2, model/Model/conv1/Reshape, _arg_model/Placeholder_0_0/_37)]]
[[Node: model/Identity_1/_59 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_210_model/Identity_1", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/root/hz/deep-neuroevolution/gpu_implementation/neuroevolution/concurrent_worker.py", line 94, in _loop
rews, is_done, _ = self.sess.run([self.rew_op, self.done_op, self.incr_counter], {self.placeholder_indices: indices})
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
I tried upgrading tensorflow-gpu to 1.12 and 1.13, but in those environments gym_tensorflow.so could not be compiled. I also found a similar issue at https://github.com/qqwweee/keras-yolo3/issues/332, but I still had no luck after installing the patches for CUDA 9.
I wonder if there is a solution.
I think 2080ti requires CUDA 10, and we haven't tested this code with CUDA 10 yet so there might be some problems.
I'll see if we can fix it and I'll let you know if we find a solution.
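In the meantime, here is a minimal sketch of the version mismatch I suspect. The table and the `cuda_supports_gpu` helper are hypothetical illustrations, not part of this repo or of TensorFlow; they only encode the first CUDA toolkit release that natively targets each architecture (1080 Ti is compute capability 6.1, 2080 Ti is 7.5).

```python
# Hypothetical helper: first CUDA toolkit release with native support
# for each compute capability (not an API of this repo or TensorFlow).
MIN_CUDA_FOR_CAPABILITY = {
    (6, 1): (8, 0),   # Pascal, e.g. GTX 1080 Ti
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing, e.g. RTX 2080 Ti
}


def cuda_supports_gpu(cuda_version, capability):
    """Return True if the toolkit version natively targets the capability."""
    return cuda_version >= MIN_CUDA_FOR_CAPABILITY[capability]


print(cuda_supports_gpu((9, 0), (6, 1)))  # 1080 Ti with CUDA 9.0
print(cuda_supports_gpu((9, 0), (7, 5)))  # 2080 Ti with CUDA 9.0
```

If the second check comes out false for your setup, that would be consistent with the cuBLAS failure you are seeing, though I have not confirmed this is the actual cause.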