train.py ERROR #172

Open
shudongW opened this issue Oct 21, 2024 · 0 comments


Could not load library libcudnn_cnn_train.so.8. Error: /usr/local/cuda/lib/libcudnn_cnn_train.so.8: symbol _ZN5cudnn3cnn34layerNormFwd_execute_internal_implERKNS_7backend11VariantPackEP11CUstream_stRNS0_18LayerNormFwdParamsERKNS1_20NormForwardOperationEmb, version libcudnn_cnn_infer.so.8 not defined in file libcudnn_cnn_infer.so.8 with link time reference
[rank0]: Traceback (most recent call last):
[rank0]: File "/root/Style-Bert-VITS2/train_ms_jp_extra.py", line 1130, in <module>
[rank0]: run()
[rank0]: File "/root/Style-Bert-VITS2/train_ms_jp_extra.py", line 557, in run
[rank0]: train_and_evaluate(
[rank0]: File "/root/Style-Bert-VITS2/train_ms_jp_extra.py", line 819, in train_and_evaluate
[rank0]: scaler.scale(loss_slm).backward()
[rank0]: File "/root/Style-Bert-VITS2/venv/lib/python3.10/site-packages/torch/_tensor.py", line 525, in backward
[rank0]: torch.autograd.backward(
[rank0]: File "/root/Style-Bert-VITS2/venv/lib/python3.10/site-packages/torch/autograd/__init__.py", line 267, in backward
[rank0]: _engine_run_backward(
[rank0]: File "/root/Style-Bert-VITS2/venv/lib/python3.10/site-packages/torch/autograd/graph.py", line 744, in _engine_run_backward
[rank0]: return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
[rank0]: RuntimeError: GET was unable to find an engine to execute this computation

ls /usr/local/cuda/lib/
libcudnn_adv_infer.so libcudnn_adv_train_static.a libcudnn_cnn_train.so.8 libcudnn_ops_infer_static_v8.a libcudnn.so.8.9.5
libcudnn_adv_infer.so.8 libcudnn_adv_train_static_v8.a libcudnn_cnn_train.so.8.9.5 libcudnn_ops_train.so libnccl.so
libcudnn_adv_infer.so.8.9.5 libcudnn_cnn_infer.so libcudnn_cnn_train_static.a libcudnn_ops_train.so.8 libnccl.so.2
libcudnn_adv_infer_static.a libcudnn_cnn_infer.so.8 libcudnn_cnn_train_static_v8.a libcudnn_ops_train.so.8.9.5 libnccl.so.2.22.3
libcudnn_adv_infer_static_v8.a libcudnn_cnn_infer.so.8.9.5 libcudnn_ops_infer.so libcudnn_ops_train_static.a libnccl_static.a
libcudnn_adv_train.so libcudnn_cnn_infer_static.a libcudnn_ops_infer.so.8 libcudnn_ops_train_static_v8.a pkgconfig
libcudnn_adv_train.so.8 libcudnn_cnn_infer_static_v8.a libcudnn_ops_infer.so.8.9.5 libcudnn.so
libcudnn_adv_train.so.8.9.5 libcudnn_cnn_train.so libcudnn_ops_infer_static.a libcudnn.so.8
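
One possible way to narrow this down, assuming (not confirmed here) that the system cuDNN 8.9.5 in /usr/local/cuda/lib is being loaded in place of a different cuDNN build that PyTorch expects: list every directory on LD_LIBRARY_PATH that contains cuDNN shared libraries and check whether more than one copy is visible. The helper names `find_cudnn_libs` and `ld_library_dirs` are made up for this sketch, and adding /usr/local/cuda/lib to the search is only based on the path in the error message above.

```python
import os
from pathlib import Path


def find_cudnn_libs(search_dirs):
    """Map each existing directory to the sorted cuDNN shared libraries found in it."""
    found = {}
    for d in search_dirs:
        p = Path(d)
        if not p.is_dir():
            continue  # skip entries that do not exist on this machine
        names = sorted(f.name for f in p.glob("libcudnn*.so*"))
        if names:
            found[str(p)] = names
    return found


def ld_library_dirs(env=None):
    """Return the non-empty entries of LD_LIBRARY_PATH, in order."""
    env = os.environ if env is None else env
    return [d for d in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]


if __name__ == "__main__":
    # /usr/local/cuda/lib is taken from the error message; adjust if needed.
    dirs = ld_library_dirs() + ["/usr/local/cuda/lib"]
    for directory, names in find_cudnn_libs(dirs).items():
        print(directory)
        for name in names:
            print("   ", name)
```

If more than one directory shows up, comparing the versions in the file names against the value of `torch.backends.cudnn.version()` inside the venv may show which copy is the mismatched one.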
