CUDA bug #48

Closed
marcelclaro opened this issue Nov 7, 2023 · 1 comment
marcelclaro commented Nov 7, 2023

Describe the bug

Trace
.//set.*
.//set.*
.//set.*
initial rotate H or S func.
Traceback (most recent call last):
  File "/home/marcel/.local/bin/dptb", line 8, in <module>
    sys.exit(main())
  File "/home/marcel/.local/lib/python3.10/site-packages/dptb/entrypoints/main.py", line 317, in main
    train(**dict_args)
  File "/home/marcel/.local/lib/python3.10/site-packages/dptb/entrypoints/train.py", line 276, in train
    trainer.run(trainer.num_epoch)
  File "/home/marcel/.local/lib/python3.10/site-packages/dptb/nnops/base_trainer.py", line 52, in run
    self.train()
  File "/home/marcel/.local/lib/python3.10/site-packages/dptb/nnops/train_nnsk.py", line 245, in train
    self.optimizer.step(closure)
  File "/home/marcel/.local/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 68, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/marcel/.local/lib/python3.10/site-packages/torch/optim/optimizer.py", line 373, in wrapper
    out = func(*args, **kwargs)
  File "/home/marcel/.local/lib/python3.10/site-packages/torch/optim/optimizer.py", line 76, in _use_grad
    ret = func(self, *args, **kwargs)
  File "/home/marcel/.local/lib/python3.10/site-packages/torch/optim/adam.py", line 143, in step
    loss = closure()
  File "/home/marcel/.local/lib/python3.10/site-packages/dptb/nnops/train_nnsk.py", line 229, in closure
    pred, label = self.calc(*data, decompose=self.decompose)
  File "/home/marcel/.local/lib/python3.10/site-packages/dptb/nnops/train_nnsk.py", line 179, in calc
    self.hamileig.get_hs_blocks(bonds_onsite=bond_onsites,
  File "/home/marcel/.local/lib/python3.10/site-packages/dptb/hamiltonian/hamil_eig_sk_crt.py", line 276, in get_hs_blocks
    onsiteH, onsiteS, bonds_onsite = self.get_hs_onsite(bonds_onsite=bonds_onsite, onsite_envs=onsite_envs)
  File "/home/marcel/.local/lib/python3.10/site-packages/dptb/hamiltonian/hamil_eig_sk_crt.py", line 161, in get_hs_onsite
    sub_hamil_block[ist:ist+norbi, ist:ist+norbi] = th.eye(norbi, dtype=self.dtype, device=self.device) * self.onsiteEs[ib][indx]
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
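
The last frame combines three tensors: the identity matrix created on self.device, the value read from self.onsiteEs, and the target sub_hamil_block. The error means at least one of them was still on the CPU while another was on cuda:0. The following is a minimal sketch of that pattern and one way to keep all operands on a single device. It is not DeePTB's code: the function name fill_onsite_block, the tensor shapes, and the choice to treat the onsite energies as a 1-D tensor are assumptions made for illustration.

```python
import torch


def fill_onsite_block(block, onsite_energies, start, device, dtype=torch.float32):
    """Write a diagonal onsite sub-block after moving every operand to `device`.

    Hypothetical helper for illustration only; it mirrors the shape of the
    failing assignment in the traceback, not the actual DeePTB implementation.
    """
    # Move both inputs to the target device so no CPU/GPU mix can occur.
    energies = onsite_energies.to(device=device, dtype=dtype)
    block = block.to(device=device, dtype=dtype)

    n = energies.shape[0]
    # eye(n) * energies broadcasts to a diagonal matrix of the energies.
    block[start:start + n, start:start + n] = (
        torch.eye(n, dtype=dtype, device=device) * energies
    )
    return block


# Falls back to the CPU when no GPU is available, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

block = torch.zeros(4, 4)              # created on the CPU by default
energies = torch.tensor([-1.0, 0.5])   # created on the CPU by default

out = fill_onsite_block(block, energies, start=1, device=device)
```

Without the two .to(...) calls, the same assignment would raise the RuntimeError shown above whenever device is a CUDA device, because block and energies would stay on the CPU.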

Expected behavior

Training should run on the GPU when "device" is set to "cuda".

To Reproduce

input.json:
"device": "cuda",
"dtype": "float32",

Environment

Ubuntu 22.04
PyTorch 2.1

Additional Context

No response

QG-phy added the Bug Report label Nov 8, 2023

floatingCatty (Member) commented

Hello!
Thanks for the report. We are currently fixing the CUDA issue. The new features, which vastly accelerate the computation by refactoring the SKTB Hamiltonian construction, will be launched soon. We will let you know when this is completed.

For now, a quick workaround is to run DeePTB on the CPU.
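
A minimal sketch of applying that workaround to an existing input file, assuming the only change needed is the value of the "device" key. The report shows "device" and "dtype" but not where they sit inside input.json, so the hypothetical helper force_device walks the whole document instead of assuming a path; the enclosing "common_options" key in the sample is likewise an assumption.

```python
import json


def force_device(config, device="cpu"):
    """Return a copy of `config` with every "device" key set to `device`.

    Hypothetical helper: it recurses through dicts and lists because the
    nesting of "device" in input.json is not shown in the report.
    """
    if isinstance(config, dict):
        return {
            key: device if key == "device" else force_device(value, device)
            for key, value in config.items()
        }
    if isinstance(config, list):
        return [force_device(item, device) for item in config]
    return config


# Only "device" and "dtype" come from the report; the wrapping
# "common_options" section is assumed for illustration.
original = json.loads('{"common_options": {"device": "cuda", "dtype": "float32"}}')
patched = force_device(original)

print(json.dumps(patched, indent=4))
```

The helper returns a new structure and leaves the loaded one untouched, so the original settings can be restored once CUDA support is fixed.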

Cheers!

QG-phy closed this as completed Dec 22, 2023