
CUDA out of memory using your default model. #31

Open
Anurag-Swarnim-Yadav opened this issue Jul 7, 2021 · 1 comment
Anurag-Swarnim-Yadav commented Jul 7, 2021

I am trying to run your golden data set with the default model and parameters, with a batch size of 1, via sequencer-train.sh. In the paper you mention using a K80 GPU, which has 24GB of memory. I am using a GeForce RTX 2080 Ti, which has 11GB, so I am using two of them and set -world_size 2 and -gpu_ranks 0 1, but I still get CUDA out of memory. Could you please guide me on what the possible issue might be?

Traceback (most recent call last):
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/train.py", line 63, in run
    single_main(opt, device_id)
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/onmt/train_single.py", line 132, in main
    model = build_model(model_opt, opt, fields, checkpoint)
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/onmt/model_builder.py", line 301, in build_model
    model = build_base_model(model_opt, fields, use_gpu(opt), checkpoint)
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/onmt/model_builder.py", line 294, in build_base_model
    model.to(device)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 607, in to
    return self._apply(convert)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 376, in _apply
    param_applied = fn(param)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 605, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory
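One thing worth noting: OpenNMT-py's -world_size/-gpu_ranks options use data parallelism, which replicates the full model, gradients, and optimizer state on every GPU, so two 11GB cards do not behave like one 22GB card. A rough back-of-the-envelope sketch of the per-GPU footprint (the 100M parameter count and Adam-style optimizer below are illustrative assumptions, not SequenceR's actual configuration):

```python
# Rough per-GPU memory estimate for data-parallel training.
# Data parallelism (-world_size 2 -gpu_ranks 0 1) copies the FULL model,
# its gradients, and its optimizer state onto EACH GPU, so adding a
# second card does not shrink the per-GPU model footprint.

def training_bytes_per_gpu(n_params: int, bytes_per_float: int = 4,
                           optimizer_states: int = 2) -> int:
    """Bytes held per GPU for weights + gradients + optimizer state.

    optimizer_states=2 models an Adam-style optimizer, which keeps two
    extra fp32 tensors (first/second moments) per parameter.
    """
    copies = 1 + 1 + optimizer_states  # weights + grads + optimizer state
    return n_params * bytes_per_float * copies

# Hypothetical 100M-parameter seq2seq model (illustrative size only):
n_params = 100_000_000
gib = training_bytes_per_gpu(n_params) / (1024 ** 3)
print(f"~{gib:.1f} GiB per GPU before activations")
# Activation memory (which scales with batch size and sequence length)
# comes on top of this, and is usually the part that actually overflows.
```

If the model itself fits but activations do not, shrinking sequence length or using gradient accumulation (smaller effective batches) is usually more effective than adding a second data-parallel GPU.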
@SamraMehboob

Hi, did you find the reason behind this?
