
CUDA out of memory using your default model. #31

Open
Anurag-Swarnim-Yadav opened this issue Jul 7, 2021 · 1 comment
Anurag-Swarnim-Yadav commented Jul 7, 2021

I am trying to run your golden data set with the default model and parameters, with a batch size of 1, via sequencer-train.sh. In the paper you mention using a K80 GPU, which has 24GB of memory. I am using a GeForce RTX 2080 Ti, which has 11GB, so I am using two of them and set -world_size 2 and -gpu_ranks 0 1, but I still get CUDA out of memory. Could you please guide me on what the possible issue might be?

Traceback (most recent call last):
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/train.py", line 63, in run
    single_main(opt, device_id)
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/onmt/train_single.py", line 132, in main
    model = build_model(model_opt, opt, fields, checkpoint)
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/onmt/model_builder.py", line 301, in build_model
    model = build_base_model(model_opt, fields, use_gpu(opt), checkpoint)
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/onmt/model_builder.py", line 294, in build_base_model
    model.to(device)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 607, in to
    return self._apply(convert)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 376, in _apply
    param_applied = fn(param)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 605, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory
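One thing worth noting: OpenNMT-py's -world_size/-gpu_ranks options use data parallelism, which replicates the full model, gradients, and optimizer state on every GPU, so two 11GB cards do not behave like one 22GB card. A rough back-of-the-envelope sketch of the per-GPU footprint (the 100M parameter count and Adam-style optimizer below are illustrative assumptions, not SequenceR's actual configuration):

```python
# Rough per-GPU memory estimate for data-parallel training.
# Data parallelism (-world_size 2 -gpu_ranks 0 1) copies the FULL model,
# its gradients, and its optimizer state onto EACH GPU, so adding a
# second card does not shrink the per-GPU model footprint.

def training_bytes_per_gpu(n_params: int, bytes_per_float: int = 4,
                           optimizer_states: int = 2) -> int:
    """Bytes held per GPU for weights + gradients + optimizer state.

    optimizer_states=2 models an Adam-style optimizer, which keeps two
    extra fp32 tensors (first/second moments) per parameter.
    """
    copies = 1 + 1 + optimizer_states  # weights + grads + optimizer state
    return n_params * bytes_per_float * copies

# Hypothetical 100M-parameter seq2seq model (illustrative size only):
n_params = 100_000_000
gib = training_bytes_per_gpu(n_params) / (1024 ** 3)
print(f"~{gib:.1f} GiB per GPU before activations")
# Activation memory (which scales with batch size and sequence length)
# comes on top of this, and is usually the part that actually overflows.
```

If the model itself fits but activations do not, shrinking sequence length or using gradient accumulation (smaller effective batches) is usually more effective than adding a second data-parallel GPU.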
@SamraMehboob

Hi, did you find the reason behind this?
