
Max Input and Output length #22

Open
shashank140195 opened this issue Aug 5, 2023 · 2 comments

@shashank140195

In finetune_for_summarization.py, why are max_source_length, train_max_target_length, and eval_max_target_length set to a default of 510? Is 510 the maximum input length BioMedLM can take, and can it generate at most 510 tokens? As soon as I increase a value above this default, I get the error below.

max_source_length: Optional[int] = field(
    default=510, metadata={"help": "the max source length of summarization data. "}
)
train_max_target_length: Optional[int] = field(
    default=510, metadata={"help": "the max target length for training data. "}
)
eval_max_target_length: Optional[int] = field(
    default=510, metadata={"help": "the max target length for dev data. "}
)

Error:
Traceback (most recent call last):
  File "finetune_for_summarization.py", line 168, in <module>
    finetune()
  File "finetune_for_summarization.py", line 162, in finetune
    trainer.train()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/trainer.py", line 1534, in train
    return inner_training_loop(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/trainer.py", line 1807, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/trainer.py", line 2649, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/trainer.py", line 2674, in compute_loss
    outputs = model(**inputs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1769, in forward
    loss = self.module(*inputs, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 1075, in forward
    transformer_outputs = self.transformer(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 843, in forward
    position_embeds = self.wpe(position_ids)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 162, in forward
    return F.embedding(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
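For reference, the assert fires because position_ids runs past the model's learned position-embedding table (self.wpe). One way to check the table size up front is to read it from the config; a minimal sketch, assuming the model is loaded from the Hugging Face checkpoint stanford-crfm/BioMedLM:

from transformers import AutoConfig

# BioMedLM uses the GPT-2 architecture, so its context window is n_positions.
config = AutoConfig.from_pretrained("stanford-crfm/BioMedLM")
print(config.n_positions)  # 1024 for this model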

@J38
Contributor

J38 commented Aug 6, 2023

The model was trained with a fixed context length of 1024, so the source, target and extra tokens have to fit within that size.
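In other words, the 510 defaults leave room for the tokens the script adds when joining source and target: 510 + 510 + 2 = 1022 ≤ 1024. A minimal sketch of that budget check (the two extra tokens are an assumption about the separator/end-of-text handling; adjust to match finetune_for_summarization.py):

CONTEXT_LENGTH = 1024  # fixed by BioMedLM's position embeddings
EXTRA_TOKENS = 2       # e.g. separator + end-of-text (assumed)

max_source_length = 510
train_max_target_length = 510

budget = max_source_length + train_max_target_length + EXTRA_TOKENS
assert budget <= CONTEXT_LENGTH, (
    f"source + target + extra = {budget} tokens, but the model "
    f"only has {CONTEXT_LENGTH} positions"
)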

@shashank140195
Author

Makes sense. Thank you.
