
Perplexity validation on PG19 error and Passkey Retrieval error #19

Open · khfs opened this issue Jul 1, 2024 · 8 comments

@khfs commented Jul 1, 2024

I followed the environment setup in the README exactly. When performing perplexity validation on PG19, the only differences from the original code are that I loaded the model from a local path and set the device to 'cpu' to see the exact error messages. My command line was:

```
python test_ppl.py --seq_len 16384 --scale 7b --data_path pg19_llama2.validation.bin
```

The terminal output was:

```
The argument trust_remote_code is to be used with Auto classes. It has no effect here and is ignored.
Loading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00, 1.88it/s]
Test PPL on seq length 16384
  0%|          | 0/9446 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "test_ppl.py", line 102, in <module>
    evaluate_ppl_all(seq_length=args.seq_len, sliding_window=256, args=args, model=model, data=data)
  File "test_ppl.py", line 58, in evaluate_ppl_all
    outputs = model(
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1183, in forward
    outputs = self.model(
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1027, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 163, in forward
    return F.embedding(
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/functional.py", line 2264, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
```

When performing Passkey Retrieval, the only difference from the original code is that I loaded the model from a local path. My command line was:

```
python test_passkey.py --seq_len 16384 --scale 7b
```

The terminal output was:

```
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 11.13it/s]
Traceback (most recent call last):
  File "test_passkey.py", line 123, in <module>
    main(args)
  File "test_passkey.py", line 77, in main
    model = load_checkpoint_and_dispatch(model, checkpoint=model_path,
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/accelerate/big_modeling.py", line 607, in load_checkpoint_and_dispatch
    load_checkpoint_in_model(
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 1705, in load_checkpoint_in_model
    raise ValueError(
ValueError: /data3/xylu/checkpoints/NousResearch/Llama-2-7b-hf containing more than one .index.json file, delete the irrelevant ones.
```
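
A likely cause, assuming the local folder mirrors the NousResearch/Llama-2-7b-hf repo, is that it contains both a pytorch_model.bin.index.json and a model.safetensors.index.json, so accelerate cannot decide which sharded checkpoint to load. A minimal workaround sketch, not confirmed by the maintainers: load_checkpoint_and_dispatch also accepts the path of one specific .index.json file, which removes the ambiguity without deleting anything (the index filename below is an assumption about the local layout).

```python
# Hedged workaround sketch: point accelerate at one specific index file
# instead of the checkpoint folder. `model` is the model instance built
# earlier in test_passkey.py; the index filename is an assumption about
# which weight format you want to load.
import os
from accelerate import load_checkpoint_and_dispatch

model_path = "/data3/xylu/checkpoints/NousResearch/Llama-2-7b-hf"
index_file = os.path.join(model_path, "model.safetensors.index.json")
model = load_checkpoint_and_dispatch(
    model,
    checkpoint=index_file,  # a .index.json path is accepted by accelerate
    device_map="auto",
    no_split_module_classes=["LlamaDecoderLayer"],
)
```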

@ChenxinAn-fdu (Contributor) commented

Hi! Have you verified that your code runs without the ChunkLlama monkey patch?
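
For context, the "monkey patch" is ChunkLlama's attention replacement, applied before the model is loaded. A minimal sketch of the toggle being asked about, based on the call quoted later in this thread (the import path is an assumption about the repo layout):

```python
# Apply ChunkLlama's chunked-attention patch BEFORE instantiating the model.
# Commenting this call out falls back to vanilla Llama attention, which is
# useful for isolating data/loading bugs, but long-context PPL will then be
# meaningless because the model only has its 4096-token pretraining window.
from chunkllama_attn_replace import replace_with_chunkllama  # import path assumed

pretraining_length = 4096  # Llama-2's pretraining context window
replace_with_chunkllama(pretraining_length, pretraining_length // 4)
```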

@khfs (Author) commented Jul 2, 2024

Yes, I commented out line 85 in test_ppl.py and got the same error message. Later, after comparing this code with LongLoRA's eval.py, I found that changing np.uint32 to np.uint16 on line 98 of test_ppl.py allowed the code to run. However, the result for ChunkLlama2 7B was {"seq_len": 16384, "gpu": "1", "data_path": "pg19_llama2.validation.bin", "scale": "7b", "pretraining_length": 4096, "ppl": 1803.4413082318101}. That PPL is obviously not correct, and I don't know what other issues exist in the code.

@ChenxinAn-fdu (Contributor) commented Jul 3, 2024

Thank you for letting me know! I think this issue was caused by mistakenly uploading files tokenized with the Llama 3 tokenizer. I will check it right now.

@ChenxinAn-fdu (Contributor) commented

Hi! Changing

`data = {'val': np.memmap(data_path, dtype=np.uint32, mode='r')}`

to

`data = {'val': np.memmap(data_path, dtype=np.uint16, mode='r')}`

works for me. Remember not to comment out this line: `replace_with_chunkllama(args.pretraining_length, args.pretraining_length//4)`
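
For anyone hitting the same IndexError, a minimal sketch of why the dtype matters, assuming pg19_llama2.validation.bin stores 2-byte Llama-2 token ids (vocab size 32000 < 2^16):

```python
# Minimal sketch: reading 2-byte token ids as uint32 fuses adjacent token
# pairs into values far beyond the vocab size, which is exactly what raised
# "IndexError: index out of range in self" inside F.embedding.
import numpy as np

data = {"val": np.memmap("pg19_llama2.validation.bin", dtype=np.uint16, mode="r")}

# Sanity check before evaluating: every token id must index a valid row of
# the embedding table.
assert int(data["val"].max()) < 32000, "token ids exceed the Llama-2 vocab size"
```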

@khfs (Author) commented Jul 3, 2024

Although this allows the code to run, the result I obtained is {"seq_len": 16384, "gpu": "1", "data_path": "pg19_llama2.validation.bin", "scale": "7b", "pretraining_length": 4096, "ppl": 1803.4413082318101}, where the PPL is far too high. I therefore believe there is still an issue with the code. I am curious what results you get.

@ChenxinAn-fdu (Contributor) commented

I have updated the code. Please try the newest version 🤣.

@khfs (Author) commented Jul 3, 2024

Thank you for your response regarding the PG19 validation; I am currently testing the latest version of the code. How can I resolve the passkey retrieval error?

@ChenxinAn-fdu (Contributor) commented

test_passkey.py has also been updated!
