
Added mamba and test without set env #1623

Open
zzhang37 wants to merge 3 commits into v1.15-release
Conversation

zzhang37

What does this PR do?

This is to be merged into the v1.15-release branch; all issues have been fixed.

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@zzhang37 zzhang37 requested a review from regisss as a code owner December 17, 2024 17:55
@libinta libinta added the run-test Run CI for PRs from external contributors label Dec 17, 2024
@jiminha
Collaborator

jiminha commented Dec 17, 2024

OK, looks good. Validated on all these cases.

  1. GC_KERNEL_PATH=/root/.cache/huggingface/hub/models--Habana--mamba/blobs/libcustom_tpc_perf_lib.so:$GC_KERNEL_PATH python run_generation.py --model_name_or_path state-spaces/mamba-130m-hf --max_input_tokens 128 --max_new_tokens 128 --bf16 --use_hpu_graphs --use_kv_cache --batch_size 1024

Perf is good. (A sketch for resolving this kernel path without hard-coding the cache location follows this list.)

  2. python run_generation.py --model_name_or_path state-spaces/mamba-130m-hf --max_input_tokens 128 --max_new_tokens 128 --bf16 --use_hpu_graphs --use_kv_cache --batch_size 1024

No crash anymore; it falls back to the old way. Perf is half of case 1.

  3. PYTEST - all pass
    GAUDI2_CI=1 RUN_SLOW=1 python -m pytest tests/test_text_generation_example.py -s -v -k "mamba"
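
The GC_KERNEL_PATH prefix in case 1 points at a kernel blob inside the Hugging Face cache, whose location differs between machines. A minimal sketch, assuming the custom TPC library is published as libcustom_tpc_perf_lib.so in the Habana/mamba Hub repo (inferred from the cache path in the command, not confirmed by this PR), for resolving it programmatically instead of hard-coding the cache path:

    # Sketch only: resolve the cached kernel library with huggingface_hub and
    # prepend it to GC_KERNEL_PATH, mirroring the shell export in case 1 above.
    import os
    from huggingface_hub import hf_hub_download

    lib_path = hf_hub_download(repo_id="Habana/mamba", filename="libcustom_tpc_perf_lib.so")
    os.environ["GC_KERNEL_PATH"] = f"{lib_path}:{os.environ.get('GC_KERNEL_PATH', '')}"
    print(os.environ["GC_KERNEL_PATH"])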

@regisss
Collaborator

regisss commented Dec 20, 2024

(Quoting jiminha's validation comment above.)

When I run these commands with GC_KERNEL_PATH=/data/hub/models--Habana--mamba/blobs/libcustom_tpc_perf_lib.so:$GC_KERNEL_PATH (my HF cache is in /data/hub), I always get

Traceback (most recent call last):
  File "/root/workspace/fork/examples/text-generation/run_generation.py", line 779, in <module>
    main()
  File "/root/workspace/fork/examples/text-generation/run_generation.py", line 387, in main
    model, assistant_model, tokenizer, generation_config = initialize_model(args, logger)
  File "/root/workspace/fork/examples/text-generation/utils.py", line 686, in initialize_model
    setup_env(args)
  File "/root/workspace/fork/examples/text-generation/utils.py", line 152, in setup_env
    from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/modeling_utils.py", line 30, in <module>
    from .models import (
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/models/__init__.py", line 160, in <module>
    from .mamba import (
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/models/mamba/__init__.py", line 1, in <module>
    from .modeling_mamba import (
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/models/mamba/modeling_mamba.py", line 34, in <module>
    torch.ops.load_library(custom_op_lib_path)
  File "/usr/local/lib/python3.10/dist-packages/torch/_ops.py", line 1350, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /data/hub/models--Habana--mamba/blobs/bd24bf3235fbc8766779d7cbc78d9a842d1b775ca07f50f90cb515b935153ce1: failed to map segment from shared object

Any idea why this happens?

@zzhang37 This code should be conditional on whether Mamba is used or not. We don't want this piece of code to be executed when another model is run, as it will error out since the kernel has likely not been downloaded.
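
A minimal sketch of the guard being asked for here, assuming the load currently happens at import time in modeling_mamba.py (as the traceback above suggests): defer torch.ops.load_library to the point where a Mamba model actually needs the custom kernel, and fall back cleanly if the blob is missing or cannot be mapped. The helper name and structure are illustrative, not the PR's actual code.

    import os
    import torch

    _mamba_custom_ops_loaded = False

    def maybe_load_mamba_custom_ops(custom_op_lib_path):
        # Hypothetical helper: called only from the Mamba code path, so other
        # models never trigger the library load or require the kernel download.
        global _mamba_custom_ops_loaded
        if _mamba_custom_ops_loaded:
            return True
        if custom_op_lib_path and os.path.isfile(custom_op_lib_path):
            try:
                torch.ops.load_library(custom_op_lib_path)
                _mamba_custom_ops_loaded = True
            except OSError:
                # e.g. "failed to map segment from shared object": keep the
                # slower non-custom-kernel path instead of crashing.
                _mamba_custom_ops_loaded = False
        return _mamba_custom_ops_loaded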

@zzhang37
Author

@regisss I saw this issue. We see it when we run the docker container with
-v /checkpoint/huggingface/hub:/root/.cache/huggingface/hub
It happens because the folder is mapped to /root/.cache inside the docker container. If you remove that mount from the docker command, the issue goes away.
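
For reference, a minimal diagnostic sketch (an assumption, not taken from either comment): "failed to map segment from shared object" typically means the .so cannot be mapped with execute permission, for example when the bind-mounted filesystem carries the noexec flag. The path below mirrors the GC_KERNEL_PATH used in the commands above and is only illustrative.

    import os

    lib = "/data/hub/models--Habana--mamba/blobs/libcustom_tpc_perf_lib.so"
    print("exists:", os.path.exists(lib), "readable:", os.access(lib, os.R_OK))

    # Look for "noexec" on the mounts backing the cache directory.
    with open("/proc/mounts") as mounts:
        for line in mounts:
            if "/data" in line or "/root/.cache" in line:
                print(line.strip())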
