
Added mamba and test without set env #1623

Open
zzhang37 wants to merge 3 commits into v1.15-release
Conversation

zzhang37

What does this PR do?

This is to be merged into the v1.15-release branch; all issues have been fixed.

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@zzhang37 zzhang37 requested a review from regisss as a code owner December 17, 2024 17:55
@libinta libinta added the run-test Run CI for PRs from external contributors label Dec 17, 2024
@jiminha
Collaborator

jiminha commented Dec 17, 2024

OK, looks good. Validated on all these cases.

  1. GC_KERNEL_PATH=/root/.cache/huggingface/hub/models--Habana--mamba/blobs/libcustom_tpc_perf_lib.so:$GC_KERNEL_PATH python run_generation.py --model_name_or_path state-spaces/mamba-130m-hf --max_input_tokens 128 --max_new_tokens 128 --bf16 --use_hpu_graphs --use_kv_cache --batch_size 1024

Perf is good. (A sketch for resolving this kernel path without hard-coding the cache location follows this list.)

  2. python run_generation.py --model_name_or_path state-spaces/mamba-130m-hf --max_input_tokens 128 --max_new_tokens 128 --bf16 --use_hpu_graphs --use_kv_cache --batch_size 1024

No crash anymore; it falls back to the old way. Perf is half of case 1.

  3. PYTEST - all pass
    GAUDI2_CI=1 RUN_SLOW=1 python -m pytest tests/test_text_generation_example.py -s -v -k "mamba"
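
The GC_KERNEL_PATH prefix in case 1 points at a kernel blob inside the Hugging Face cache, whose location differs between machines. A minimal sketch, assuming the custom TPC library is published as libcustom_tpc_perf_lib.so in the Habana/mamba Hub repo (inferred from the cache path in the command, not confirmed by this PR), for resolving it programmatically instead of hard-coding the cache path:

    # Sketch only: resolve the cached kernel library with huggingface_hub and
    # prepend it to GC_KERNEL_PATH, mirroring the shell export in case 1 above.
    import os
    from huggingface_hub import hf_hub_download

    lib_path = hf_hub_download(repo_id="Habana/mamba", filename="libcustom_tpc_perf_lib.so")
    os.environ["GC_KERNEL_PATH"] = f"{lib_path}:{os.environ.get('GC_KERNEL_PATH', '')}"
    print(os.environ["GC_KERNEL_PATH"])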

@regisss
Collaborator

regisss commented Dec 20, 2024

(Quoting jiminha's validation comment above.)

When I run these commands with GC_KERNEL_PATH=/data/hub/models--Habana--mamba/blobs/libcustom_tpc_perf_lib.so:$GC_KERNEL_PATH (my HF cache is in /data/hub), I always get

Traceback (most recent call last):
  File "/root/workspace/fork/examples/text-generation/run_generation.py", line 779, in <module>
    main()
  File "/root/workspace/fork/examples/text-generation/run_generation.py", line 387, in main
    model, assistant_model, tokenizer, generation_config = initialize_model(args, logger)
  File "/root/workspace/fork/examples/text-generation/utils.py", line 686, in initialize_model
    setup_env(args)
  File "/root/workspace/fork/examples/text-generation/utils.py", line 152, in setup_env
    from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/modeling_utils.py", line 30, in <module>
    from .models import (
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/models/__init__.py", line 160, in <module>
    from .mamba import (
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/models/mamba/__init__.py", line 1, in <module>
    from .modeling_mamba import (
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/models/mamba/modeling_mamba.py", line 34, in <module>
    torch.ops.load_library(custom_op_lib_path)
  File "/usr/local/lib/python3.10/dist-packages/torch/_ops.py", line 1350, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /data/hub/models--Habana--mamba/blobs/bd24bf3235fbc8766779d7cbc78d9a842d1b775ca07f50f90cb515b935153ce1: failed to map segment from shared object

Any idea why this happens?

@zzhang37 This code should be conditional on whether Mamba is used or not. We don't want this piece of code to be executed when another model is run, as it will error out since the kernel has likely not been downloaded.
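
A minimal sketch of the guard being asked for here, assuming the load currently happens at import time in modeling_mamba.py (as the traceback above suggests): defer torch.ops.load_library to the point where a Mamba model actually needs the custom kernel, and fall back cleanly if the blob is missing or cannot be mapped. The helper name and structure are illustrative, not the PR's actual code.

    import os
    import torch

    _mamba_custom_ops_loaded = False

    def maybe_load_mamba_custom_ops(custom_op_lib_path):
        # Hypothetical helper: called only from the Mamba code path, so other
        # models never trigger the library load or require the kernel download.
        global _mamba_custom_ops_loaded
        if _mamba_custom_ops_loaded:
            return True
        if custom_op_lib_path and os.path.isfile(custom_op_lib_path):
            try:
                torch.ops.load_library(custom_op_lib_path)
                _mamba_custom_ops_loaded = True
            except OSError:
                # e.g. "failed to map segment from shared object": keep the
                # slower non-custom-kernel path instead of crashing.
                _mamba_custom_ops_loaded = False
        return _mamba_custom_ops_loaded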

@zzhang37
Author

@regisss I saw this issue. We see it when we run the docker container with
-v /checkpoint/huggingface/hub:/root/.cache/huggingface/hub
It happens because the folder is mapped to /root/.cache inside the docker container. If you remove that mount from the docker command, the issue goes away.
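
For reference, a minimal diagnostic sketch (an assumption, not taken from either comment): "failed to map segment from shared object" typically means the .so cannot be mapped with execute permission, for example when the bind-mounted filesystem carries the noexec flag. The path below mirrors the GC_KERNEL_PATH used in the commands above and is only illustrative.

    import os

    lib = "/data/hub/models--Habana--mamba/blobs/libcustom_tpc_perf_lib.so"
    print("exists:", os.path.exists(lib), "readable:", os.access(lib, os.R_OK))

    # Look for "noexec" on the mounts backing the cache directory.
    with open("/proc/mounts") as mounts:
        for line in mounts:
            if "/data" in line or "/root/.cache" in line:
                print(line.strip())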
