🐛 Bug
The MLCEngine model parameter ends up as a tuple internally, and that case is not handled correctly.
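For context, the value captured in the PDB session below is a one-element tuple. In plain Python, a trailing comma on an assignment is one way such a tuple can arise (this is general language behavior, not mlc_llm code):

model = "./dist/prebuilt/Llama-3-8B-Instruct-q4f16_1-MLC",  # note the trailing comma
print(type(model))   # <class 'tuple'>
print(model)         # ('./dist/prebuilt/Llama-3-8B-Instruct-q4f16_1-MLC',)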
To Reproduce
Steps to reproduce the behavior:
1. Create a script as shown in the code sample below
2. Run the script
from mlc_llm import MLCEngine

# Create engine
model = "./dist/prebuilt/Llama-3-8B-Instruct-q4f16_1-MLC",
model_lib="./dist/prebuilt/lib/Llama-3-8b-Instruct/Llama-3-8B-Instruct-q4f16_1-mali.so",
device="opencl"
engine = MLCEngine(model=model, model_lib=model_lib, device=device)

# Run chat completion in OpenAI API.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print("\n")

engine.terminate()
Output:
Traceback (most recent call last):
  File "/home/hank/repos/mlc-llm/test.py", line 7, in <module>
    engine = MLCEngine(model=model, model_lib=model_lib, device=device)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hank/repos/mlc-llm/python/mlc_llm/serve/engine.py", line 1466, in __init__
    super().__init__(
  File "/home/hank/repos/mlc-llm/python/mlc_llm/serve/engine_base.py", line 590, in __init__
    ) = _process_model_args(models, device, engine_config)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hank/repos/mlc-llm/python/mlc_llm/serve/engine_base.py", line 171, in _process_model_args
    model_args: List[Tuple[str, str]] = [_convert_model_info(model) for model in models]
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hank/repos/mlc-llm/python/mlc_llm/serve/engine_base.py", line 171, in <listcomp>
    model_args: List[Tuple[str, str]] = [_convert_model_info(model) for model in models]
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hank/repos/mlc-llm/python/mlc_llm/serve/engine_base.py", line 124, in _convert_model_info
    model_path = download_cache.get_or_download_model(model.model)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hank/repos/mlc-llm/python/mlc_llm/support/download_cache.py", line 226, in get_or_download_model
    if model.startswith("HF://"):
       ^^^^^^^^^^^^^^^^
AttributeError: 'tuple' object has no attribute 'startswith'
PDB run:
> /home/hank/repos/mlc-llm/python/mlc_llm/support/download_cache.py(227)get_or_download_model()
-> if model.startswith("HF://"):
(Pdb) model
('./dist/prebuilt/Llama-3-8B-Instruct-q4f16_1-MLC',)
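Wherever the tuple comes from, get_or_download_model assumes its argument is a str. Below is a minimal sketch of the kind of early validation that could replace the opaque AttributeError with an actionable message; _validate_model_arg is a hypothetical helper for illustration, not part of mlc_llm:

# Hypothetical sketch only: mlc_llm has no _validate_model_arg helper.
def _validate_model_arg(model) -> str:
    """Reject non-str model arguments before they reach download_cache."""
    if isinstance(model, tuple):
        # A one-element tuple is exactly the value captured in the PDB session above.
        raise TypeError(
            f"model must be a str, got tuple {model!r}; "
            "check for a stray trailing comma in the assignment"
        )
    if not isinstance(model, str):
        raise TypeError(f"model must be a str, got {type(model).__name__}")
    return model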
Expected behavior
Expected the local model to load correctly, since this use of MLCEngine directly matches the documentation.
Environment
How you installed MLC-LLM (conda, source): source
How you installed TVM-Unity (pip, source): source

Follow-up comment: Thanks! The model does work when using HF:// URLs.
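For comparison, here is a minimal sketch of the invocation that does work, where the model string goes through download_cache as an HF:// URL; the repo path is illustrative and should be replaced with the prebuilt model you actually use:

from mlc_llm import MLCEngine

# Works: the HF:// URL reaches get_or_download_model as a plain str.
engine = MLCEngine(model="HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC")

for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print("\n")

engine.terminate()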