Getting errors when trying to run llama.cpp example. #977
Thanks for reporting! I'm not able to reproduce this one. Did you make any changes to the code in the example?
PS: I noticed that the model download in this one was pretty slow -- it was just cURLing Hugging Face. I switched it over to a faster download, which should make it easier to iterate on the Image definition.
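For anyone else hitting the slow download, here is a minimal sketch of what a faster Hugging Face download can look like from a Modal function, assuming the image has `huggingface_hub` (with the optional `hf_transfer` extra) installed; the app name and the `repo_id`/`filename` arguments are placeholders, not necessarily what the example itself uses:

```python
# Sketch only: a hf_hub_download-based download step, assuming the image has
# huggingface_hub installed. The repo_id/filename passed in are placeholders.
import modal

image = (
    modal.Image.debian_slim()
    .pip_install("huggingface_hub[hf_transfer]")
    .env({"HF_HUB_ENABLE_HF_TRANSFER": "1"})  # use the accelerated transfer backend
)
app = modal.App("llama-cpp-download-sketch", image=image)


@app.function(timeout=30 * 60)
def download_model(repo_id: str, filename: str) -> str:
    from huggingface_hub import hf_hub_download

    # hf_hub_download caches the file and returns its local path,
    # so re-running the app doesn't re-fetch the full GGUF file.
    return hf_hub_download(repo_id=repo_id, filename=filename, local_dir="/")
```

Calling `download_model.remote("<repo_id>", "<model>.gguf")` then returns the path of the downloaded file inside the container.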
Same error when I tried to use it for the first time: it errored on Mac, I switched to Linux, and got the same error. Full error message:

uvx modal run 06_gpu_and_ml/llm-serving/llama_cpp.py
✓ Initialized. View run at https://modal.com/apps/mpr1255/main/ap-B34Wncw2b37KobBEXJFjCg
✓ Created objects.
├── 🔨 Created mount /home/ubuntu/modal_test/modal-examples/06_gpu_and_ml/llm-serving/llama_cpp.py
├── 🔨 Created function download_model.
└── 🔨 Created function llama_cpp_inference.
/build/bin/llama-cli: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /build/bin/llama-cli)
/build/bin/llama-cli: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.13' not found (required by /build/bin/llama-cli)
/build/bin/llama-cli: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /build/bin/llama-cli)
/build/bin/llama-cli: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /build/bin/llama-cli)
/build/bin/llama-cli: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /build/bin/llama-cli)
Traceback (most recent call last):
File "/pkg/modal/_runtime/container_io_manager.py", line 741, in handle_input_exception
yield
File "/pkg/modal/_container_entrypoint.py", line 240, in run_input_sync
res = io_context.call_finalized_function()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/pkg/modal/_runtime/container_io_manager.py", line 180, in call_finalized_function
res = self.finalized_function.callable(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/llama_cpp.py", line 80, in llama_cpp_inference
subprocess.run(
File "/usr/local/lib/python3.11/subprocess.py", line 569, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/build/bin/llama-cli', '-m', '/Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf', '-n', '128', '-p', 'Write a poem about New York City.\n']' returned non-zero exit status 1.
Stopping app - uncaught exception raised locally: CalledProcessError(1, ['/build/bin/llama-cli', '-m', '/Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf', '-n', '128', '-p', 'Write a poem about New York City.\n']).
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/ubuntu/modal_test/modal-examples/06_gpu_and_ml/llm-serving/llama_cpp.py:96 in main │
│ │
│ 95 def main(prompt: str = None, num_output_tokens: int = None): │
│ ❱ 96 │ llama_cpp_inference.remote(prompt, num_output_tokens) │
│ 97 │
│ │
│ ...Remote call to Modal Function (ta-01JHMPKBM99D9GWFX1MKQ3Z8CB)... │
│ │
│ /root/llama_cpp.py:80 in llama_cpp_inference │
│ │
│ ❱ 80 subprocess.run( │
│ │
│ │
│ /usr/local/lib/python3.11/subprocess.py:569 in run │
│ │
│ ❱ 569 raise CalledProcessError(retcode, process.args, │
│ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['/build/bin/llama-cli', '-m', '/Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf', '-n', '128', '-p', 'Write a poem about New York City.\n']' returned non-zero exit status 1.
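The `GLIBCXX_3.4.29' not found` and `GLIBC_2.3x' not found` lines indicate that the prebuilt `llama-cli` binary was linked against a newer glibc/libstdc++ than the ones present in the container image that runs it. One way to avoid that mismatch, sketched below under the assumption that the example is free to compile llama.cpp itself rather than reuse a prebuilt binary (this is not necessarily the fix the maintainers shipped), is to build `llama-cli` inside the same Modal Image that later executes it:

```python
# Sketch only, not the example's actual fix: build llama.cpp inside the same
# image that runs it, so llama-cli links against that image's glibc/libstdc++.
import modal

llama_image = (
    modal.Image.debian_slim(python_version="3.11")
    .apt_install("git", "build-essential", "cmake", "curl", "libcurl4-openssl-dev")
    .run_commands(
        "git clone https://github.com/ggerganov/llama.cpp /llama.cpp",
        "cmake /llama.cpp -B /llama.cpp/build -DCMAKE_BUILD_TYPE=Release",
        "cmake --build /llama.cpp/build --target llama-cli -j 8",
        # Put the binary where the example expects it.
        "mkdir -p /build/bin && cp /llama.cpp/build/bin/llama-cli /build/bin/",
    )
)
```

Running `objdump -T /build/bin/llama-cli | grep GLIBC` inside the image would also show which symbol versions the binary expects versus what the image's libc actually provides.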
We updated the llama.cpp example to run DeepSeek-R1 on GPU. There's also a (new) code path for running Phi-4 on CPU. If the same error recurs there, please re-open and I'll investigate!