update README
baishihao committed Oct 22, 2024
1 parent e1e7e0a commit 31ec2cb
Showing 1 changed file with 6 additions and 3 deletions.
9 changes: 6 additions & 3 deletions README.md
@@ -162,11 +162,14 @@ python -m lightllm.server.api_server --model_dir /path/llama-7B \
--max_total_token_num 120000
~~~

-The parameter `max_total_token_num` is influenced by the GPU memory of the deployment environment. Use the following script to get the recommended values
+The parameter `max_total_token_num` is influenced by the GPU memory of the deployment environment. You can also specify `--mem_fraction` to have it calculated automatically.

~~~shell
-python -m lightllm.utils.profile_max_tokens --model_dir /path/llama-7B \
-                                            --tp 1
+python -m lightllm.server.api_server --model_dir /path/llama-7B \
+                                     --host 0.0.0.0 \
+                                     --port 8080 \
+                                     --tp 1 \
+                                     --mem_fraction 0.9
~~~
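
To give a feel for why the token budget depends on GPU memory, here is a minimal back-of-the-envelope sketch. It is not LightLLM's actual calculation: the function name `estimate_max_total_tokens`, the assumption that all memory left after the weights goes to the KV cache, and the example numbers (80 GiB GPU, roughly 13 GiB of fp16 weights, LLaMA-7B-like shapes) are all illustrative assumptions.

```python
def estimate_max_total_tokens(gpu_mem_bytes, mem_fraction, weight_bytes,
                              num_layers, num_kv_heads, head_dim, dtype_bytes=2):
    """Rough token budget: memory left after the weights, divided by KV-cache bytes per token.

    Hypothetical helper for illustration only; it ignores activations,
    fragmentation, and any framework overhead.
    """
    # Each cached token stores one key and one value vector per layer.
    kv_bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * dtype_bytes
    usable = gpu_mem_bytes * mem_fraction - weight_bytes
    if usable <= 0:
        raise ValueError("model weights do not fit in the requested memory fraction")
    return int(usable // kv_bytes_per_token)


GIB = 1024 ** 3

# Assumed example values, not measurements.
tokens = estimate_max_total_tokens(
    gpu_mem_bytes=80 * GIB,
    mem_fraction=0.9,
    weight_bytes=13 * GIB,
    num_layers=32,
    num_kv_heads=32,
    head_dim=128,
)
print(tokens)
```

Under these assumptions, lowering the memory fraction or raising the weight size shrinks the budget linearly, which is why a fixed `--max_total_token_num` that works on one GPU may not fit on a smaller one.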

To initiate a query in the shell: