
Trust_remote_code not passed to router, TGI launcher gets stuck if model tokenizer has custom code #2649

Open · tanyinyan opened this issue Oct 15, 2024 · 0 comments · May be fixed by #2664
tanyinyan commented Oct 15, 2024

System Info

TGI version: v2.3.1
Model: Baichuan2-7B
The full command line (with --trust-remote-code set):

export USE_PREFIX_CACHING=0
export ATTENTION=paged
export ASCEND_RT_VISIBLE_DEVICES=4
model_id=/home/data/models/baichuan2-7b

text-generation-launcher \
--model-id $model_id \
--port 12359 \
--max-input-length 2048 \
--max-total-tokens 2560 \
--max-batch-prefill-tokens 4096 \
--max-waiting-tokens 20 \
--max-concurrent-requests 200 \
--waiting-served-ratio 1.2 \
--trust-remote-code

From the log, we can see that trust_remote_code is set:

2024-10-15T07:47:47.879952Z  WARN text_generation_launcher: `trust_remote_code` is set. Trusting that model `/home/data/models/baichuan2-7b` do not contain malicious code.
2024-10-15T07:47:47.880230Z  INFO download: text_generation_launcher: Starting check and download process for /home/data/models/baichuan2-7b
2024-10-15T07:47:52.691488Z  INFO text_generation_launcher: Files are already present on the host. Skipping download.
2024-10-15T07:47:53.696712Z  INFO download: text_generation_launcher: Successfully downloaded weights for /home/data/models/baichuan2-7b

But after warmup it gets stuck and suggests trust_remote_code=True: the router reloads the tokenizer through Python, and when the repository ships custom code with trust_remote_code unset, transformers falls back to an interactive stdin prompt that can never be answered inside the router process:

2024-10-15T07:48:19.445156Z  INFO text_generation_router_v3: backends/v3/src/lib.rs:102: Setting max batch total tokens to 25344
2024-10-15T07:48:19.445326Z  INFO text_generation_router_v3: backends/v3/src/lib.rs:127: Using backend V3
The repository for /home/data/models/baichuan2-7b contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//home/data/models/baichuan2-7b.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Add , ("trust_remote_code","True".to_string()) as AutoTokenizer.from_pretrained kwargs
text-generation-inference/router/src/server.rs
Lines L1624-L1625:

    let tokenizer: Option<Tokenizer> = tokenizer_filename.and_then(|filename| {
        use pyo3::prelude::*;
        let convert = pyo3::Python::with_gil(|py| -> PyResult<()> {
            let transformers = py.import_bound("transformers")?;
            let auto = transformers.getattr("AutoTokenizer")?;
            let from_pretrained = auto.getattr("from_pretrained")?;
            let args = (tokenizer_name.to_string(),);
            let kwargs = [
                (
                    "revision",
                    revision.clone().unwrap_or_else(|| "main".to_string()),
                ),
                // added by this workaround: forward trust_remote_code
                ("trust_remote_code", "True".to_string()),
            ]
            .into_py_dict_bound(py);
            let tokenizer = from_pretrained.call(args, Some(&kwargs))?;
            let save = tokenizer.getattr("save_pretrained")?;
            let args = ("out".to_string(),);
            save.call1(args)?;
            Ok(())
        });
        // ... rest of the original closure unchanged
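Note that "True" in the patch is a Python str, which from_pretrained merely treats as truthy. A cleaner variant, sketched below under the assumption of the same pyo3 bound API already used in this file, builds the kwargs as a PyDict so a genuine Python bool can be passed:

    // Hedged alternative sketch, not the actual TGI patch: build the kwargs as a
    // PyDict so trust_remote_code is a real bool rather than the string "True".
    use pyo3::types::PyDict;

    let kwargs = PyDict::new_bound(py);
    kwargs.set_item(
        "revision",
        revision.clone().unwrap_or_else(|| "main".to_string()),
    )?;
    kwargs.set_item("trust_remote_code", true)?;
    let tokenizer = from_pretrained.call(args, Some(&kwargs))?;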

With this change, it runs OK:

2024-10-15T08:44:56.377562Z  INFO text_generation_router_v3: backends/v3/src/lib.rs:102: Setting max batch total tokens to 25344
2024-10-15T08:44:56.377648Z  INFO text_generation_router_v3: backends/v3/src/lib.rs:127: Using backend V3
2024-10-15T08:45:01.690286Z  INFO text_generation_router::server: router/src/server.rs:1671: Using config Some(Baichuan)
2024-10-15T08:45:01.690369Z  WARN text_generation_router::server: router/src/server.rs:1716: no pipeline tag found for model /home/data/models/baichuan2-7b
2024-10-15T08:45:01.690379Z  WARN text_generation_router::server: router/src/server.rs:1818: Invalid hostname, defaulting to 0.0.0.0
2024-10-15T08:45:01.801597Z  INFO text_generation_router::server: router/src/server.rs:2211: Connected
2024-10-15T08:45:10.554155Z  INFO generate{parameters=GenerateParameters { best_of: None, temperature: None, repetition_penalty: None, frequency_penalty: None, top_k: None, top_p: None, typical_p: None, do_sample: false, max_new_tokens: Some(20), return_full_text: None, stop: [], truncate: None, watermark: false, details: false, decoder_input_details: false, seed: None, top_n_tokens: None, grammar: None, adapter_id: None } total_time="576.72759ms" validation_time="570.99µs" queue_time="91.36µs" inference_time="576.06533ms" time_per_token="28.803266ms" seed="None"}: text_generation_router::server: router/src/server.rs:402: Success

Expected behavior

The text-generation-launcher argument --trust-remote-code should be passed through to text-generation-router for constructing a fast tokenizer!
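
One possible shape for the fix, sketched only as an illustration (the flag wiring below is hypothetical; #2664 is the actual candidate fix): the launcher already parses --trust-remote-code, so the router could accept the same flag and thread it into the from_pretrained kwargs above.

    // Hypothetical sketch, not necessarily what #2664 does.
    // Router side: accept the flag (the router already parses its arguments with clap).
    #[derive(clap::Parser)]
    struct RouterArgs {
        /// Trust custom code shipped inside the model repository.
        #[clap(long, env)]
        trust_remote_code: bool,
        // ... existing router arguments elided
    }

    // Launcher side: append the flag when spawning the router process.
    if args.trust_remote_code {
        router_args.push("--trust-remote-code".to_string());
    }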
