response_format with regex does not seem to work #2423
Comments
Hi @aymeric-roucher, I've run some tests with the reproducible example you shared. I think this is a cache issue that has to be fixed either in the Inference API or in TGI directly. The only difference between I also tested the failing case with I then tried to reproduce the error by sending a random string twice: the first time without a regex (to warm up the cache) and the second time with the regex (to test whether it would reuse the cache). I did not manage to reproduce the bug with this technique. I don't know what is specific with the
The cache key is computed by hashing the entire input (including the parameters, and therefore the regex), so this is unlikely to be a cache issue. I think it's possible that some regexes are ignored in some circumstances (basically to avoid a critical failure).
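To illustrate the point about the cache key, here is a minimal sketch of hashing the whole request, parameters included. It is not the actual Inference API or TGI code: the function name `cache_key`, the payload layout, and the choice of SHA-256 over a JSON dump are all assumptions made for illustration.

```python
import hashlib
import json


def cache_key(inputs, parameters):
    """Hypothetical cache key: hash of the full request, parameters included."""
    # sort_keys makes the serialization independent of dict ordering
    payload = json.dumps({"inputs": inputs, "parameters": parameters}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same prompt, with and without the regex constraint
plain = cache_key("ok", {})
with_regex = cache_key(
    "ok",
    {"response_format": {"type": "regex", "value": ".+?\n\nCode:+?"}},
)

print(plain != with_regex)  # True: the regex changes the key
```

Under a scheme like this, a request carrying a regex cannot be served from the cache entry of the same prompt sent without one, which is why warming up the cache without the regex would not be expected to reproduce the bug.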
@Narsil I've been able to reproduce it with cache disabled by repeating the exact same request until it fails:

    from huggingface_hub import InferenceClient

    client = InferenceClient("meta-llama/Meta-Llama-3.1-8B-Instruct", headers={"x-use-cache": "0"})
    for i in range(50):
        output = client.chat_completion(
            [{"role": "user", "content": "ok"}],
            response_format={"type": "regex", "value": ".+?\n\nCode:+?"},
        )
        answer = output.choices[0].message.content
        if "Code:" in answer:
            print(f"Iteration {i}: OK")
        else:
            print(f"Iteration {i}: NOT OK\n{answer}")
            break

which outputs:
It does not reproduce the error 100% of the time, but it still happens once every few requests.
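Since the failure is intermittent, a client-side check with a retry is one possible stopgap until the root cause is fixed. The sketch below is only an illustration under stated assumptions: `generate_until_match` and `fake_generate` are hypothetical names, the `generate` callable stands in for a real `chat_completion` call, and Python's `re` module with `re.DOTALL` is assumed to be close enough to the server-side regex engine, which may not hold for every pattern.

```python
import re


def generate_until_match(generate, pattern, max_attempts=5):
    """Call generate() until its output fully matches pattern.

    Hypothetical helper: generate is any zero-argument callable returning text.
    Returns the matching text and the number of attempts it took.
    """
    # DOTALL lets "." match newlines, since answers are usually multi-line
    compiled = re.compile(pattern, re.DOTALL)
    for attempt in range(1, max_attempts + 1):
        text = generate()
        if compiled.fullmatch(text):
            return text, attempt
    raise ValueError(f"no output matched the regex after {max_attempts} attempts")


# Stand-in for the model: the first reply ignores the regex, the second respects it
replies = iter(["ok, noted.", "Thought: nothing to do.\n\nCode:"])


def fake_generate():
    return next(replies)


answer, attempts = generate_until_match(fake_generate, ".+?\n\nCode:+?")
print(attempts)  # 2
```

With a real client, `generate` could wrap the `chat_completion` call from the reproduction above and return `output.choices[0].message.content`. This only hides the symptom and costs extra requests.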
Calling in @drbh on this. I know it can happen, but I didn't expect 6 iterations would be enough to trigger it.
Describe the bug
When passing a `response_format` of type `regex` to `chat_completion`, the output does not always respect the format.
Reproduction
This does not follow the regex:
But going through OpenAI Messages API does work:
Logs
No response
System info