
[BUG] Azure.AI.OpenAI: chatClient.CompleteChatAsync(...) keeps waiting indefinitely after the TPM token rate limit is triggered #47640

Open
qideqian opened this issue Dec 23, 2024 · 6 comments
Labels
Client — This issue points to a problem in the data-plane of the library.
customer-reported — Issues that are reported by GitHub users external to the Azure organization.
needs-team-attention — Workflow: This issue needs attention from the Azure service team or SDK team.
OpenAI
question — The issue doesn't require a change to the product in order to be resolved. Most issues start as that.
Service Attention — Workflow: This issue is owned by the Azure service team.

Comments

@qideqian

Library name and version

Azure.AI.OpenAI 2.1.0

Describe the bug

ClientResult<ChatCompletion> completion = await chatClient.CompleteChatAsync(chatMessages, new ChatCompletionOptions(), cancellationTokenSource.Token);

The call keeps waiting indefinitely: no exception is thrown and no result is returned.

Expected behavior

When the TPM rate limit is triggered, the call should fail promptly with the rate-limit exception rather than wait indefinitely with no exception thrown and no result returned.

Actual behavior

The call keeps waiting; no exception is thrown and no result is returned.
Error returned by the API (from the Microsoft support case):

{
  "error": {
    "code": "429",
    "message": "Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 49 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit."
  }
}

Reproduction Steps

ClientResult<ChatCompletion> completion = await chatClient.CompleteChatAsync(
    chatMessages,
    new ChatCompletionOptions(),
    cancellationTokenSource.Token);
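A fuller sketch of the setup described in this report; the endpoint, deployment name, message content, and the 90-second cancellation window are assumptions based on this thread, not part of the original repro:

```csharp
using System;
using System.ClientModel;
using System.Threading;
using Azure.AI.OpenAI;
using OpenAI.Chat;

// Hypothetical resource and deployment names, for illustration only.
AzureOpenAIClient azureClient = new(
    new Uri("https://<resource>.openai.azure.com/"),
    new ApiKeyCredential("<api-key>"));
ChatClient chatClient = azureClient.GetChatClient("<deployment-name>");

// The reporter mentions a 90-second timeout task elsewhere in this thread;
// modeled here as a cancellation token that expires after 90 seconds.
using var cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromSeconds(90));

var chatMessages = new ChatMessage[] { new UserChatMessage("Hello") };

ClientResult<ChatCompletion> completion = await chatClient.CompleteChatAsync(
    chatMessages,
    new ChatCompletionOptions(),
    cancellationTokenSource.Token);
```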

Environment

No response

@github-actions bot added the Client, customer-reported, needs-team-attention, OpenAI, question, and Service Attention labels on Dec 23, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jpalvarezl @ralph-msft @trrwilson.

@ArthurMa1978
Member

Thanks for your feedback @qideqian. @chunyu3, please look into this issue.

@chunyu3
Member

chunyu3 commented Dec 25, 2024

@qideqian When the token limit is triggered, the service returns a 429 response. The .NET client treats 429 as a retriable response and resends the request, retrying 3 times by default.

Could you please help check whether the retries are what cause the delay and hang?
We can skip retries by providing a retry policy in the client options as follows, and see whether the request returns as expected. Thanks.

AzureOpenAIClient azureClient = new(
    new Uri("<endpoint>"),
    credential,
    new AzureOpenAIClientOptions()
    {
        RetryPolicy = new ClientRetryPolicy(0) // disable retries (default is 3)
    });
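With retries disabled as above, the 429 should surface on the first attempt. A sketch of catching it, assuming the chatClient and chatMessages from the repro above (ClientResultException is the System.ClientModel exception type thrown on non-success responses):

```csharp
using System;
using System.ClientModel;
using OpenAI.Chat;

try
{
    ClientResult<ChatCompletion> completion =
        await chatClient.CompleteChatAsync(chatMessages);
}
catch (ClientResultException ex) when (ex.Status == 429)
{
    // Rate limited: the service response body includes a retry hint
    // ("Please retry after 49 seconds" in the error shown above).
    Console.WriteLine($"Rate limited (HTTP {ex.Status}): {ex.Message}");
}
```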

@qideqian
Author

qideqian commented Dec 25, 2024

@chunyu3 Yes, after adding RetryPolicy = new ClientRetryPolicy(0) the 429 exception is thrown and can be caught directly. Meanwhile, I had set up a 90 s timeout task; without retryPolicy = 0, the 90 s timeout task fires first and the client never gets any result. What is the current 3-retry policy? The total time should not exceed 90 s — is there a problem with my setup here?

@chunyu3
Member

chunyu3 commented Dec 25, 2024

@qideqian You can set a timeout when creating the client; it applies to each round of the HTTP request.

AzureOpenAIClient azureClient = new(
    new Uri("<endpoint>"),
    credential,
    new AzureOpenAIClientOptions()
    {
        NetworkTimeout = TimeSpan.FromSeconds(90) // timeout per HTTP attempt
    });
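If retries stay enabled, the two settings interact: NetworkTimeout bounds each individual attempt, so the worst-case wall time is roughly (1 + maxRetries) attempts plus the backoff delays between them (and a 429 response may carry a Retry-After hint — 49 seconds in the error above). A sketch combining both settings; the specific values are illustrative assumptions, not recommendations from this thread:

```csharp
using System;
using System.ClientModel.Primitives;
using Azure.AI.OpenAI;

AzureOpenAIClient azureClient = new(
    new Uri("<endpoint>"),
    credential,
    new AzureOpenAIClientOptions()
    {
        // Bound each individual HTTP attempt...
        NetworkTimeout = TimeSpan.FromSeconds(30),
        // ...and cap retries so the total wait stays predictable:
        // worst case is roughly 2 attempts x 30 s plus backoff in between.
        RetryPolicy = new ClientRetryPolicy(maxRetries: 1)
    });
```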

@qideqian
Author

@chunyu3 OK, thank you. I will try setting the timeout the way you described. Regarding the indefinite wait: when a retry policy is in effect, could an exception be thrown after the retries are exhausted, or could there be a default expiration time? Otherwise it is not easy to find out what the problem is.
