Set up a token (or words) limit to be sent to the LLM #577

Gaket · 2024-07-31T20:12:10Z

Is your feature request related to a problem? Please describe.
I just ran into a problem when Vanna.ai tried to make a pretty big request to answer a single question:

Extracted SQL: SELECT BARBER_ID, COUNT(*) AS schedule_count
FROM SCHEDULES
GROUP BY BARBER_ID
Using model gpt-4o for **1,349,982.75** tokens (approx)

That's more than a million tokens!

Luckily, OpenAI rejected the request with a 429 error:
openai.RateLimitError: Error code: 429 - {'error': {'message': 'Request too large for gpt-4o in organization org-2.... on tokens per min (TPM): Limit 30000, Requested 1349985.

Describe the solution you'd like
I'd like to be able to pass a parameter specifying the max size in tokens that I'm comfortable with.

Describe alternatives you've considered
Setting up a limit on the OpenAI side, but I'm not an admin there.

Additional context
I tried to connect Vanna AI to our prod db, which has 100+ tables and a lot of data. I believe the error may have been caused by the "let Vanna ai send your data to LLM" flag.


zainhoda · 2024-08-06T00:55:17Z

This is likely due to the vn.generate_summary call. That's the only method that will send the entire dataframe to the LLM.

If you'd like to disable this functionality you can set summarization=False for the web app:
https://vanna.ai/docs/web-app/#vanna.flask.VannaFlaskApp

There probably does need to be some kind of option to limit the number of rows sent to the LLM for summarization.
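
Until such an option exists, one possible workaround is to trim the result rows yourself before they reach the summarization step. The following is only a rough sketch, not part of Vanna: `estimate_tokens`, `rows_within_budget`, and the 4-characters-per-token heuristic are all assumptions made for illustration, and the real token count depends on the model's tokenizer.

```python
from typing import Any, List, Mapping, Sequence

# Assumption: roughly 4 characters per token. This is a crude heuristic,
# not the tokenizer the model actually uses.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return an approximate token count for text (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def rows_within_budget(
    rows: Sequence[Mapping[str, Any]], max_tokens: int
) -> List[Mapping[str, Any]]:
    """Keep the leading rows whose combined estimated size fits max_tokens.

    Hypothetical helper: it stops at the first row that would exceed the
    budget, so the result is always a prefix of the input.
    """
    kept: List[Mapping[str, Any]] = []
    used = 0
    for row in rows:
        cost = estimate_tokens(repr(dict(row)))
        if used + cost > max_tokens:
            break
        kept.append(row)
        used += cost
    return kept


if __name__ == "__main__":
    # Made-up rows shaped like the query result from the report above.
    rows = [{"BARBER_ID": i, "schedule_count": i * 2} for i in range(1000)]
    kept = rows_within_budget(rows, max_tokens=100)
    print(f"kept {len(kept)} of {len(rows)} rows")
```

With a pandas dataframe, the same idea would presumably be to build the rows with `df.to_dict("records")` and then pass only `df.head(len(kept))` on to the summary call, so the prompt size stays bounded regardless of how many rows the query returns.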
