I tried to modify the code in https://github.com/huggingface/candle/blob/main/candle-examples/examples/llama/main.rs to become a chatbot where each new prompt considers the history of all previous prompts. This is my code:
When I run, I get this error:
The error happens the second time I call `Chat::run` in `main`, and is thrown from this statement. The first time I run the chat in `main`, the shape of `input` is `[1, 5]`. After producing an output token, the next shape of `input` is `[1, 1]`, since I use key-value caching.

When I later enter a new prompt and run the chat, the `input` shape is `[1, 3]` (which includes the EOS token from the previous run). The error disappears if I drop some tokens so that the shape becomes `[1, 1]`. Is there something that requires the shape to be `[1, 1]` when key-value caching is used?
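For illustration, here is a minimal, self-contained sketch (plain Rust, deliberately *not* the candle API; `Chat`, `next_input`, and `index_pos` are hypothetical names) of the bookkeeping a multi-turn chat loop needs with a key-value cache: on each forward pass, feed only the tokens the cache has not yet seen, together with the position offset at which they start. Under this scheme the input length after the first pass is whatever is new, not necessarily 1.

```rust
// Hypothetical sketch, not the candle API: track how many tokens the
// key-value cache already holds, so each forward pass feeds only the
// tokens the cache has not seen yet, at the correct position offset.
struct Chat {
    cached_len: usize, // number of tokens already in the KV cache
    history: Vec<u32>, // full token history across all prompts
}

impl Chat {
    fn new() -> Self {
        Chat { cached_len: 0, history: Vec::new() }
    }

    /// Returns (tokens_to_feed, index_pos): the slice of not-yet-cached
    /// tokens and the position at which those tokens start.
    fn next_input(&mut self, new_tokens: &[u32]) -> (Vec<u32>, usize) {
        self.history.extend_from_slice(new_tokens);
        let index_pos = self.cached_len;
        let to_feed = self.history[self.cached_len..].to_vec();
        self.cached_len = self.history.len();
        (to_feed, index_pos)
    }
}

fn main() {
    let mut chat = Chat::new();

    // First prompt: five tokens, nothing cached yet -> feed 5 tokens at pos 0
    // (the [1, 5] case from the question).
    let (feed, pos) = chat.next_input(&[1, 2, 3, 4, 5]);
    assert_eq!((feed.len(), pos), (5, 0));

    // One generated token -> feed 1 token at pos 5 (the [1, 1] case).
    let (feed, pos) = chat.next_input(&[6]);
    assert_eq!((feed.len(), pos), (1, 5));

    // Second prompt of three tokens -> feed all 3 at pos 6
    // (the [1, 3] case): the input length varies, but positions line up.
    let (feed, pos) = chat.next_input(&[7, 8, 9]);
    assert_eq!((feed.len(), pos), (3, 6));

    println!("ok");
}
```

The point of the sketch is only that the cached length and the position offset must stay consistent across turns; whether a given model implementation accepts multi-token inputs after the first pass depends on how its attention mask and rotary embeddings are built.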