You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Tabby API currently only handles text. Many vision models have released. Exllama dev supports qwen2-vl
Solution
Support vision through openAI api. Hopefully in text completion too.
Alternatives
No response
Explanation
More multi-modal models will be created over time. Would be cool to have a fully integrated experience. i.e Creating an image with a model and having it iteratively use the image gen tool after seeing what it got back.
Examples
No response
Additional context
No response
Acknowledgements
I have looked for similar requests before submitting this one.
I understand that the developers have lives and my issue will be answered when possible.
I understand the developers of this program are human, and I will make my requests politely.
The text was updated successfully, but these errors were encountered:
I understand the excitement over vision models and would like to implement support once there's a proper pipeline on how to do so via exllamav2.
According to turbo, the current dev branch is experimental and only works with the image part of llava. There still needs to be support for Qwen-2 VL and other models which is most likely being worked on at this time:
It should work for Qwen2-VL as well, although that will require some updates to the RoPE since they have multidimensional positional embeddings for images. Even a time dimension for video, just to make it that much harder. :P
I'd keep an eye on this issue turboderp/exllamav2#658 for the time being as that's a blocking issue.
Problem
Tabby API currently only handles text. Many vision models have released. Exllama dev supports qwen2-vl
Solution
Support vision through openAI api. Hopefully in text completion too.
Alternatives
No response
Explanation
More multi-modal models will be created over time. Would be cool to have a fully integrated experience. i.e Creating an image with a model and having it iteratively use the image gen tool after seeing what it got back.
Examples
No response
Additional context
No response
Acknowledgements
The text was updated successfully, but these errors were encountered: