Fix tiny-random-llava-next in VLM Pipeline #1660

Open
wants to merge 5 commits into
base: master
from


yatarkan
Contributor

@yatarkan yatarkan commented Jan 31, 2025

Ticket: CVS-160437

LLaVA models store the patch size parameter in config.json, not in preprocessor_config.json like other VLM models. In GenAI, the processor config sets a default patch size of 14.
However, tiny versions of LLaVA models use a different patch size (2), so the default does not match them.
This PR propagates the patch size value from the VLM config to the vision encoder.
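
The override described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ProcessorConfigSketch`, `VLMConfigSketch`, and `resolve_processor_config` are hypothetical names, and only the default of 14 and the `vision_config_patch_size` field come from the description and diff.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical stand-in for the processor config; only the field relevant
// to this PR is modelled. 14 is the default named in the description.
struct ProcessorConfigSketch {
    std::size_t patch_size = 14;
};

// Hypothetical stand-in for the VLM config. The optional is empty when
// config.json has no vision_config.patch_size entry.
struct VLMConfigSketch {
    std::optional<std::size_t> vision_config_patch_size;
};

// Returns the processor config the vision encoder should use: the value
// from the VLM config wins when present, otherwise the default is kept.
ProcessorConfigSketch resolve_processor_config(ProcessorConfigSketch processor,
                                               const VLMConfigSketch& vlm) {
    if (vlm.vision_config_patch_size.has_value()) {
        processor.patch_size = *vlm.vision_config_patch_size;
    }
    // Sanity check (an assumption of this sketch): a zero patch size
    // would make patch extraction meaningless.
    assert(processor.patch_size > 0);
    return processor;
}
```

With this shape, a tiny llava model whose config.json carries a patch size of 2 resolves to 2, while a model without the entry keeps the default of 14.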

@yatarkan added the "bug" (Something isn't working) label on Jan 31, 2025
@yatarkan added this to the 2025.1 milestone on Jan 31, 2025
@github-actions bot added the "category: visual language" (Visual language pipeline) label on Jan 31, 2025
@yatarkan changed the title from "Fix tiny-random-llava-next fails in VLM Pipeline" to "Fix tiny-random-llava-next in VLM Pipeline" on Jan 31, 2025
@@ -19,6 +19,9 @@ ov::genai::VLMConfig::VLMConfig(const std::filesystem::path& json_path) {

    // Setting llava_next specific config params
    read_json_param(parsed, "image_newline", image_newline);
    // llava models keep patch_size under vision_config in config.json
    if (parsed.contains("vision_config")) {
        read_json_param(parsed.at("vision_config"), "patch_size", vision_config_patch_size);
    }
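
To illustrate why the read is guarded, here is a small self-contained sketch of the same pattern. It does not use the real JSON type: `FlatObject`, `ParsedConfigSketch`, `read_param_sketch`, and `read_vision_patch_size` are hypothetical stand-ins, and the "overwrite only when the key exists" behaviour of the helper is an assumption about `read_json_param`, not something confirmed by this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a flat JSON object holding integer fields.
using FlatObject = std::map<std::string, std::size_t>;

// Hypothetical stand-in for the parsed config.json: the nested
// vision_config object may or may not be present.
struct ParsedConfigSketch {
    std::optional<FlatObject> vision_config;
};

// Assumed helper semantics: overwrite `param` only when `name` exists,
// otherwise leave the caller's value untouched.
void read_param_sketch(const FlatObject& object, const std::string& name,
                       std::size_t& param) {
    assert(!name.empty());  // an empty key is a caller error in this sketch
    const auto it = object.find(name);
    if (it != object.end()) {
        param = it->second;
    }
}

// Mirrors the guarded read in the diff: the nested object is only
// accessed after checking that it exists.
void read_vision_patch_size(const ParsedConfigSketch& parsed,
                            std::size_t& patch_size) {
    if (parsed.vision_config.has_value()) {
        read_param_sketch(*parsed.vision_config, "patch_size", patch_size);
    }
}
```

Under these assumptions, a config without `vision_config` leaves the previous value in place, and a config with `vision_config.patch_size` set to 2 replaces it.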
Collaborator

ProcessorConfig already has a patch_size field. Why doesn't ProcessorConfig check for vision_config in the JSON?

Labels
bug Something isn't working category: visual language Visual language pipeline
2 participants