
[Question]: How to pass images into llama index query engine. #17309

Open
AyushParikh opened this issue Dec 18, 2024 · 2 comments
Labels
question Further information is requested

Comments

@AyushParikh

Question Validation

  • I have searched both the documentation and discord for an answer.

Question

Hey, I want to use multi-modal input with a LlamaIndex query. How can I pass images into my query to get a response? I checked the documentation but couldn't find a clear answer.

    query_engine = index.as_query_engine()
    response = query_engine.query(query)
@AyushParikh added the question (Further information is requested) label Dec 18, 2024

dosubot bot commented Dec 18, 2024

Hey @AyushParikh, I'm here to help you with any bugs, questions, or contributions while you wait for a human maintainer. Let's tackle this challenge together!

To pass images into the Llama Index query engine for multi-modal functionality and get a response, you can use the image_query method. Here's a basic example of how you can achieve this:

import os

from llama_index.multi_modal_llms.openai import OpenAIMultiModal
from llama_index.core import PromptTemplate

# Read the API key from the environment
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]

# Define the prompt template
qa_tmpl_str = (
    "Given the images provided, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)
qa_tmpl = PromptTemplate(qa_tmpl_str)

# Initialize the OpenAI multi-modal model
openai_mm_llm = OpenAIMultiModal(
    model="gpt-4o", api_key=OPENAI_API_KEY, max_new_tokens=1500
)

# Create the query engine; `index` is assumed to be a multi-modal index
# (e.g. a MultiModalVectorStoreIndex) built beforehand
query_engine = index.as_query_engine(
    llm=openai_mm_llm, image_qa_template=qa_tmpl
)

# Define the query string
query_str = "Tell me more about the relationship between those paintings."

# Perform the image query
response = query_engine.image_query("./mixed_wiki/2.jpg", query_str)

# Print the response
print(response)

This setup allows you to provide an image path and a prompt string to query the engine. Make sure to replace "./mixed_wiki/2.jpg" with the path to your image and adjust the query_str as needed for your specific query [1].


@logan-markewich
Collaborator

Images cannot be inputs to query engines, at least not directly

There is a multi-modal retriever that can take in a query and return both text and images. From there, you can call an LLM to synthesize a response

Feel free to check out the many multimodal pages in the docs

https://docs.llamaindex.ai/en/stable/use_cases/multimodal/
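The retrieve-then-synthesize pattern described above can be sketched as follows. This is a minimal illustration, not the real LlamaIndex API: `FakeNode`, `FakeRetriever`, and `split_nodes` are hypothetical stand-ins for the retriever you would get from `index.as_retriever()` on a multi-modal index; the actual calls are noted in comments and should be checked against the docs linked above.

```python
class FakeNode:
    """Stand-in for a retrieved node; real nodes are TextNode/ImageNode."""
    def __init__(self, text=None, image_path=None):
        self.text = text
        self.image_path = image_path

class FakeRetriever:
    """Stand-in for index.as_retriever(...) on a multi-modal index."""
    def retrieve(self, query):
        # A real retriever would embed the query and return the
        # top-k matching text and image nodes from the index.
        return [
            FakeNode(text="caption of painting A"),
            FakeNode(image_path="./sample_images/painting_a.jpg"),
        ]

def split_nodes(nodes):
    """Separate retrieved text from retrieved images before synthesis."""
    texts = [n.text for n in nodes if n.text is not None]
    images = [n.image_path for n in nodes if n.image_path is not None]
    return texts, images

retriever = FakeRetriever()
nodes = retriever.retrieve("Tell me about the paintings")
texts, images = split_nodes(nodes)
# With the real API you would now pass the text context and images to a
# multi-modal LLM to synthesize the answer, e.g. something along the lines of
# openai_mm_llm.complete(prompt=..., image_documents=...)
print(texts)   # ['caption of painting A']
print(images)  # ['./sample_images/painting_a.jpg']
```

The key design point is that retrieval and synthesis are separate steps: the retriever only finds relevant text and image nodes, and you decide how to hand them to the LLM.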
