Welcome to the Qwen 2.5 Coder repository! This project demonstrates the capabilities of the Qwen 2.5 Coder model to automate the transformation of UI elements into structured HTML and CSS, leveraging the power of vision-language models (VLMs). With this tool, developers and designers can streamline the UI design process and rapidly prototype web components.
Qwen 2.5 Coder by Alibaba is a cutting-edge vision-language model that interprets visual inputs and translates them into code, helping convert UI layouts into HTML/CSS formats. It’s designed to understand complex visual features and generate accurate, ready-to-use code snippets. This repository provides a Streamlit application for interactive UI-to-code transformation.
- Automated Code Generation: Transform images of UI layouts into structured HTML/CSS with a few clicks.
- Flexible Model Options: Choose between multiple powerful models like Qwen 2.5, Pixtral, GPT-4o, and LLaMA for diverse coding and layout needs.
- Simple Deployment: Run the code transformation tool locally with minimal setup.
- Streamlit Application: A web app interface for uploading UI images and generating HTML/CSS.
- Code Scripts: Functions to encode images, call the model API, and process code outputs.
- HTML/CSS Extraction: Automated extraction of code blocks for easy integration into your projects.
git clone https://github.com/aryankargwal/genai-tutorials
cd genai-tutorials
cd qwen-coder
pip install -r requirements.txt
Ensure you have an API key from Tune Studio to access the Qwen model.
To start the interactive UI for uploading images and generating HTML/CSS:
streamlit run app.py
- Image Upload: Upload any UI image, such as a wireframe or a screenshot of a software layout.
- Model Selection: Choose from Qwen, GPT-4o, Pixtral, and LLaMA models for generating code based on your specific requirements.
- Generate HTML/CSS: Get a detailed description and HTML/CSS code output for the UI elements in your image.
- Multimodal Parsing: Qwen and other models accurately interpret both visual and text elements.
- User-Friendly Interface: Easily upload, process, and view outputs in a streamlined app.
- Versatile Model Choices: Experiment with different models to get the best results for diverse UI scenarios.
- Model Fine-Tuning: Tutorials for optimizing Qwen 2.5 on unique data sets.
- Extended Use Cases: Explore applications in automated prototyping, document parsing, and screen parsing.
This project is licensed under the MIT License. See the LICENSE file for more details.