
You Only Click Once (YOCO): do anything with SOTA AI from a prompt. Automated AI tools, vision-language (VL) large-model fine-tuning, and projects.


hukaick/Prompt-Can-Anything

 
 


Prompt-Can-Anything

A fully automated toolkit: you just provide a prompt. You only click once (YOCO)! Do anything with SOTA models, a prompt, and creativity.

Motivation

Current: Build a fully automated AI tool for engineering and research. Creating data engines may require the use of more CLIP models.

Target: Generate high-quality annotated data and use it to train our models.

In short, it is a tool for prompting anything: You Only Click Once (YOCO).

  1. Auto-label tool, current structure (YOCO)

    In addition, we will introduce video, audio, and 3D annotation in the future.

structure

  2. Semi-automatic interaction UI tool (coming soon)

Features

  • 🔥Data Engine

    Provides fully automated data annotation with one-click export (detection, segmentation, text, and NeRF reconstruction results), refined through engineering optimization. By combining related models such as Stable Diffusion and GPT, we can create more data sources for downstream tasks.

  • Extended one-click annotation and training for third-party projects, such as YOLO and LoRA models. (coming soon)

  • Accelerated processing of videos and datasets (coming soon)
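
As an illustration of what a one-click export of detection results could produce, here is a minimal sketch that writes one box as a YOLO-style text line and as a Pascal VOC-style XML document. The README does not specify the export formats, so both formats, and the helper names `to_yolo_line` and `to_voc_xml`, are assumptions for illustration rather than the repository's actual code.

```python
import xml.etree.ElementTree as ET


def to_yolo_line(class_id, box_xyxy, img_w, img_h):
    """Format one pixel-space (x1, y1, x2, y2) box as a YOLO-style line.

    YOLO-style labels store the class id followed by the box center and
    size, all normalized to [0, 1] by the image width and height.
    """
    x1, y1, x2, y2 = box_xyxy
    cx = (x1 + x2) / 2 / img_w
    cy = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def to_voc_xml(filename, img_w, img_h, objects):
    """Build a Pascal VOC-style XML string.

    objects: list of (label, (x1, y1, x2, y2)) in pixel coordinates.
    Only the commonly used fields are written (assumed subset).
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(img_w)
    ET.SubElement(size, "height").text = str(img_h)
    for label, (x1, y1, x2, y2) in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bnd = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), (x1, y1, x2, y2)):
            ET.SubElement(bnd, tag).text = str(int(round(value)))
    return ET.tostring(root, encoding="unicode")
```

Keeping the two writers separate means the same detection list can feed either format, which fits a tool that targets several downstream training projects.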

⭐ Research🚀 Project🔥 Inspiration (in preparation)
  At the research level, zero-shot contrastive learning is a research trend. We hope to understand the model design details of the projects we apply as fully as possible, so that we can combine text, images, and audio to design a strongly aligned backbone.
  At the project level, TensorRT acceleration of the base models improves efficiency.

⭐[news list]

-【2023/5/4】   Added semantic segmentation labels; added args (--color-flag, --save-mask)

-【2023/4/26】  YOCO, automatic annotation tools: committed preliminary code. For an input image or folder, you can obtain detection, segmentation, and text annotation results; the ChatGPT API is optional.
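
The idea behind --color-flag (same category, same color in the semantic segmentation mask) can be sketched in plain Python. This is an illustration under stated assumptions, not the repository's implementation: the hash-based color choice and the helper names `category_color` and `colorize` are hypothetical.

```python
import hashlib


def category_color(label):
    """Map a category name to a stable RGB tuple (same name, same color)."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return (digest[0], digest[1], digest[2])


def colorize(instance_masks, height, width, background=(0, 0, 0)):
    """Paint binary instance masks onto one RGB canvas, colored by category.

    instance_masks: list of (label, mask), where mask is a height x width
    nested list of 0/1 values. Later masks overwrite earlier ones.
    """
    canvas = [[background for _ in range(width)] for _ in range(height)]
    for label, mask in instance_masks:
        color = category_color(label)
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    canvas[y][x] = color
    return canvas
```

Deriving the color from the label rather than from the instance index keeps colors consistent across images, which is what a semantic (as opposed to instance) mask needs.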

Preliminary-Works

  • Segment Anything: Strong segmentation model, but it needs prompts (such as boxes or points) to generate masks.

  • Grounding DINO: Strong zero-shot detector capable of generating high-quality boxes and labels from free-form text.

  • Stable-Diffusion: Very strong text-to-image diffusion model.

  • Tag2Text: Efficient and controllable vision-language model that can simultaneously output high-quality image captions and image tags.

  • LaMa: Resolution-robust large mask inpainting with Fourier convolutions.
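
Because Segment Anything needs box or point prompts and Grounding DINO produces boxes from text, the two compose naturally: detector boxes become segmentation prompts. The sketch below shows only the glue step, assuming the detector returns boxes as normalized (cx, cy, w, h) and the segmenter expects pixel (x1, y1, x2, y2). The function names, the detection tuple layout, and the default threshold are hypothetical; no model is loaded.

```python
def cxcywh_norm_to_xyxy_pixels(box, img_w, img_h):
    """Convert a normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2).

    The result is clamped to the image bounds so it is a valid prompt.
    """
    cx, cy, w, h = box
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    x1, x2 = max(0.0, x1), min(float(img_w), x2)
    y1, y2 = max(0.0, y1), min(float(img_h), y2)
    return (x1, y1, x2, y2)


def detections_to_box_prompts(detections, img_w, img_h, box_threshold=0.3):
    """Keep confident detections and return (label, pixel_box) prompts.

    detections: list of (label, score, normalized_cxcywh_box).
    box_threshold is an assumed default, not a value from this project.
    """
    prompts = []
    for label, score, box in detections:
        if score >= box_threshold:
            prompts.append((label, cxcywh_norm_to_xyxy_pixels(box, img_w, img_h)))
    return prompts
```

Each returned pixel box could then be passed to the segmentation model as a box prompt, with the label carried along for the final annotation.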

🛠️ YOCO: Quick Start

First, make sure you have a basic GPU deep learning environment.

(Linux is recommended. Windows may have problems compiling the Grounding DINO deformable-transformer operator; see Grounding DINO.)

git clone https://github.com/positive666/Prompt-Can-Anything
cd Prompt-Can-Anything

Install environment:

pip install -e .

Install diffusers (optional):

pip install --upgrade diffusers[torch]

If other packages turn out to be missing, install them with "pip install <your missing packages>".

Run

  1. Download the model weights

    #  name              backbone   Data                                                Checkpoint                                     model-config
    1  Tag2Text-Swin     Swin-Base  COCO, VG, SBU, CC-3M, CC-12M                        Download link                                  -
    2  Segment-anything  vit        -                                                   Download link | Download link | Download link  -
    3  Lama              -          -                                                   Download link                                  -
    4  GroundingDINO-T   Swin-T     O365,GoldG,Cap4M                                    Github link | HF link                          link
    5  GroundingDINO-B   Swin-B     COCO,O365,GoldG,Cap4M,OpenImage,ODinW-35,RefCOCO    Github link | HF link                          link
  2. Set the config file and args in utils/conf.py: add the paths of your downloaded weights to "MODEL_xxxx_PATH". If you need ChatGPT, also configure "PROXIES" and "API_KEY".
  3. Run the demo
    "--tag2text": Generates image tags; you can use ChatGPT to merge or filter the words.
    "--input_prompt": Specifies the detection target nouns you are interested in; when it is set, you can turn off Tag2Text.
    "--color-flag": Gives all masks of the same category the same color in the semantic segmentation output.
python demo.py  --source <data path>  --save-txt  --save-mask --save-xml  --save_caption 
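
Before running the demo, it can help to confirm that every weight path set in utils/conf.py points to a real file. The helper below is a minimal sketch, assuming the config exposes the paths as module-level variables that follow the "MODEL_xxxx_PATH" naming pattern; `find_missing_weights` is a hypothetical name and is not part of the repository.

```python
import os


def find_missing_weights(config):
    """Return the names of MODEL_*_PATH entries whose file does not exist.

    config: a mapping of setting names to values, for example the
    namespace of the config module. Other settings are ignored.
    """
    missing = []
    for name, value in config.items():
        if name.startswith("MODEL_") and name.endswith("_PATH"):
            if not isinstance(value, str) or not os.path.isfile(value):
                missing.append(name)
    return sorted(missing)


# Possible usage from the repository root (assumes utils/conf.py is importable):
#   import utils.conf as conf
#   print(find_missing_weights(vars(conf)))
```

An empty list means all configured weight files were found; any names returned are the entries to fix before running demo.py.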

🏃Demo

image-20230427093103453

image-20230508075845259

🔨To Do list

  • Release demo and code (within 2 days).
  • Web UI demo
  • Support video and ChatGPT; add inpainting model demo
  • Add 3D NeRF demo
  • Fine-tune SAM and Grounding DINO??
  • Release training datasets.

💘 Acknowledgements

