The TEOChatlas dataset and external evaluation datasets are available for download here.
You can download all of the data using the following code:

```python
from datasets import load_dataset

# Optionally specify a cache directory if you have limited space in your
# home directory, or if you want to place the data somewhere else.
cache_dir = None

# Optionally specify a split if you only want to download a subset of the data.
# The splits are defined on the Hugging Face hub page for the dataset.
split = None

dataset = load_dataset("jirvin16/TEOChatlas", split=split, cache_dir=cache_dir, trust_remote_code=True)
```
This will download the data to the machine where the code is run. Running `load_dataset` again will not re-download the data unless the cache directory is changed. The training code uses `load_dataset` to load the data.
Navigate to the Video-LLaVA-Pretrain-7B model on the Hugging Face model hub and download the `mm_projector.bin` file. This file contains the weights for the Video-LLaVA projector, which will be used to initialize the TEOChat projector.
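If you prefer to script this step, the file can be fetched with `huggingface_hub`. The repo id below is an assumption based on the model name above — confirm it on the model's hub page before running:

```python
from huggingface_hub import hf_hub_download


def download_projector(cache_dir=None):
    """Fetch mm_projector.bin and return the local path to the cached file.

    The repo id is assumed from the model name (verify it on the Hugging
    Face hub). Requires network access on first call.
    """
    return hf_hub_download(
        repo_id="LanguageBind/Video-LLaVA-Pretrain-7B",  # assumed repo id
        filename="mm_projector.bin",
        cache_dir=cache_dir,
    )
```

The returned path is what you would pass to `--pretrain_mm_mlp_adapter` below.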
You need to make the following changes in order to train TEOChat:
- Set the `--pretrain_mm_mlp_adapter` to the path of the `mm_projector.bin` file you downloaded in step 1.
- Set the `--output_dir` to the directory where you want to save the model checkpoints and logs. The prefix should be `video-llava-7b-8bit-lora`, otherwise there may be issues evaluating the model.
- (Optional) Set the `--cache_dir` to the directory where you want to cache the pretrained models used for initialization (like Video-LLaVA).
- (Optional) Set the `--data_cache_dir` to the directory where you stored the TEOChatlas dataset if you specified a cache directory in the data preparation step.
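Put together, the relevant flags in `scripts/train_teochat.sh` might look like the following excerpt. All paths here are placeholders for illustration — substitute your own:

```shell
# Illustrative excerpt: only the flags discussed above, with placeholder paths.
    --pretrain_mm_mlp_adapter /path/to/mm_projector.bin \
    --output_dir ./checkpoints/video-llava-7b-8bit-lora-teochat \
    --cache_dir /path/to/model_cache \
    --data_cache_dir /path/to/teochatlas_cache \
```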
```shell
sh scripts/train_teochat.sh
```
Instructions for validating TEOChat will be provided here.
Instructions for fine-tuning TEOChat will be provided here.