English | 简体中文

Semantic segmentation toolkit based on Visual Transformers

Semantic segmentation assigns each pixel in an image to a semantic category, covering both objects (e.g., bicycle, car, person) and stuff (e.g., road, bench, sky).
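As a minimal illustration of per-pixel classification (not code from this toolkit), the sketch below turns per-class score maps into a label map by picking the highest-scoring class at each pixel. The class list, the scores, and the function name `label_map` are all made up for the example.

```python
from typing import List

# Hypothetical class list for illustration only; real datasets define their own.
CLASSES = ["road", "car", "sky"]


def label_map(scores: List[List[List[float]]]) -> List[List[int]]:
    """Turn per-class score maps of shape [C][H][W] into an [H][W] map of class indices."""
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# A 2x2 "image" with one score map per class (made-up numbers).
scores = [
    [[0.90, 0.20], [0.80, 0.10]],  # road
    [[0.05, 0.70], [0.10, 0.20]],  # car
    [[0.05, 0.10], [0.10, 0.70]],  # sky
]
labels = label_map(scores)
named = [[CLASSES[c] for c in row] for row in labels]
print(named)
```

A segmentation model produces such score maps at full image resolution; the argmax step is the same.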

Environment

This code is developed under the following configurations:

Hardware: 1/2/4/8 GPUs for training and testing

Software: CentOS 6.10, CUDA=10.2, Python=3.8, Paddle=2.1.0

Installation

Create a conda virtual environment and activate it.

conda create -n paddlevit python=3.8
conda activate paddlevit

Install PaddlePaddle following the official instructions, e.g.,

conda install paddlepaddle-gpu==2.1.0 cudatoolkit=10.2 --channel https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/

Install PaddleViT

git clone https://github.com/BR-IDL/PaddleViT.git
cd PaddleViT/semantic_segmentation
pip3 install -r requirements.txt
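After installation, it can help to confirm that PaddlePaddle is visible in the active environment. The following is a minimal sketch, not part of this repository: the helper name `installed_paddle_version` is hypothetical, and it assumes the package is registered under the distribution name `paddlepaddle-gpu` or `paddlepaddle`. `paddle.utils.run_check()` is Paddle's own installation self-test.

```python
from importlib import metadata
from typing import Optional


def installed_paddle_version() -> Optional[str]:
    """Return the installed PaddlePaddle version string, or None if it is not found."""
    # Assumption: the GPU and CPU builds use these two distribution names.
    for dist in ("paddlepaddle-gpu", "paddlepaddle"):
        try:
            return metadata.version(dist)
        except metadata.PackageNotFoundError:
            continue
    return None


def main() -> None:
    version = installed_paddle_version()
    if version is None:
        print("PaddlePaddle was not found in this environment.")
        return
    print(f"Found PaddlePaddle {version}")
    import paddle  # imported lazily so the check above works without Paddle

    paddle.utils.run_check()  # Paddle's built-in installation self-test


if __name__ == "__main__":
    main()
```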

Demo

We provide a demo script demo.py. This script performs inference on single images. You can put the input images in ./demo/img.

cd demo
CUDA_VISIBLE_DEVICES=0 python3 demo.py \
    --config ${CONFIG_FILE} \
    --model_path ${MODEL_PATH} \
    --pretrained_backbone ${PRETRAINED_BACKBONE} \
    --img_dir ${IMAGE_DIRECTORY} \
    --results_dir ${RESULT_DIRECTORY}

Examples:

cd demo
CUDA_VISIBLE_DEVICES=0 python3 demo.py \
    --config ../configs/setr/SETR_PUP_Large_768x768_80k_cityscapes_bs_8.yaml \
    --model_path ../pretrain_models/setr/SETR_PUP_cityscapes_b8_80k.pdparams \
    --pretrained_backbone ../pretrain_models/backbones/vit_large_patch16_224.pdparams \
    --img_dir ./img/ \
    --results_dir ./results/

Quick start: training and testing models

1. Preparing data

Pascal-Context dataset

Download the Pascal-Context dataset: get PASCAL VOC2010 from http://host.robots.ox.ac.uk/pascal/VOC/voc2010/VOCtrainval_03-May-2010.tar and the annotation file from https://codalabuser.blob.core.windows.net/public/trainval_merged.json. Then run the script voc2010_to_pascalcontext.py to generate "pascal_context/SegmentationClassContext". The dataset should have this basic structure:

pascal_context
|-- Annotations
|-- ImageSets
|-- JPEGImages
|-- SegmentationClass
|-- SegmentationClassContext
|-- SegmentationObject
|-- trainval_merged.json
|-- voc2010_to_pascalcontext.py
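The directory tree above can be checked before training with a short script. This is a sketch under the assumption that the dataset root is a local folder; the helper name `missing_entries` and the example path `./pascal_context` are hypothetical, while the entry names come from the tree.

```python
from pathlib import Path
from typing import List, Sequence

# Entries taken from the Pascal-Context directory tree above.
PASCAL_CONTEXT_ENTRIES = (
    "Annotations",
    "ImageSets",
    "JPEGImages",
    "SegmentationClass",
    "SegmentationClassContext",
    "SegmentationObject",
    "trainval_merged.json",
    "voc2010_to_pascalcontext.py",
)


def missing_entries(root: str, expected: Sequence[str] = PASCAL_CONTEXT_ENTRIES) -> List[str]:
    """Return the expected files or folders that do not exist under root."""
    root_path = Path(root)
    return [name for name in expected if not (root_path / name).exists()]


# Hypothetical dataset location; adjust to where the data was extracted.
print(missing_entries("./pascal_context"))
```

An empty list means every listed entry is present. A missing SegmentationClassContext entry suggests the conversion script has not been run yet.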

ADE20K dataset

Download the ADE20K dataset from http://sceneparsing.csail.mit.edu/. It should have this basic structure:

ADEChallengeData2016
|-- annotations
|   |-- training
|   `-- validation
|-- images
|   |-- training
|   `-- validation
|-- objectInfo150.txt
`-- sceneCategories.txt
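Given the layout above, images and annotations can be paired by file name. The sketch below is illustrative and is not the toolkit's data loader: it assumes images are stored as .jpg and annotations as .png with the same file stem, and the helper name `ade20k_pairs` is hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def ade20k_pairs(root: str, split: str = "training") -> List[Tuple[Path, Path]]:
    """Pair each image with its annotation by file stem; skip images without one."""
    image_dir = Path(root) / "images" / split
    label_dir = Path(root) / "annotations" / split
    pairs = []
    for image in sorted(image_dir.glob("*.jpg")):
        # Assumption: the annotation shares the image's stem and uses a .png extension.
        label = label_dir / (image.stem + ".png")
        if label.is_file():
            pairs.append((image, label))
    return pairs


# Hypothetical dataset location; adjust to where the data was extracted.
print(len(ade20k_pairs("./ADEChallengeData2016", "validation")))
```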

Cityscapes dataset

Download the Cityscapes dataset from https://www.cityscapes-dataset.com/. Files ending in labelTrainIds.png are used for Cityscapes training; they are generated by the script convert_cityscapes.py. The dataset should have this basic structure:

cityscapes
|-- gtFine
|   |-- test
|   |-- train
|   `-- val
|-- leftImg8bit
|   |-- test
|   |-- train
|   `-- val
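To see whether the labelTrainIds.png files have been generated for every image, a check along the following lines may help. It is a sketch, not part of the toolkit: it assumes the standard Cityscapes naming (per-city subfolders, images ending in `_leftImg8bit.png`, labels ending in `_gtFine_labelTrainIds.png`), which the tree above does not show, and the helper name `cityscapes_pairs` is hypothetical.

```python
from pathlib import Path
from typing import List, Tuple

# Assumed file-name suffixes of the standard Cityscapes release.
IMAGE_SUFFIX = "_leftImg8bit.png"
LABEL_SUFFIX = "_gtFine_labelTrainIds.png"

Pair = Tuple[Path, Path]


def cityscapes_pairs(root: str, split: str = "train") -> Tuple[List[Pair], List[Pair]]:
    """Return (found, missing) lists of (image, label) pairs for one split."""
    image_root = Path(root) / "leftImg8bit" / split
    label_root = Path(root) / "gtFine" / split
    found: List[Pair] = []
    missing: List[Pair] = []
    # Assumption: images live one level down, in per-city subfolders.
    for image in sorted(image_root.glob("*/*" + IMAGE_SUFFIX)):
        city = image.parent.name
        stem = image.name[: -len(IMAGE_SUFFIX)]
        label = label_root / city / (stem + LABEL_SUFFIX)
        (found if label.is_file() else missing).append((image, label))
    return found, missing


# Hypothetical dataset location; adjust to where the data was extracted.
found, missing = cityscapes_pairs("./cityscapes", "train")
print(f"{len(found)} labelled images, {len(missing)} without labelTrainIds")
```

A non-empty `missing` list suggests convert_cityscapes.py still needs to be run for that split.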

Trans10kV2 dataset

Download the Trans10kV2 dataset from Google Drive or Baidu Drive (code: oqms). It should have this basic structure:

Trans10K_cls12
|-- test
|   |-- images
|   `-- masks_12
|-- train
|   |-- images
|   `-- masks_12
|-- validation
|   |-- images
|   `-- masks_12

2. Testing

Single-scale testing on single GPU

CUDA_VISIBLE_DEVICES=0 python3 val.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
    --model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams

Multi-scale testing on single GPU

CUDA_VISIBLE_DEVICES=0 python3 val.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
    --model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams \
    --multi_scales True

Single-scale testing on multi GPU

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -u -m paddle.distributed.launch val.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
    --model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams

Multi-scale testing on multi GPU

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -u -m paddle.distributed.launch val.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
    --model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams \
    --multi_scales True

Note: the --model_path option accepts the path to a pretrained weights file (segmentation model, e.g., SETR).
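The four testing variants differ only in the visible GPUs, the use of paddle.distributed.launch, and the --multi_scales flag. The sketch below assembles the matching command line from those choices; the helper name `build_val_command` is hypothetical, and only flags shown in the commands above are used.

```python
import shlex
from typing import List


def build_val_command(config: str, model_path: str, gpus: List[int],
                      multi_scales: bool = False) -> str:
    """Build the val.py command line for the given GPUs and scale setting."""
    if not gpus:
        raise ValueError("at least one GPU index is required")
    argv = ["python3"]
    if len(gpus) > 1:
        # Multi-GPU runs go through Paddle's distributed launcher.
        argv += ["-u", "-m", "paddle.distributed.launch"]
    argv += ["val.py", "--config", config, "--model_path", model_path]
    if multi_scales:
        argv += ["--multi_scales", "True"]
    devices = ",".join(str(g) for g in gpus)
    return f"CUDA_VISIBLE_DEVICES={devices} " + shlex.join(argv)


print(build_val_command(
    "./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml",
    "./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams",
    gpus=[0, 1, 2, 3],
    multi_scales=True,
))
```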

3. Training

Training on single GPU

CUDA_VISIBLE_DEVICES=0 python3 train.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml

Note:

Training options such as learning rate, image size, and model layers can be changed in the .yaml file passed to --config. All available settings are listed in ./config.py.

Training on multi GPU

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -u -m paddle.distributed.launch train.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml

Note:

Training options such as learning rate, image size, and model layers can be changed in the .yaml file passed to --config. All available settings are listed in ./config.py.

Contact

If you have any questions regarding this repo, please create an issue.
