Semantic segmentation assigns each pixel in an image to a semantic category, covering both objects (e.g., bicycle, car, people) and stuff (e.g., road, bench, sky).
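To make the per-pixel classification concrete, here is a minimal sketch in plain Python. It assumes a model has already produced one score per class for every pixel; the class names and scores are made up for illustration and are not taken from any PaddleViT model.

```python
# Hypothetical class list, chosen to mirror the examples in the text.
CLASS_NAMES = ["road", "car", "sky"]


def label_map(scores):
    """Return the index of the highest-scoring class for every pixel.

    scores[row][col] is a list holding one score per class.
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# A made-up 2x2 "image" with 3 class scores per pixel.
scores = [
    [[0.1, 0.2, 0.7], [0.2, 0.1, 0.7]],
    [[0.8, 0.1, 0.1], [0.3, 0.6, 0.1]],
]
labels = label_map(scores)
names = [[CLASS_NAMES[i] for i in row] for row in labels]
```

The result is a label map with the same height and width as the input, which is the form the ground-truth annotations in the datasets below also take.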
This code is developed under the following configurations:
Hardware: 1/2/4/8 GPUs for training and testing
Software: CentOS 6.10, CUDA=10.2, Python=3.8, Paddle=2.1.0
- Create a conda virtual environment and activate it.
conda create -n paddlevit python=3.8
conda activate paddlevit
- Install PaddlePaddle following the official instructions, e.g.,
conda install paddlepaddle-gpu==2.1.0 cudatoolkit=10.2 --channel https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/
- Install PaddleViT
git clone https://github.com/BR-IDL/PaddleViT.git
cd PaddleViT/semantic_segmentation
pip3 install -r requirements.txt
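After installation, a quick sanity check can save a failed run later. The sketch below only uses the Python standard library; the helper name `check_environment` is hypothetical and not part of PaddleViT. It reports the interpreter version and whether the `paddle` package can be found, without importing it.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 8), package="paddle"):
    """Report the Python version and whether `package` is importable."""
    return {
        "python_version": tuple(sys.version_info[:3]),
        "python_ok": sys.version_info[:2] >= min_python,
        # find_spec locates the package without actually importing it.
        "package_found": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    print(check_environment())
```

If the package is found, PaddlePaddle's own `paddle.utils.run_check()` can be used for a deeper check that the GPU build works.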
We provide a demo script, demo.py, which runs inference on single images. Put the input images in ./demo/img.
cd demo
CUDA_VISIBLE_DEVICES=0 python3 demo.py \
--config ${CONFIG_FILE} \
--model_path ${MODEL_PATH} \
--pretrained_backbone ${PRETRAINED_BACKBONE} \
--img_dir ${IMAGE_DIRECTORY} \
--results_dir ${RESULT_DIRECTORY}
Examples:
cd demo
CUDA_VISIBLE_DEVICES=0 python3 demo.py \
--config ../configs/setr/SETR_PUP_Large_768x768_80k_cityscapes_bs_8.yaml \
--model_path ../pretrain_models/setr/SETR_PUP_cityscapes_b8_80k.pdparams \
--pretrained_backbone ../pretrain_models/backbones/vit_large_patch16_224.pdparams \
--img_dir ./img/ \
--results_dir ./results/
Download the Pascal-Context dataset: get PASCAL VOC2010 from http://host.robots.ox.ac.uk/pascal/VOC/voc2010/VOCtrainval_03-May-2010.tar and the annotation file from https://codalabuser.blob.core.windows.net/public/trainval_merged.json. The "pascal_context/SegmentationClassContext" directory is generated by running the script voc2010_to_pascalcontext.py. The dataset should have this basic structure:
pascal_context
|-- Annotations
|-- ImageSets
|-- JPEGImages
|-- SegmentationClass
|-- SegmentationClassContext
|-- SegmentationObject
|-- trainval_merged.json
|-- voc2010_to_pascalcontext.py
Download the ADE20K dataset from http://sceneparsing.csail.mit.edu/. It should have this basic structure:
ADEChallengeData2016
|-- annotations
| |-- training
| `-- validation
|-- images
| |-- training
| `-- validation
|-- objectInfo150.txt
`-- sceneCategories.txt
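A small script can confirm that the extracted dataset matches the tree above before training starts. This is a sketch only: `missing_entries` is a hypothetical helper, and the entry list is copied from the ADE20K structure shown here. The same function could be reused for the other datasets by passing a different list.

```python
from pathlib import Path

# Expected entries, copied from the ADE20K tree above.
ADE20K_ENTRIES = [
    "annotations/training",
    "annotations/validation",
    "images/training",
    "images/validation",
    "objectInfo150.txt",
    "sceneCategories.txt",
]


def missing_entries(root, expected=ADE20K_ENTRIES):
    """Return the expected files or directories that do not exist under root."""
    root = Path(root)
    return [entry for entry in expected if not (root / entry).exists()]


if __name__ == "__main__":
    # The dataset location is an assumption; adjust it to your setup.
    for entry in missing_entries("ADEChallengeData2016"):
        print("missing:", entry)
```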
Download the Cityscapes dataset from https://www.cityscapes-dataset.com/. The labelTrainIds.png files used for Cityscapes training are generated by the script convert_cityscapes.py. The dataset should have this basic structure:
cityscapes
|-- gtFine
| |-- test
| |-- train
| `-- val
|-- leftImg8bit
| |-- test
| |-- train
| `-- val
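The sketch below shows how an image under leftImg8bit could be paired with its training label under gtFine. It assumes the standard Cityscapes layout, in which each split contains one subdirectory per city and files are named `<city>_<seq>_<frame>_leftImg8bit.png` and `<city>_<seq>_<frame>_gtFine_labelTrainIds.png`; neither detail is shown in the tree above, and it further assumes convert_cityscapes.py writes its output next to the other gtFine files. `label_path_for` is a hypothetical helper, not part of PaddleViT.

```python
from pathlib import Path


def label_path_for(image_path, root="cityscapes"):
    """Map a leftImg8bit image path to its gtFine labelTrainIds path."""
    image_path = Path(image_path)
    # Assumed layout: <root>/leftImg8bit/<split>/<city>/<file>
    split, city = image_path.parts[-3], image_path.parts[-2]
    stem = image_path.name.replace("_leftImg8bit.png", "")
    return Path(root) / "gtFine" / split / city / f"{stem}_gtFine_labelTrainIds.png"
```

For example, `cityscapes/leftImg8bit/train/aachen/aachen_000000_000019_leftImg8bit.png` maps to `cityscapes/gtFine/train/aachen/aachen_000000_000019_gtFine_labelTrainIds.png`.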
Download the Trans10kV2 dataset from Google Drive or Baidu Drive (code: oqms). It should have this basic structure:
Trans10K_cls12
|-- test
| |-- images
| `-- masks_12
|-- train
| |-- images
| `-- masks_12
|-- validation
| |-- images
| `-- masks_12
CUDA_VISIBLE_DEVICES=0 python3 val.py \
--config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
--model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams
CUDA_VISIBLE_DEVICES=0,1 python3 val.py \
--config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
--model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams \
--multi_scales True
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -u -m paddle.distributed.launch val.py \
--config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
--model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -u -m paddle.distributed.launch val.py \
--config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
--model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams \
--multi_scales True
Note:
- The --model_path option accepts the path to the pretrained weights file of the segmentation model (e.g., SETR).
CUDA_VISIBLE_DEVICES=0 python3 train.py \
--config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml
Note:
- Training options such as lr, image size, and model layers can be changed in the .yaml file passed to --config. All available settings can be found in ./config.py.
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -u -m paddle.distributed.launch train.py \
--config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml
Note:
- Training options such as lr, image size, and model layers can be changed in the .yaml file passed to --config. All available settings can be found in ./config.py.
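The evaluation and training commands in this README differ mainly in the GPU list and in whether `paddle.distributed.launch` is used. The sketch below assembles such a command from Python. `build_command` is a hypothetical helper, not part of PaddleViT; it uses only the flags that appear in the commands above, and it assumes the distributed launcher should be used whenever more than one GPU is listed.

```python
import shlex


def build_command(script, config, gpus, model_path=None, multi_scales=False):
    """Return (env, argv) for running val.py or train.py on the given GPUs."""
    cmd = ["python3"]
    if len(gpus) > 1:
        # Multi-GPU runs go through Paddle's distributed launcher.
        cmd += ["-u", "-m", "paddle.distributed.launch"]
    cmd += [script, "--config", config]
    if model_path is not None:
        cmd += ["--model_path", model_path]
    if multi_scales:
        cmd += ["--multi_scales", "True"]
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpus)}
    return env, cmd


if __name__ == "__main__":
    env, cmd = build_command(
        "val.py",
        "./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml",
        gpus=[0, 1, 2, 3],
        model_path="./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams",
    )
    print(env["CUDA_VISIBLE_DEVICES"], shlex.join(cmd))
```

To actually run the command, pass the list to `subprocess.run(cmd, env={**os.environ, **env})` so that the GPU selection is added to the current environment rather than replacing it.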
If you have any questions regarding this repo, please create an issue.