
Error during training for custom data #282

Open
rocker12121 opened this issue Apr 13, 2023 · 0 comments
Hello

I am trying to train the model on my custom dataset of just 200-300 images. Our dataset generation is still in progress, so for now I am just laying the groundwork for training this model on custom data. I have a single GPU for training and I want to use MobileNet.

The command I run is:
python3 train.py --gpus 0 --cfg config/custom-mobilenetv2dilated-c1_deepsup.yaml

but I encounter the following error. Could you please help me with this?

[2023-04-13 01:45:10,405 INFO train.py line 240 26317] Loaded configuration file config/custom-mobilenetv2dilated-c1_deepsup.yaml
[2023-04-13 01:45:10,405 INFO train.py line 241 26317] Running with config:
DATASET:
  imgMaxSize: 1000
  imgSizes: (300, 375, 450, 525, 600)
  list_train: ./data/training.odgt
  list_val: ./data/validation.odgt
  num_class: 3
  padding_constant: 8
  random_flip: True
  root_dataset: ./data/
  segm_downsampling_rate: 8
DIR: ckpt/custom-mobilenetv2dilated-c1_deepsup
MODEL:
  arch_decoder: c1_deepsup
  arch_encoder: mobilenetv2dilated
  fc_dim: 320
  weights_decoder:
  weights_encoder:
TEST:
  batch_size: 1
  checkpoint: epoch_20.pth
  result: ./
TRAIN:
  batch_size_per_gpu: 3
  beta1: 0.9
  deep_sup_scale: 0.4
  disp_iter: 20
  epoch_iters: 5000
  fix_bn: False
  lr_decoder: 0.02
  lr_encoder: 0.02
  lr_pow: 0.9
  num_epoch: 20
  optim: SGD
  seed: 304
  start_epoch: 0
  weight_decay: 0.0001
  workers: 16
VAL:
  batch_size: 1
  checkpoint: epoch_20.pth
  visualize: False
[2023-04-13 01:45:10,405 INFO train.py line 246 26317] Outputing checkpoints to: ckpt/custom-mobilenetv2dilated-c1_deepsup

samples: 135

1 Epoch = 5000 iters
Traceback (most recent call last):
  File "train.py", line 273, in <module>
    main(cfg, gpus)
  File "train.py", line 200, in main
    train(segmentation_module, iterator_train, optimizers, history, epoch+1, cfg)
  File "train.py", line 32, in train
    batch_data = next(iterator)
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/e/semantic-segmentation-pytorch-copy/mit_semseg/dataset.py", line 162, in __getitem__
    assert(segm.mode == "L")
AssertionError
