The following scripts assume that the data is stored at artifacts/data and that the annotation file is at artifacts/data/annotation.csv.
The data path can be changed by editing each of the following shell scripts (or see the sketch after this list for an alternative):
- final_integration/train_autoencoder.sh
- final_integration/train_rm.sh
- final_integration/train_rotation.sh
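For example, instead of editing the scripts, an existing dataset can be made visible at the expected default location. This is only a sketch, not part of the repository; /path/to/your/dataset is a placeholder.

```bash
# Optional: expose an existing dataset at the default path the scripts expect,
# rather than editing each script's data path.
mkdir -p artifacts
ln -s /path/to/your/dataset artifacts/data   # placeholder source path
ls artifacts/data/annotation.csv             # sanity-check that the annotation file is visible
```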
Since training happens in stages, the following scripts (all in the final_integration directory) must be run in order; a combined sketch is shown after this list:
- final_integration/train_autoencoder.sh - Trains the autoencoder model using top views that are split by camera angles
- final_integration/train_rm.sh - Trains 6 CNNs (ResNet18), one per camera angle, to predict the encodings generated by the best encoder (from step 1) given the true view for that angle
- final_integration/train_rotation.sh - Pretrains a ResNet18 model using the SSL rotation pretext task
- final_integration/generate_mono_data.sh - Generates training data for predicting masks for dynamic elements
- final_integration/train_bb.sh - Trains the GAN that predicts masks, using the data generated in step 4 and the pretrained ResNet18 from step 3
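Taken together, a minimal sketch of the full training sequence is shown below. It assumes the scripts are invoked with bash from the repository root in the default configuration; adjust the working directory or interpreter if your setup differs.

```bash
# Run the five training stages in order; later stages consume checkpoints
# and data produced by earlier ones.
bash final_integration/train_autoencoder.sh    # step 1: autoencoder on per-angle top views
bash final_integration/train_rm.sh             # step 2: six ResNet18 encoding predictors
bash final_integration/train_rotation.sh       # step 3: rotation SSL pretraining
bash final_integration/generate_mono_data.sh   # step 4: data for dynamic-element masks
bash final_integration/train_bb.sh             # step 5: GAN for dynamic-element masks
```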
The core testing code is condensed into a single file, "final_integration/test_model/model_loader.py", so it may be somewhat hard to read.
Run the following script to test the overall performance of our model:
- final_integration/run_test.sh - Runs the testing script used for evaluation in the competition.
Note:
At this point, run_test.sh assumes that the following files exist:
- "artifacts/models/topview_resnet/front/best_performing.pt"
- "artifacts/models/topview_resnet/front_left/best_performing.pt"
- "artifacts/models/topview_resnet/front_right/best_performing.pt"
- "artifacts/models/topview_resnet/back/best_performing.pt"
- "artifacts/models/topview_resnet/back_left/best_performing.pt"
- "artifacts/models/topview_resnet/back_right/best_performing.pt"
- "artifacts/models/autoencoder/best_performing.pt"
- "artifacts/models/rotation_ssl/best_performing.pt"
- "artifacts/models/mono/front/monolayout/best/encoder.pth"
- "artifacts/models/mono/front_left/monolayout/best/encoder.pth"
- "artifacts/models/mono/front_right/monolayout/best/encoder.pth"
- "artifacts/models/mono/back/monolayout/best/encoder.pth"
- "artifacts/models/mono/back_left/monolayout/best/encoder.pth"
- "artifacts/models/mono/front/monolayout/best/encoder.pth"
- "artifacts/models/mono/front/monolayout/best/decoder.pth"
- "artifacts/models/mono/front_left/monolayout/best/decoder.pth"
- "artifacts/models/mono/front_right/monolayout/best/decoder.pth"
- "artifacts/models/mono/back/monolayout/best/decoder.pth"
- "artifacts/models/mono/back_left/monolayout/best/decoder.pth"
- "artifacts/models/mono/front/monolayout/best/decoder.pth"
The above files can be generated by running the training shell scripts in the default configuration.
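As a quick pre-flight check before evaluation, the expected checkpoints can be verified and the test launched along the following lines. This is only a sketch, not part of the repository; it assumes the six camera-angle directory names listed above and that run_test.sh is invoked from the repository root.

```bash
# Verify that every checkpoint run_test.sh expects is present, then run the test.
required=(
  artifacts/models/autoencoder/best_performing.pt
  artifacts/models/rotation_ssl/best_performing.pt
)
for view in front front_left front_right back back_left back_right; do
  required+=("artifacts/models/topview_resnet/${view}/best_performing.pt")
  required+=("artifacts/models/mono/${view}/monolayout/best/encoder.pth")
  required+=("artifacts/models/mono/${view}/monolayout/best/decoder.pth")
done

missing=0
for f in "${required[@]}"; do
  [ -f "$f" ] || { echo "missing: $f"; missing=1; }
done

if [ "$missing" -eq 0 ]; then
  bash final_integration/run_test.sh
else
  echo "Some checkpoints are missing; run the training scripts first." >&2
fi
```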
A huge thank you to Kaustubh Mani, Swapnil Daga, Shubhika Garg, Sai Shankar Narasimhan, Madhava Krishna, and Krishna Murthy Jatavallabhula. Much of our inspiration came from their research paper, "MonoLayout: Amodal scene layout from a single image", and from their code.