Implementation of Distilling the Knowledge in a Neural Network https://arxiv.org/pdf/1503.02531.pdf
vgg11 | ResNet18 | ResNet50 | ResNet101 |
---|---|---|---|
82.6078% | 89.5004% | 87.7389% | 88.3260% |
Model | Accuracy | |
---|---|---|
base line | 70.6669% | |
+ VGG 11 KD | 72.2943% | |
+ ResNet18 KD | 75.1597% | |
+ ResNet50 KD | 76.1282% | |
+ ResNet101 KD | 75.9685% |
vgg19 | ResNet18 | ResNet50 | ResNet101 |
---|---|---|---|
35.7628% | 61.8332% | 57.9319% | 55.4014% |
Model | Accuracy | |
---|---|---|
base line | 37.1206% | |
+ self KD | 38.2887% | |
+ VGG 19 KD | 37.8494% | |
+ ResNet18 KD | 43.7899% | |
+ ResNet50 KD | 57.9319% | |
+ ResNet101 KD | 46.7252% |