The tables below list the models enabled by Intel® Neural Compressor.
| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 throughput (CLX8280 1s 4c per instance bs1) | FP32 throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|---|---|
| tensorflow | 2.5.0 | resnet50v1.0 | 74.24% | 74.27% | -0.04% | 925.93 | 329.57 | 2.81x |
| tensorflow | 2.5.0 | resnet50v1.5 | 76.94% | 76.46% | 0.63% | 726.14 | 281.58 | 2.58x |
| tensorflow | 2.5.0 | resnet101 | 77.21% | 76.45% | 0.99% | 549.88 | 227.27 | 2.42x |
| tensorflow | 2.5.0 | inception_v1 | 70.30% | 69.74% | 0.80% | 1256.73 | 705.65 | 1.78x |
| tensorflow | 2.5.0 | inception_v2 | 74.27% | 73.97% | 0.41% | 1046.34 | 567.72 | 1.84x |
| tensorflow | 2.5.0 | inception_v3 | 77.29% | 76.75% | 0.70% | 542.64 | 254.92 | 2.13x |
| tensorflow | 2.5.0 | inception_v4 | 80.36% | 80.27% | 0.11% | 335.25 | 129.32 | 2.59x |
| tensorflow | 2.5.0 | inception_resnet_v2 | 80.42% | 80.40% | 0.02% | 157.41 | 79.83 | 1.97x |
| tensorflow | 2.5.0 | mobilenetv1 | 73.93% | 70.96% | 4.19% | 2372.88 | 691.70 | 3.43x |
| tensorflow | 2.5.0 | mobilenetv2 | 71.96% | 71.76% | 0.28% | 1408.45 | 673.72 | 2.09x |
| tensorflow | 2.5.0 | ssd_resnet50_v1 | 37.91% | 38.00% | -0.24% | 49.84 | 17.03 | 2.93x |
| tensorflow | 2.5.0 | ssd_mobilenet_v1 | 23.02% | 23.13% | -0.48% | 571.43 | 260.22 | 2.20x |
| tensorflow | 2.5.0 | ssd_resnet34 | 21.97% | 22.16% | -0.86% | 26.49 | 7.29 | 3.63x |
| tensorflow | 2.5.0 | faster_rcnn_resnet101 | 30.33% | 30.38% | -0.16% | 45.47 | 12.99 | 3.50x |
| tensorflow | 2.5.0 | faster_rcnn_resnet101_saved | 30.37% | 30.38% | -0.03% | 46.02 | 11.36 | 4.05x |
| tensorflow | 2.5.0 | mask_rcnn_inception_v2 | 28.61% | 28.73% | -0.42% | 89.78 | 35.58 | 2.52x |
| tensorflow | 2.5.0 | wide_deep_large_ds | 77.61% | 77.67% | -0.08% | 5645.16 | 3723.40 | 1.52x |
| tensorflow | 2.5.0 | vgg16 | 72.13% | 70.89% | 1.75% | 406.98 | 114.27 | 3.56x |
| tensorflow | 2.5.0 | vgg19 | 72.35% | 71.01% | 1.89% | 344.83 | 94.39 | 3.65x |
| tensorflow | 2.5.0 | resnetv2_50 | 70.36% | 69.64% | 1.03% | 448.72 | 378.58 | 1.19x |
| tensorflow | 2.5.0 | resnetv2_101 | 72.58% | 71.87% | 0.99% | 271.84 | 205.46 | 1.32x |
| tensorflow | 2.5.0 | resnetv2_152 | 72.92% | 72.37% | 0.76% | 188.78 | 138.83 | 1.36x |
| tensorflow | 2.5.0 | densenet121 | 72.31% | 72.89% | -0.80% | 213.54 | 145.14 | 1.47x |
| tensorflow | 2.5.0 | densenet161 | 76.36% | 76.29% | 0.09% | 131.41 | 80.66 | 1.63x |
| tensorflow | 2.5.0 | densenet169 | 74.49% | 74.65% | -0.21% | 178.07 | 123.74 | 1.44x |
| tensorflow | 2.5.0 | ssd_resnet50_v1_ckpt | 37.89% | 38.00% | -0.29% | 49.28 | 14.51 | 3.40x |
| tensorflow | 2.5.0 | ssd_mobilenet_v1_ckpt | 23.02% | 23.13% | -0.48% | 573.30 | 219.37 | 2.61x |
| tensorflow | 2.5.0 | mask_rcnn_inception_v2_ckpt | 28.61% | 28.73% | -0.42% | 85.90 | 34.10 | 2.52x |
| tensorflow | 2.5.0 | efficientnet_b0 | 78.53% | 76.75% | 2.32% | 274.94 | 254.73 | 1.08x |
| tensorflow | 2.5.0 | resnet50_fashion | 78.05% | 78.12% | -0.09% | 2229.30 | 938.34 | 2.37x |
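
The two ratio columns follow directly from the formulas in their headers: the accuracy ratio is the relative change (INT8 - FP32) / FP32, and the throughput ratio is INT8 / FP32. The minimal sketch below recomputes both for the tensorflow 2.5.0 resnet50v1.0 entry. The function names `acc_ratio` and `throughput_ratio` are illustrative assumptions, not part of Intel® Neural Compressor.

```python
def acc_ratio(int8_acc: float, fp32_acc: float) -> float:
    """Relative accuracy change, (INT8 - FP32) / FP32, as a fraction."""
    return (int8_acc - fp32_acc) / fp32_acc


def throughput_ratio(int8_tput: float, fp32_tput: float) -> float:
    """Speedup of the INT8 model over the FP32 baseline, INT8 / FP32."""
    return int8_tput / fp32_tput


# Values copied from the resnet50v1.0 entry (tensorflow 2.5.0).
int8_acc, fp32_acc = 74.24, 74.27
int8_tput, fp32_tput = 925.93, 329.57

# Format the results the way the tables report them.
acc_text = f"{acc_ratio(int8_acc, fp32_acc) * 100:.2f}%"
tput_text = f"{throughput_ratio(int8_tput, fp32_tput):.2f}x"

print(acc_text, tput_text)  # -0.04% 2.81x
```

A negative accuracy ratio means the INT8 model scored below the FP32 baseline; a throughput ratio above 1.00x means the INT8 model was faster.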

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 throughput (CLX8280 1s 4c per instance bs1) | FP32 throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|---|---|
| tensorflow | 1.15.0-up2 | bert_large_squad | 92.4835 | 92.9805 | -0.53% | 15.86 | 5.50 | 2.88x |
| tensorflow | 1.15.0-up2 | bert_base_mrpc | 86.03% | 86.52% | -0.57% | 138.31 | 92.08 | 1.50x |
| tensorflow | 1.15.0-up2 | resnet_v1_50_slim | 76.05% | 75.18% | 1.16% | 752.69 | 265.96 | 2.83x |
| tensorflow | 1.15.0-up2 | resnet_v1_101_slim | 77.15% | 76.40% | 0.98% | 465.43 | 139.28 | 3.34x |
| tensorflow | 1.15.0-up2 | resnet_v1_152_slim | 77.56% | 76.81% | 0.98% | 343.14 | 94.31 | 3.64x |
| tensorflow | 1.15.0-up2 | inception_v1_slim | 70.41% | 69.77% | 0.92% | 1202.75 | 573.30 | 2.10x |
| tensorflow | 1.15.0-up2 | inception_v2_slim | 74.38% | 73.98% | 0.54% | 1021.90 | 487.47 | 2.10x |
| tensorflow | 1.15.0-up2 | inception_v3_slim | 78.32% | 77.99% | 0.42% | 591.22 | 222.01 | 2.66x |
| tensorflow | 1.15.0-up2 | inception_v4_slim | 80.35% | 80.19% | 0.20% | 321.69 | 114.21 | 2.82x |
| tensorflow | 1.15.0-up2 | vgg16_slim | 72.16% | 70.89% | 1.79% | 411.04 | 113.45 | 3.62x |
| tensorflow | 1.15.0-up2 | vgg19_slim | 72.22% | 71.01% | 1.70% | 346.19 | 95.08 | 3.64x |
| tensorflow | 1.15.0-up2 | resnetv2_50_slim | 70.39% | 69.72% | 0.96% | 458.72 | 357.14 | 1.28x |
| tensorflow | 1.15.0-up2 | resnetv2_101_slim | 72.51% | 71.91% | 0.83% | 277.12 | 191.94 | 1.44x |
| tensorflow | 1.15.0-up2 | resnetv2_152_slim | 72.98% | 72.40% | 0.80% | 193.91 | 132.53 | 1.46x |

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 throughput (CLX8280 1s 4c per instance bs1) | FP32 throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|---|---|
| pytorch | 1.9.0+cpu | resnet18 | 69.58% | 69.76% | -0.26% | 492.61 | 263.65 | 1.87x |
| pytorch | 1.9.0+cpu | resnet50 | 75.87% | 76.13% | -0.34% | 281.24 | 130.01 | 2.16x |
| pytorch | 1.9.0+cpu | resnext101_32x8d | 79.09% | 79.31% | -0.28% | 109.32 | 47.45 | 2.30x |
| pytorch | 1.9.0+cpu | bert_base_mrpc | 88.16% | 88.73% | -0.64% | 170.11 | 85.83 | 1.98x |
| pytorch | 1.9.0+cpu | bert_base_cola | 58.29% | 58.84% | -0.93% | 178.71 | 83.91 | 2.13x |
| pytorch | 1.9.0+cpu | bert_base_sts-b | 88.65% | 89.27% | -0.70% | 176.81 | 84.27 | 2.10x |
| pytorch | 1.9.0+cpu | bert_base_sst-2 | 91.63% | 91.86% | -0.25% | 177.71 | 84.16 | 2.11x |
| pytorch | 1.9.0+cpu | bert_base_rte | 69.31% | 69.68% | -0.52% | 177.17 | 85.53 | 2.07x |
| pytorch | 1.9.0+cpu | bert_large_mrpc | 87.48% | 88.33% | -0.95% | 62.06 | 24.83 | 2.50x |
| pytorch | 1.9.0+cpu | bert_large_squad | 92.78988 | 93.04683 | -0.28% | 13.89 | 7.49 | 1.85x |
| pytorch | 1.9.0+cpu | bert_large_qnli | 91.12% | 91.82% | -0.76% | 63.02 | 24.21 | 2.60x |
| pytorch | 1.9.0+cpu | bert_large_rte | 72.92% | 72.56% | 0.50% | 46.07 | 23.45 | 1.96x |
| pytorch | 1.9.0+cpu | bert_large_cola | 62.85% | 62.57% | 0.45% | 61.92 | 24.52 | 2.52x |
| pytorch | 1.9.0+cpu | inception_v3 | 69.39% | 69.54% | -0.21% | 230.34 | 131.21 | 1.76x |
| pytorch | 1.9.0+cpu | peleenet | 71.54% | 72.08% | -0.75% | 271.32 | 203.96 | 1.33x |
| pytorch | 1.9.0+cpu | yolo_v3 | 24.50% | 24.54% | -0.17% | 59.09 | 28.49 | 2.07x |
| pytorch | 1.9.0+cpu | se_resnext50_32x4d | 79.02% | 79.08% | -0.07% | 204.02 | 109.12 | 1.87x |
| pytorch | 1.9.0+cpu | mobilenet_v2 | 70.73% | 71.86% | -1.57% | 445.01 | 329.26 | 1.35x |
| pytorch | 1.9.0+cpu | blendcnn | 68.40% | 68.40% | 0.00% | 2868.85 | 2755.91 | 1.04x |
| pytorch | 1.5.0a0+b58f89b | resnet50_ipex | 75.80% | 76.13% | -0.44% | 353.71 | 213.09 | 1.66x |
| pytorch | 1.9.0+cpu | gpt_wikitext | 60.06256 | 60.19923 | -0.23% | 13.11 | 12.06 | 1.09x |
| pytorch | 1.9.0+cpu | roberta_base_mrpc | 85.37% | 85.51% | -0.17% | 173.78 | 85.54 | 2.03x |
| pytorch | 1.9.0+cpu | camembert_base_mrpc | 84.72% | 84.22% | 0.60% | 158.16 | 84.63 | 1.87x |
| pytorch | 1.9.0+cpu | distilbert_base_mrpc | 81.17% | 80.99% | 0.21% | 279.44 | 158.91 | 1.76x |
| pytorch | 1.9.0+cpu | albert_base_mrpc | 88.77% | 88.50% | 0.31% | 22.88 | 18.28 | 1.25x |
| pytorch | 1.9.0+cpu | funnel_mrpc | 91.72% | 92.26% | -0.58% | 79.44 | 78.01 | 1.02x |
| pytorch | 1.9.0+cpu | bart_wnli | 49.30% | 52.11% | -5.41% | 21.74 | 19.92 | 1.09x |
| pytorch | 1.9.0+cpu | mbart_wnli | 56.34% | 56.34% | 0.00% | 39.87 | 20.34 | 1.96x |
| pytorch | 1.9.0+cpu | t5_wmt_en_ro | 24.3855 | 24.5213 | -0.55% | 2.76 | 2.59 | 1.06x |
| pytorch | 1.9.0+cpu | marianmt_wmt_en_ro | 22.3857 | 22.225 | 0.72% | 1.94 | 1.84 | 1.05x |
| pytorch | 1.9.0+cpu | pegasus_billsum | 50.2328 | 51.2135 | -1.91% | 0.18 | 0.11 | 1.56x |
| pytorch | 1.9.0+cpu | dialogpt_wikitext | 36.18182 | 36.18182 | 0.00% | 4.37 | 4.35 | 1.00x |
| pytorch | 1.9.0+cpu | xlm-roberta-base_mrpc | 87.93% | 88.62% | -0.78% | 79.57 | 77.46 | 1.03x |
| pytorch | 1.9.0+cpu | flaubert_mrpc | 79.81% | 80.19% | -0.48% | 361.20 | 295.11 | 1.22x |
| pytorch | 1.9.0+cpu | barthez_mrpc | 83.25% | 83.81% | -0.66% | 112.72 | 67.00 | 1.68x |
| pytorch | 1.9.0+cpu | longformer_mrpc | 90.97% | 91.46% | -0.53% | 12.97 | 10.97 | 1.18x |
| pytorch | 1.9.0+cpu | layoutlm_mrpc | 81.22% | 78.01% | 4.12% | 145.26 | 78.19 | 1.86x |
| pytorch | 1.9.0+cpu | deberta_mrpc | 90.29% | 90.91% | -0.68% | 78.70 | 50.84 | 1.55x |
| pytorch | 1.9.0+cpu | squeezebert_mrpc | 87.96% | 87.65% | 0.36% | 145.56 | 126.72 | 1.15x |
| pytorch | 1.9.0+cpu | resnet18_fx | 69.61% | 69.76% | -0.22% | 503.96 | 257.73 | 1.96x |
| pytorch | 1.9.0+cpu | xlnet_base_mrpc | 89.43% | 89.47% | -0.04% | 67.93 | 52.56 | 1.29x |
| pytorch | 1.9.0+cpu | transfo_xl_mrpc | 82.09% | 81.20% | 1.09% | 6.64 | 4.94 | 1.34x |
| pytorch | 1.9.0+cpu | ctrl_mrpc | 82.00% | 82.00% | 0.00% | 15.34 | 5.70 | 2.69x |
| pytorch | 1.9.0+cpu | xlm_mrpc | 80.50% | 79.56% | 1.18% | 39.06 | 12.90 | 3.03x |
| pytorch | 1.9.0+cpu | maskrcnn_fx | 37.70% | 37.80% | -0.26% | 59.58 | 38.66 | 1.54x |

Quantization-aware training models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 throughput (CLX8280 1s 4c per instance bs1) | FP32 throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|---|---|
| pytorch | 1.9.0+cpu | resnet18_qat | 69.75% | 69.76% | -0.02% | 492.96 | 262.86 | 1.87x |
| pytorch | 1.9.0+cpu | resnet50_qat | 76.05% | 76.13% | -0.11% | 273.97 | 128.53 | 2.13x |
| pytorch | 1.9.0+cpu | resnet18_qat_fx | 69.72% | 69.76% | -0.05% | 498.22 | 257.64 | 1.93x |
| pytorch | 1.9.0+cpu | mobilenet_v2_qat | 71.45% | 71.86% | -0.56% | 450.16 | 316.31 | 1.42x |

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 throughput (CLX8280 1s 4c per instance bs1) | FP32 throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|---|---|
| mxnet | 1.7.0 | resnet50v1 | 76.08% | 76.33% | -0.32% | 1125.40 | 335.57 | 3.35x |
| mxnet | 1.7.0 | inceptionv3 | 77.73% | 77.64% | 0.11% | 623.33 | 230.49 | 2.71x |
| mxnet | 1.7.0 | mobilenet1.0 | 71.69% | 72.22% | -0.74% | 4375.00 | 1741.29 | 2.51x |
| mxnet | 1.7.0 | mobilenetv2_1.0 | 70.78% | 70.87% | -0.12% | 3500.00 | 1284.40 | 2.73x |
| mxnet | 1.7.0 | resnet18_v1 | 70.02% | 70.14% | -0.17% | 2325.58 | 731.45 | 3.18x |
| mxnet | 1.7.0 | squeezenet1.0 | 56.74% | 56.96% | -0.38% | 2916.67 | 1093.75 | 2.67x |
| mxnet | 1.7.0 | ssd-resnet50_v1 | 80.21% | 80.23% | -0.03% | 187.82 | 40.07 | 4.69x |
| mxnet | 1.7.0 | ssd-mobilenet1.0 | 74.94% | 75.54% | -0.79% | 445.01 | 116.28 | 3.83x |
| mxnet | 1.7.0 | resnet152_v1 | 78.21% | 78.54% | -0.42% | 394.37 | 119.60 | 3.30x |

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 throughput (CLX8280 1s 4c per instance bs1) | FP32 throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|---|---|
| onnxrt | 1.8.0 | resnet50_v1_5 | 72.11% | 72.28% | -0.24% | 546.02 | 339.97 | 1.61x |
| onnxrt | 1.8.0 | bert_base_mrpc_static | 85.29% | 86.03% | -0.86% | 479.12 | 210.97 | 2.27x |
| onnxrt | 1.8.0 | bert_base_mrpc_dynamic | 85.54% | 86.03% | -0.57% | 244.84 | 100.00 | 2.45x |
| onnxrt | 1.8.0 | vgg16 | 66.58% | 66.68% | -0.15% | 101.35 | 79.25 | 1.28x |
| onnxrt | 1.8.0 | ssd_mobilenet_v1 | 22.41% | 23.10% | -2.99% | 427.87 | 377.16 | 1.13x |
| onnxrt | 1.8.0 | ssd_mobilenet_v2 | 23.80% | 24.68% | -3.57% | 339.48 | 279.89 | 1.21x |
| onnxrt | 1.8.0 | distilbert_base_mrpc | 84.56% | 84.56% | 0.00% | 1081.92 | 386.53 | 2.80x |
| onnxrt | 1.8.0 | mobilebert_mrpc | 85.54% | 86.27% | -0.85% | 437.23 | 400.23 | 1.09x |
| onnxrt | 1.8.0 | roberta_base_mrpc | 88.73% | 89.46% | -0.82% | 494.70 | 203.90 | 2.43x |
| onnxrt | 1.8.0 | resnet50-v1-12 | 74.83% | 74.97% | -0.19% | 642.79 | 348.26 | 1.85x |
| onnxrt | 1.8.0 | resnet_v1_5_mlperf | 76.11% | 76.47% | -0.47% | 599.32 | 343.47 | 1.74x |
| onnxrt | 1.8.0 | mobilenet_v3_mlperf | 75.51% | 75.75% | -0.32% | 1397.21 | 1007.19 | 1.39x |
| onnxrt | 1.8.0 | bert_squad_model_zoo | 80.43519 | 80.67171 | -0.29% | 73.68 | 40.81 | 1.81x |
| onnxrt | 1.8.0 | mobilebert_squad_mlperf | 89.84479 | 90.0265 | -0.20% | 60.52 | 57.30 | 1.06x |
| onnxrt | 1.8.0 | vgg16_model_zoo | 72.37% | 72.38% | -0.01% | 122.85 | 79.57 | 1.54x |