Execute the provided Python script directly; the results will be displayed in the console.
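The actual script is in the attached zip; as a rough illustration, the timing loop it presumably uses looks something like the following sketch. `average_inference_time` and its parameters are hypothetical names, and `infer` is a stand-in callable; in the real script it would be a compiled OpenVINO model invoked on a NumPy input tensor.

```python
# Hedged sketch of a per-inference timing loop: average wall-clock
# time over n_runs calls, after a warmup phase that excludes
# one-time costs (graph compilation caches, memory allocation).
import time

def average_inference_time(infer, n_runs=100, warmup=10):
    """Return mean seconds per call of `infer` after `warmup` runs."""
    for _ in range(warmup):
        infer()  # warmup calls are not timed
    start = time.perf_counter()
    for _ in range(n_runs):
        infer()
    return (time.perf_counter() - start) / n_runs
```

Averaging over many runs with a warmup phase matters here, since a single cold inference can dominate and distort an FP32-vs-INT8 comparison.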
Relevant log output
A snippet of the log output:
=== Inference Time Comparison ===
FP32: 0.007910 seconds per inference
FP16: 0.006492 seconds per inference
INT8: 0.010170 seconds per inference
Issue submission checklist
I'm reporting an issue. It's not a question.
I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
There is reproducer code and related data files such as images, videos, models, etc.
OpenVINO Version
2024.4.0
Operating System
Other (Please specify in description)
Device used for inference
CPU
Framework
Keras (TensorFlow 2)
Model used
ResNet50
Issue description
Operating System - Ubuntu 22.04
CPU - Intel® Core™ i7-7700K
GPU - Mesa Intel® Arc(tm) A770 Graphics (DG2)
Memory - 32 GB
I have examined the FP32 tflite model conversion process as demonstrated in this notebook: https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/tflite-to-openvino/tflite-to-openvino.ipynb
However, my tflite model is already quantized to INT8, and I aim to convert it directly into the OpenVINO IR format using the ov.convert_model() function. I anticipate that the INT8 IR format will surpass both the FP32 and FP16 models in inference speed.
Contrary to expectations, the INT8 IR format model runs slower than its FP32 and FP16 counterparts.
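The conversion path described above can be sketched as below. This is a minimal sketch, not the reporter's exact script: the function name and file paths (`model_int8.tflite`, `model_int8.xml`) are hypothetical, and it assumes the `openvino` package (2023.1+ API) is installed.

```python
# Hedged sketch: convert an already-INT8-quantized TFLite model
# straight to OpenVINO IR and compile it for CPU inference.
def convert_tflite_to_ir(tflite_path, ir_path, device="CPU"):
    import openvino as ov  # lazy import; requires the openvino package

    core = ov.Core()
    # ov.convert_model accepts a TFLite model file path directly.
    ov_model = ov.convert_model(tflite_path)
    # Serialize the IR to disk (writes the .xml/.bin pair).
    ov.save_model(ov_model, ir_path)
    return core.compile_model(ov_model, device)

# Example (hypothetical paths):
# compiled = convert_tflite_to_ir("model_int8.tflite", "model_int8.xml")
```

Note that an INT8 model is only faster when the target CPU has efficient INT8 kernels; the quantized graph otherwise adds quantize/dequantize overhead on top of the compute.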
Step-by-step reproduction
Python script for replicating the results: tflite_openvino.zip
Steps to reproduce: run the provided Python script directly; the timing comparison is printed to the console.