Ad/remove api label #8

Open
wants to merge 357 commits into
base: develop

Conversation

AlexanderDokuchaev
Owner

Changes

Reason for changes

Related tickets

Tests

l-bat and others added 30 commits May 14, 2024 14:46
### Changes

<!--- What was changed (briefly), how to reproduce (if applicable), what
the reviewers should focus on -->

### Reason for changes

Fix bug in `_get_ratio_defining_params` method

### Tests

test_shared_gather_all_layers
### Changes

Bump version of black to 24.4.2

### Reason for changes

Improve formatting
### Changes

Remove openvino-dev dependency

### Reason for changes

Remove potentially deprecated package

### Related tickets

136314

### Tests

N/A
### Changes

tensorflow-metadata 1.13.0

### Reason for changes

```
tensorflow-metadata 1.14.0 requires protobuf<4.21,>=3.20.3, but you have protobuf 3.19.6 which is incompatible.
```

### Related tickets

140478
### Changes

Small fix: added scale_estimation flag to IR metadata.

### Reason for changes

Small fix: added scale_estimation flag to IR metadata.

### Related tickets
### Tests
### Changes

<!--- What was changed (briefly), how to reproduce (if applicable), what
the reviewers should focus on -->

### Reason for changes

<!--- Why should the change be applied -->

### Related tickets

<!--- Post the numerical ID of the ticket, if available -->

### Tests

<!--- How was the correctness of changes tested and whether new tests
were added -->
…ad (openvinotoolkit#2685)

### Changes

Resnet18 examples updated to showcase NNCF Torch save/load capabilities

### Reason for changes

To showcase NNCF Torch save/load capabilities

### Related tickets

129586

### Tests

test_examples/382/
### Changes

Add functions for experimental tensor:
- concatenate
- logical_or
- masked_mean
- masked_median
- median
- percentile

---------

Co-authored-by: Daniil Lyakhov <[email protected]>
…nvinotoolkit#2691)

### Changes

Add log message for when no matches were found for AWQ algorithm

### Reason for changes

Not obvious enough when AWQ was actually applied.
### Changes

Bump torch to 2.3.0

### Related tickets

141679
### Changes

- ssd512_vgg_voc_magnitude_sparsity_int8

### Reason for changes

Unstable result on CUDA
…t#2692)

### Changes

Search for an Add node among all children to determine whether
the node has a bias.

### Reason for changes

The Add node is not necessarily at index 0 of the node's children list.
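
A minimal sketch of the idea, using a hypothetical adjacency-list graph rather than NNCF's real graph classes: every child is checked for an Add operation instead of only the first one. The names `has_bias`, `children`, and `op_types` are assumptions made for illustration and do not come from the NNCF code base.

```python
from typing import Dict, List


def has_bias(node: str, children: Dict[str, List[str]], op_types: Dict[str, str]) -> bool:
    """Return True if any direct child of `node` is an Add operation."""
    return any(op_types[child] == "Add" for child in children.get(node, []))


# Hypothetical graph: the Add node is the second child of the MatMul node,
# so a check limited to children[0] would miss it.
children = {"matmul": ["shape_of", "add"], "shape_of": [], "add": []}
op_types = {"matmul": "MatMul", "shape_of": "ShapeOf", "add": "Add"}

matmul_has_bias = has_bias("matmul", children, op_types)
add_has_bias = has_bias("add", children, op_types)
```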

### Related tickets

141885

### Tests

TBD
### Changes

Extended AWQ algorithms for patterns Act->MatMul and
Act->Multiply->MatMul with insertion for extra scales after activation.

### Reason for changes

Support AWQ for a wider family of LLMs

### Related tickets

CVS-141131

### Tests

Added unit tests
### Changes

* NNCF wrapping for Tensor input parameters is skipped when actual
parameter is not a tensor


### Reason for changes

* To support models that can accept inputs of different types
(example:
[YOLOv8](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/nn/tasks.py#L77-L89)):
```
def forward(self, x, *args, **kwargs):
    if isinstance(x, dict):  # for cases of training and validating while training.
        return self.loss(x, *args, **kwargs)
    return self.predict(x, *args, **kwargs)
```
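
The behaviour described above can be sketched as follows. This is an illustration only: `FakeTensor`, `WrappedTensor`, and `wrap_input` are hypothetical stand-ins, not NNCF or PyTorch classes. The point is that a non-tensor argument, such as the `dict` batch in the example, passes through unchanged.

```python
from typing import Any


class FakeTensor:
    """Stand-in for a framework tensor type (hypothetical)."""

    def __init__(self, data):
        self.data = data


class WrappedTensor(FakeTensor):
    """Marks a tensor that went through the (hypothetical) input wrapping."""


def wrap_input(value: Any) -> Any:
    # Only tensors are wrapped; dicts, strings, etc. are returned untouched.
    if isinstance(value, FakeTensor):
        return WrappedTensor(value.data)
    return value


batch = {"img": [0.0, 1.0], "cls": [3]}
wrapped_batch = wrap_input(batch)
wrapped_tensor = wrap_input(FakeTensor([1.0]))
```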

### Related tickets

138682

### Tests

* tests/torch/test_input_management.py is extended
### Changes

Added support for bf16 in `nncf.compress_weights`

### Reason for changes

Support models with bf16 weights and fp32 or fp16 inference precision.

### Related tickets

ref: 141582

### Tests

<!--- How was the correctness of changes tested and whether new tests
were added -->


tests/openvino/native/quantization/test_weights_compression.py:test_compression_for_different_dtypes
### Changes

Removed `compress_to_fp16=False` from `save_model` functions in
examples.

### Reason for changes

Using this parameter when saving an OpenVINO model is no longer required

### Tests

Changes were tested via pytest, using the example tests in the project. While
testing, some errors occurred, but all of them were caused by incorrect
development environment settings. Setting up the environment took most
of the time.
### Changes

Add new test references for OpenVINO 2024.2

### Reason for changes

New OpenVINO release.

### Related tickets

N/A

### Tests

N/A
### Changes

- Unpatch torch during `torch.compile()` call for vanilla PyTorch model
- Raise ValueError if `torch.compile()` is called for NNCF-optimized
model
- Unpatch torch during forward call of compiled model

### Reason for changes

PyTorch dynamo compilation conflicts with nncf patching of PyTorch. This
results in errors during compiled model forward, even if the model was
not quantized, i.e. just import of `nncf.torch` results in failure.
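
A minimal sketch of the "unpatch during a call" pattern, assuming a patched function that lives on some namespace. The `framework` namespace, `original_op`, `patched_op`, and the `unpatched` helper are all hypothetical; they only illustrate temporarily restoring the original function and re-applying the patch afterwards, and are not the NNCF implementation.

```python
import contextlib
import types


def original_op(x):
    return x * 2


def patched_op(x):
    # Stand-in for a tracing wrapper installed by a patching library.
    return original_op(x)


# Hypothetical "framework" namespace whose function has been patched.
framework = types.SimpleNamespace(op=patched_op)


@contextlib.contextmanager
def unpatched(namespace, name, original):
    """Temporarily restore `original`, then re-apply whatever was installed."""
    current = getattr(namespace, name)
    setattr(namespace, name, original)
    try:
        yield
    finally:
        # The patch is restored even if the wrapped call raises.
        setattr(namespace, name, current)


with unpatched(framework, "op", original_op):
    inside_is_original = framework.op is original_op
after_is_patched = framework.op is patched_op
```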

### Related tickets

140265

### Tests

Added test for `torch.compile` compatibility with `nncf`
### Changes

Remove dependency on typing_extensions

### Reason for changes

Import fails with `ImportError` on python>=3.9.

nncf.common depends on the typing_extensions package:

https://github.com/openvinotoolkit/nncf/blob/47187e5bbaef678508fdfdc980b258fc4e32d3db/nncf/common/accuracy_aware_training/training_loop.py#L22-L23

With python3.8, typing_extensions is installed as a dependency of the rich module, but
rich requires it only on `python<3.9`
```
Collecting typing-extensions<5.0,>=4.0.0 (from rich>=13.5.2->nncf==2.11.0.dev0+1406e7f)
```

https://github.com/Textualize/rich/blob/349042fd8912ab5f0714ff9a46a70ef8a4be4700/pyproject.toml#L31

### Tests
tests/cross_fw/install (python 3.10)
### Changes

Update BKC for PyTorch

### Reason for changes

openvinotoolkit#2690
### Changes

- Fixed SmoothQuant algorithm to work with Split ports correctly;

### Reason for changes

- Bugfix

### Related tickets

- 140351

### Tests
### Changes

Remove extra dependencies to install backends from setup.py
Add [plots] extra dependencies to install optional packages to visualize
plots.

### Related tickets

134503

---------

Co-authored-by: Alexander Suslov <[email protected]>
### Changes

Remove duplicated ref graphs.

### Reason for changes

N/A

### Related tickets

N/A

### Tests

N/A
…olkit#2701)

### Changes

One of the subgraph inputs can be safely removed from ignored_scope as
it does not affect the resulting quantized model.

### Reason for changes

Fix the incorrect name on a new OV release version.

### Related tickets

142369

### Tests

N/A
### Changes

Fix name of backend

### Reason for changes

```
>   packages = [item for b in backends for item in MAP_BACKEND_PACKAGES[b]]
E   KeyError: 'tensorflow'
```


### Tests


tests.cross_fw.examples.test_examples.test_examples[post_training_quantization_tensorflow_mobilenet_v2]
### Changes

Remove instructions to install OMZ model_tools from Makefile

### Reason for changes

Remove an unnecessary dependency

### Related tickets

136314

### Tests

N/A
### Changes

* Documentation structure is updated
* QAT after PTQ documentation is present
### Reason for changes

* To align documentation with the latest changes in torch QAT

### Related tickets

129586

---------

Co-authored-by: Alexander Suslov <[email protected]>
### Changes

Repo link is used instead of a URL in docs

### Reason for changes

To fix links in documentation
### Changes

- Layer-wise engine implementation for OpenVINO backend
- GPTQ algorithm implementation for OpenVINO backend

### Reason for changes

Improving the weight compression workflow.

- gptq=False

Model | Config | Task | Perplexity | Acc | Time (sec)
-- | -- | -- | -- | -- | --
facebook/opt-125m | mode=nncf.CompressWeightsMode.INT4_SYM, gptq=False, group_size=128 | lambada_openai | 40.9560 | 0.3157 | 8.95
TinyLlama/TinyLlama-1.1B-Chat-v1.0 | mode=nncf.CompressWeightsMode.INT4_SYM, gptq=False, group_size=128 | lambada_openai | 7.1380 | 0.5927 | 60.38

- gptq=True

Model | Config | Task | Perplexity | Acc | Time (sec)
-- | -- | -- | -- | -- | --
facebook/opt-125m | mode=nncf.CompressWeightsMode.INT4_SYM, gptq=True, group_size=128 | lambada_openai | 29.8427 | 0.3623 | 248.31
TinyLlama/TinyLlama-1.1B-Chat-v1.0 | mode=nncf.CompressWeightsMode.INT4_SYM, gptq=True, group_size=128 | lambada_openai | 6.8953 | 0.5911 | 2040.66

### Related tickets

ref: 126887

### Tests

tests/common/quantization/test_layerwise_scheduler.py
tests/shared/test_templates/template_test_nncf_tensor.py
tests/openvino/native/test_model_transformer.py
tests/openvino/native/test_layerwise.py
tests/openvino/native/quantization/test_gptq.py
tests/openvino/native/quantization/test_weights_compression.py
tests/torch/ptq/test_weights_compression.py

### Builds
post_training_weight_compression 64
AlexanderDokuchaev and others added 14 commits October 16, 2024 08:54
### Changes

- Use `if: ${{ !cancelled() }}` to upload artifact for failed job
- Make it possible to manually run the workflow on a branch

### Reason for changes

`continue-on-error: true` makes a failed job appear green.
…plate and backend specific tests (openvinotoolkit#3004)

### Changes

- Added simple Depthwise and Transpose Convolution models in
`tests/cross_fw/test_templates/helpers.py`.

- Updated `map_references` for Torch FX backend to assign the right
reference node names for model classes Depthwise and Transpose
Convolutions.

- Added `ONNXConvolutionTransposeMetatype` into the list of
OPERATIONS_WITH_BIAS for ONNX.

- Added the missing target for Transpose Conv in `transformations.py`
for Torch FX in the function `_is_conv`.

- Added `OVConvolutionBackpropDataMetatype` into the list of
OPERATIONS_WITH_BIAS for OpenVINO backend

- Replaced the unet graph in quantized reference graphs for FX backend.

### Extra Changes

- [X] Update and finalize the right changes to make in
`tests/openvino/native/test_bias_correction.py` for Transpose
Convolution node name and accommodate for the changes in the name after
each run.

### Closes issue

openvinotoolkit#2916

---------

Co-authored-by: dlyakhov <[email protected]>
### Changes

The tables and images have been added to illustrate the trade-off
between accuracy and footprint for the INT4_ASYM mode.

A script has been created to automate the process of generating this
visualization from a CSV file containing all the necessary raw data.
The script calculates the compression rate and the average relative
error for a given model size and metrics.
It should reduce the likelihood of errors and simplify the maintenance
of results.
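
As a rough illustration of the two quantities, here is a sketch under stated assumptions: the compression rate is taken to be the ratio of original to compressed model size, and the average relative error is the mean of `|compressed - reference| / |reference|` over the metrics. The actual script may define them differently; the function names and all numbers below are invented for illustration.

```python
from typing import Sequence


def compression_rate(original_size_gb: float, compressed_size_gb: float) -> float:
    """Assumed definition: how many times smaller the compressed model is."""
    return original_size_gb / compressed_size_gb


def average_relative_error(reference: Sequence[float], compressed: Sequence[float]) -> float:
    """Assumed definition: mean of per-metric relative deviations from the reference."""
    errors = [abs(c - r) / abs(r) for r, c in zip(reference, compressed)]
    return sum(errors) / len(errors)


# Made-up numbers, for illustration only.
rate = compression_rate(16.0, 4.0)
error = average_relative_error([10.0, 0.5], [11.0, 0.45])
```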

### Reason for changes

INT4_ASYM is a more accurate and preferable mode for weight compression.
The previous results were obtained using the INT4_SYM mode. 

### Related tickets

n/a

### Tests

tests/tools/test_compression_visualization.py
### Changes

Model transformation tests are presented

### Reason for changes

To cover TorchFX model transformations by tests

### Related tickets

openvinotoolkit#2775 

### Tests

* test_model_insertion_transformation
* test_constant_update_transformation
* test_constant_update_transformation_no_constant
* TestQDQInsertion
* test_node_removal_transformation
### Changes

Added an error message

### Reason for changes

Send a warning message to avoid inconsistencies that arise when the dataset
size is less than the provided or default `subset_size`.
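
A minimal sketch of such a check, using only the standard `warnings` module. The function name `check_subset_size`, the message text, and the choice to return the usable sample count are assumptions for illustration, not the code added in this change.

```python
import warnings
from typing import Sized


def check_subset_size(dataset: Sized, subset_size: int) -> int:
    """Warn if fewer samples are available than requested; return the usable count."""
    available = len(dataset)
    if available < subset_size:
        warnings.warn(
            f"Dataset has {available} samples, fewer than subset_size={subset_size}; "
            "only the available samples will be used.",
            stacklevel=2,
        )
        return available
    return subset_size


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    used_small = check_subset_size([1, 2, 3], subset_size=300)  # triggers the warning
    used_large = check_subset_size(list(range(500)), subset_size=300)  # no warning
```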

### Related tickets

Closes: openvinotoolkit#2562

I had an inquiry:
I noticed that subset_size is sometimes put as 100, or 300, or specified
in the advanced parameters. Should a default be used here, or could you
point me to where I can find the correct subset_size to be imported?

---------

Co-authored-by: Liubov Talamanova <[email protected]>
Fixes after openvinotoolkit#3003 .

### Changes

1. Convert raw activations to WC statistics for GPTQ + SE scenario.
2. Allow 2D tensor inputs for data-aware mixed precision. 2D activations
arise in `opt`-like models, e.g. `opt-125m`. There, LayerNorm reshapes
activations from [B, L, D] to [B*L, D].
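
The shape change in item 2 can be sketched with plain nested lists; the helper name `flatten_batch` is hypothetical. With a NumPy array `x`, the same result would come from `x.reshape(-1, x.shape[-1])`.

```python
from typing import List


def flatten_batch(activations: List[List[List[float]]]) -> List[List[float]]:
    """Collapse [B, L, D] nested lists into [B*L, D], keeping token order."""
    return [token for sequence in activations for token in sequence]


# B=2 sequences, L=3 tokens each, D=4 features per token.
B, L, D = 2, 3, 4
activations = [[[float(b * 100 + l * 10 + d) for d in range(D)] for l in range(L)] for b in range(B)]
flat = flatten_batch(activations)
```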

### Tests

1. Added a unit test for GPTQ + SE.
2. Modified a test for 2D activations and mixed precision.
3. Compressed tiny-llama to int4_asym with SQ + GPTQ before openvinotoolkit#3003 and
for this PR. Got the same PPL value of 15.739704794594019 .
### Changes

- Fix inputs collection for specific cases.

### Reason for changes

- Algorithm reliability.

### Related tickets

- 148633

### Tests

- Manual
### Changes

- Drop support for python3.8
- Support numpy2
- Use onnx==1.16.2 onnxruntime==1.19.2 for tests (older versions do not
support numpy2)
- Use forked open_model_zoo with bumped limit of numpy version
- Set `numpy<2` for some examples that do not support numpy2
### Related tickets

138867
151791
### Changes

- Added telemetry for the `nncf.compress_weights` backend-specific
`impl`;
- Added the missing wrapper for the `nncf.quantize` PyTorch `impl`;
- Fixed the reported event for the `nncf.quantize_with_accuracy_control`
OpenVINO `impl`;
- Added the missing `app_name`, `app_version` fields for each event.

### Reason for changes

- Telemetry update.

### Related tickets

- 154833

### Tests

- TBD
### Changes

Transformation for removing fake quantize nodes and saving all weights
to disk in int8 format after quantization. It works as follows:
1. Reshape the scale if qdq operation is per-channel.
2. Pattern match the quantize-dequantize nodes.
3. Filter the matches to only include quantize-dequantize ops with
constant input.
4. Replace with the multiplication of the scale and input.
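
The numeric effect of the final step can be sketched in plain Python. This assumes symmetric per-channel quantization (no zero point) and uses hypothetical helper names; it shows only the stored int8 weights and their decompression by multiplication with the scale, not the FX graph pattern matching.

```python
from typing import List, Tuple


def compress_per_channel(weight: List[List[float]]) -> Tuple[List[List[int]], List[float]]:
    """Quantize each output channel (row) to int8 with its own scale."""
    q_weight, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        scale = max_abs / 127 if max_abs > 0 else 1.0
        # Round to the nearest integer and clamp to the int8 range.
        q_weight.append([max(-128, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_weight, scales


def decompress(q_weight: List[List[int]], scales: List[float]) -> List[List[float]]:
    """Element-wise multiplication of the int8 values by the per-channel scale."""
    return [[q * s for q in row] for row, s in zip(q_weight, scales)]


weight = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
q_weight, scales = compress_per_channel(weight)
restored = decompress(q_weight, scales)
```

Each restored value differs from the original by at most half a quantization step of its channel, which is the kind of property the data type and value checks in the test below the description are meant to cover.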

### Reason for changes

To compress the model after quantization

### Tests

Add `test_post_quantization_compression()` in
`tests/torch/fx/test_model_transformer.py` which checks the data type of
all weights in the model after applying quantization and also checks the
value after the decompression step (element-wise multiplication
operation).

### Tickets
openvinotoolkit#2766

---------

Co-authored-by: Daniil Lyakhov <[email protected]>
### Changes

Add `subset_size` to PTQ samples after merging
openvinotoolkit#2995 to remove the warning
message
CI: https://github.com/openvinotoolkit/nncf/actions/runs/11436915496
### Changes

* Resnet18 TorchFX example

### Reason for changes

* To showcase NNCF TorchFX quantization

### Related tickets

openvinotoolkit#2766 

### Tests

test_examples/544/ - Done
### Changes

- setup pythonpath for pytests
- support numpy2 for unpickler
- bump datasets to support numpy2 
- fix ROOT_PYTHONPATH_ENV that was always None