There are thousands of websites that can help users generate QR codes of any shape or style. They are free and convenient, but not safe in terms of data privacy. In most cases that is not a big deal (people usually create QR codes to share data anyway), but when the encoded data is personal (a Wi-Fi password, a link to a private document) a safer tool is needed.
Of course, there is plenty of software that can generate QR codes directly on the user's device, but for the average person it is too complicated and time consuming.
Our hypothesis: Fully Homomorphic Encryption can help make online QR code generation not only easy but safe as well.
Our approach is very simple (a minimal sketch follows the list):
- A default QR code image is generated on the client side (with the user's private data)
- The client encrypts the image (FHE) and sends it to the server
- The server applies style transfer (a model with FHE support) to the encrypted image and sends the result back to the client
- The client decrypts the style-transferred image
- The user receives a stylish QR code. Easy and safe
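For illustration, here is a minimal sketch of that flow, assuming the server-side model is deployed with Concrete ML's client/server API (`FHEModelClient` / `FHEModelServer`); the directory paths and the `generate_default_qr` helper are hypothetical placeholders, not part of this repo.

```python
# Sketch only: assumes a Concrete ML deployment; paths and generate_default_qr() are hypothetical.
import numpy as np
from concrete.ml.deployment import FHEModelClient, FHEModelServer

# --- client side ---
client = FHEModelClient("deployment", key_dir="keys")  # artifacts produced when the model is exported
client.generate_private_and_evaluation_keys()
eval_keys = client.get_serialized_evaluation_keys()

qr = generate_default_qr("my private data")              # hypothetical: 21x21 grayscale numpy array
encrypted_qr = client.quantize_encrypt_serialize(qr[np.newaxis, np.newaxis])

# --- server side: style transfer runs on encrypted data only ---
server = FHEModelServer("deployment")
server.load()
encrypted_styled = server.run(encrypted_qr, eval_keys)

# --- client side: decrypt the styled QR code ---
styled_qr = client.deserialize_decrypt_dequantize(encrypted_styled)
```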
The Quant AE neural network model implements a standard autoencoder architecture, but the layers are tailored to the specifics of FHE.
Model structure for 21x21 px grayscale input and 21x21 px RGB output images:
- Encoder:
  - Downsample block 1 [16 channels]
  - Downsample block 2 [32 channels]
  - Downsample block 3 [64 channels]
  - Flatten layer
- Latent dimension:
  - Hidden layer [220 neurons]
- Decoder:
  - Decoder Input [220, 441]
  - Upsample block 1 [441, 1323]
  - 2D Convertor
where:
- Downsample block contains the following layers (see the sketch after this list):
  - `brevitas.nn.QuantIdentity`
  - `brevitas.nn.QuantConv2d(stride=2, ...)`
  - `torch.nn.LeakyReLU`
- Flatten layer is a `torch.flatten` function wrapped with two `brevitas.nn.QuantIdentity` layers
- Hidden layer is a `brevitas.nn.QuantLinear` layer
- Decoder Input contains the following layers:
  - `brevitas.nn.QuantLinear`
  - `torch.nn.BatchNorm1d`
- Upsample block contains the following layers:
  - `brevitas.nn.QuantLinear`
  - `torch.nn.BatchNorm1d`
  - `torch.nn.LeakyReLU`
- 2D Convertor layer converts the input flattened tensor [b, n] to a 2D image tensor [b, 3, h, w] with the `tensor.view` method (b - batch size, h - image height, w - image width)
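For illustration, a minimal brevitas sketch of a single Downsample block is shown below; the kernel size, padding, and bit width are assumptions, since only the stride and channel counts are specified above.

```python
import torch.nn as nn
import brevitas.nn as qnn

class DownsampleBlock(nn.Module):
    """Quantized downsampling block: QuantIdentity -> QuantConv2d(stride=2) -> LeakyReLU.

    Kernel size, padding and bit width are illustrative assumptions.
    """
    def __init__(self, in_channels: int, out_channels: int, n_bits: int = 8):
        super().__init__()
        self.quant_in = qnn.QuantIdentity(return_quant_tensor=True)
        self.conv = qnn.QuantConv2d(
            in_channels, out_channels,
            kernel_size=3, stride=2, padding=1,
            weight_bit_width=n_bits,
        )
        self.act = nn.LeakyReLU()

    def forward(self, x):
        return self.act(self.conv(self.quant_in(x)))
```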
The QuantAEPruned model is the same as the Quant AE model, but with pruned weights.
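The pruning method itself is not detailed here; one common option is PyTorch's built-in L1 magnitude pruning, sketched below purely for illustration (the 50% sparsity level is an assumed value, not necessarily the one used in this project).

```python
import torch.nn.utils.prune as prune
import brevitas.nn as qnn

def prune_quant_model(model, amount: float = 0.5):
    """Apply L1 unstructured pruning to the weights of all quantized conv/linear layers.

    The sparsity level is an illustrative assumption.
    """
    for module in model.modules():
        if isinstance(module, (qnn.QuantConv2d, qnn.QuantLinear)):
            prune.l1_unstructured(module, name="weight", amount=amount)
            prune.remove(module, "weight")  # make the pruning permanent
    return model
```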
The training dataset is generated using the qrcode library. The dataset consists of two types of QR code images: "default" (black and white) and "styled".
As the target QR code "style" we chose a simple gradient transition between two colors.
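A minimal sketch of how a "default"/"styled" training pair could be produced with the qrcode library and Pillow; the gradient colors and the horizontal direction of the gradient are assumptions.

```python
import numpy as np
import qrcode
from PIL import Image

def make_default_qr(data: str) -> Image.Image:
    """Version-1 QR code: 21x21 modules, 1 px per module, no border -> 21x21 px grayscale image."""
    qr = qrcode.QRCode(
        version=1,
        error_correction=qrcode.constants.ERROR_CORRECT_L,  # low EC so short strings fit in version 1
        box_size=1,
        border=0,
    )
    qr.add_data(data)
    qr.make(fit=False)
    return qr.make_image(fill_color="black", back_color="white").convert("L")

def make_styled_qr(default_img: Image.Image,
                   c0=(255, 80, 80), c1=(80, 80, 255)) -> Image.Image:
    """Replace black modules with a horizontal gradient between two (assumed) colors."""
    arr = np.array(default_img)                     # [21, 21], values 0..255
    h, w = arr.shape
    t = np.linspace(0.0, 1.0, w)[None, :, None]     # gradient along the x axis
    gradient = (1 - t) * np.array(c0) + t * np.array(c1)
    styled = np.full((h, w, 3), 255, dtype=np.float64)
    styled[arr < 128] = np.broadcast_to(gradient, (h, w, 3))[arr < 128]
    return Image.fromarray(styled.astype(np.uint8))
```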
After applying style transfer the QR code image becomes unparseable. To make the image a valid QR code again, we use the following algorithm at the postprocessing stage (client side):
- The client generates a "default" QR code image (with the user data) and uses it as a correction mask
- Each pixel of the "styled" QR code (generated by the server) is compared with the corresponding pixel from the correction mask. The purpose of the comparison is to determine whether both pixels are interpreted as the "same" (white or black) by a virtual QR code reader.
- All mismatched pixels in the "styled" QR code image are corrected
Pixel comparison algorithm:
- Convert both pixels to grayscale
- Determine whether the mask pixel is white or black (white if its value > 125, black otherwise)
- Determine whether the pixel in the stylized image is white or black:
  - A pixel is considered white if its value is greater than a white threshold (190)
  - A pixel is considered black if its value is less than a black threshold (100)
Stylized image pixel correction algorithm (both algorithms are sketched below):
- Convert the pixel from the RGB to the LAB color space
- Change the "L" channel value in the corresponding direction by a fixed step (we only adjust the brightness)
- Convert the corrected pixel back to the RGB color space and compare it with the mask pixel (see "Pixel comparison algorithm" above)
- Repeat the correction if required
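Below is a compact sketch of both the comparison and the correction steps, using scikit-image for the RGB to LAB conversion; the step size on the L channel and the iteration cap are assumptions, while the thresholds are the ones listed above.

```python
import numpy as np
from skimage.color import rgb2lab, lab2rgb

MASK_THRESHOLD = 125   # mask pixel counts as white above this gray value
WHITE_THRESHOLD = 190  # styled pixel counts as white above this gray value
BLACK_THRESHOLD = 100  # styled pixel counts as black below this gray value
L_STEP = 5.0           # assumed fixed step for the LAB "L" channel

def to_gray(rgb):
    # ITU-R 601 luma, the same weighting PIL uses for "L" mode
    return 0.299 * rgb[0] + 0.587 * rgb[1] + 0.114 * rgb[2]

def pixels_match(mask_gray: float, styled_rgb) -> bool:
    """Pixel comparison: do the mask pixel and the styled pixel read as the same color?"""
    gray = to_gray(styled_rgb)
    if mask_gray > MASK_THRESHOLD:   # mask pixel is white
        return gray > WHITE_THRESHOLD
    return gray < BLACK_THRESHOLD    # mask pixel is black

def correct_pixel(mask_gray: float, styled_rgb, max_steps: int = 20):
    """Nudge the LAB lightness towards the mask color until the pixels match."""
    rgb = np.asarray(styled_rgb, dtype=np.float64) / 255.0
    direction = 1.0 if mask_gray > MASK_THRESHOLD else -1.0  # brighten or darken
    for _ in range(max_steps):
        if pixels_match(mask_gray, rgb * 255.0):
            break
        lab = rgb2lab(rgb.reshape(1, 1, 3))
        lab[0, 0, 0] = np.clip(lab[0, 0, 0] + direction * L_STEP, 0.0, 100.0)
        rgb = lab2rgb(lab).reshape(3)
    return (rgb * 255.0).astype(np.uint8)
```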
| Metric | Meaning |
|---|---|
| Inference time | Time (in seconds) that the model took to generate a QR code |
| Readability | The percentage of generated QR codes (with correction) that were successfully parsed by the pyzbar library |
| Reference diff | For every "default" QR code image in the test dataset there is a reference image (a "styled" QR code) that shows how the inference result should look. "Reference diff" is the average difference between the reference image's and the inference image's pixels. The lower the value, the better the generation quality |
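A sketch of how these two metrics could be computed; the function names are ours, but pyzbar is the decoder mentioned above.

```python
import numpy as np
from PIL import Image
from pyzbar.pyzbar import decode

def readability(images: list[Image.Image]) -> float:
    """Fraction of corrected QR code images that pyzbar can actually parse."""
    parsed = sum(1 for img in images if decode(img))
    return parsed / len(images)

def reference_diff(generated: np.ndarray, reference: np.ndarray) -> float:
    """Average per-pixel difference between the generated and the reference "styled" image,
    normalized to [0, 1]. Lower is better."""
    gen = generated.astype(np.float64) / 255.0
    ref = reference.astype(np.float64) / 255.0
    return float(np.mean(np.abs(gen - ref)))
```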
For comparison, we tried five options:
- QuantAE model (non FHE)
- Compiled QuantAE model in "simulate mode"
- Compiled QuantAE model in "execute mode"
- Compiled QuantAEPruned model in "simulate mode"
- Compiled QuantAEPruned model in "execute mode"
"QuantAE (Non-FHE)" | "QuantAE (Sim)" | "QuantAEPruned (Sim)" | "QuantAE (FHE)" | "QuantAEPruned (FHE)" | |
---|---|---|---|---|---|
Inference time | 0.011946 | 0.010590 | .010618 | 39.506345 | 38.536664 |
Readability | 0.833333 | 0.816667 | 0.701667 | 0.781667 | 0.683333 |
Reference diff | 0.095926 | 0.127455 | 0.220868 | 0.136675 | 0.260439 |
*Average results over several runs on a dataset with 4000 QR code images
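For reference, a sketch of how the "simulate" and "execute" modes might be invoked, assuming Concrete ML's `compile_brevitas_qat_model` API; the calibration data and the `QuantAE` instance are placeholders for objects defined elsewhere in this repo.

```python
import numpy as np
from concrete.ml.torch.compile import compile_brevitas_qat_model

# Placeholders: a representative calibration batch of "default" QR images and the trained model.
calib_set = np.random.rand(100, 1, 21, 21).astype(np.float32)
model = QuantAE()  # assumed to be defined and trained elsewhere (model_training.ipynb)

quantized_module = compile_brevitas_qat_model(model, calib_set)

x = calib_set[:1]
out_sim = quantized_module.forward(x, fhe="simulate")  # fast simulation, no encryption
out_fhe = quantized_module.forward(x, fhe="execute")   # real encrypted execution
```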
- QuantAE is better than QuantAEPruned in terms of "Readability" and "Reference diff"; the "Inference time" is almost the same.
- After FHE compilation the QuantAE model loses very little quality overall.
- Unfortunately, even on small images (21x21 px) our models' inference time is too long: about 40 seconds (32 CPUs) and around 5-6 minutes on weaker setups (4-8 CPUs).
We implemented a "standard" NST based on the VGG19 network and backpropagation: VGG19 is used as a feature extractor to compute content and style losses, and the input image (noise or a copy of the content image) is iteratively adjusted to reduce those losses. We then implemented a quantized version of the VGG19 feature extractor (only the convolutional layers, without the classifier) based on the brevitas library.
Problem
Iteratively adjusting the input image requires support for backpropagation, i.e. "gradient" functionality on the input image tensor.
The compiled FHE model does not support gradients on the input data, since it converts the input to a numpy array and does not update the input torch tensor.
As a result, this approach is not suitable for NST under FHE.
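For context, the core of a standard (non-FHE) NST loop looks roughly like this; the `requires_grad_(True)` on the input image is exactly what the FHE-compiled model cannot provide. A content-only loss is shown for brevity.

```python
import torch
from torchvision.models import vgg19, VGG19_Weights

# Frozen VGG19 feature extractor (convolutional part only)
features = vgg19(weights=VGG19_Weights.DEFAULT).features.eval()
for p in features.parameters():
    p.requires_grad_(False)

content = torch.rand(1, 3, 224, 224)          # placeholder content image
image = content.clone().requires_grad_(True)  # the *input image* is the optimized variable
optimizer = torch.optim.Adam([image], lr=0.01)

for _ in range(100):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(features(image), features(content))
    loss.backward()   # needs gradients w.r.t. the input image; impossible once the
    optimizer.step()  # FHE-compiled model consumes numpy arrays instead of torch tensors
```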
We implemented a model based on the standard U-Net architecture, but encountered several issues when using it for FHE:
- Using skip connections (`torch.cat`) significantly increases FHE compilation time for large images (> 100 px)
- Using `brevitas.nn.QuantMaxPool2d` Maxpool layers results in infinite compilation time
- The brevitas library does not provide analogs of the `torch.nn.Upsample` and `torch.nn.ConvTranspose2d` layers.
To solve the problem described in point 3 above and to support quantization functionality, we created our own implementation of a "Convolution Transpose" layer - `QuantUpsample2Nearest`.
The upsample-to-nearest algorithm uses the following operations (see the sketch after this list):
- Creating special double diagonal matrices (Dw, Dh)
- `torch.matmul(input_data, Dw)` to duplicate input tensor values along one dimension
- `tensor.t()` - we had to use the transpose because the FHE compiler does not support `torch.matmul` with a right operand of 2+ dimensions (i.e. `torch.matmul(Dh, input_data)`)
- `torch.reshape`
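A minimal sketch of the idea for a single [h, w] matrix (the real layer also has to handle the batch and channel dimensions):

```python
import torch

def double_diagonal(n: int) -> torch.Tensor:
    # D[i, 2i] = D[i, 2i + 1] = 1, so X @ D duplicates every column of X.
    d = torch.zeros(n, 2 * n)
    idx = torch.arange(n)
    d[idx, 2 * idx] = 1.0
    d[idx, 2 * idx + 1] = 1.0
    return d

def upsample_nearest_2x(x: torch.Tensor) -> torch.Tensor:
    """Nearest-neighbour 2x upsampling of a [h, w] tensor using only matmul and transpose."""
    h, w = x.shape
    dw, dh = double_diagonal(w), double_diagonal(h)
    y = torch.matmul(x, dw)      # [h, 2w]: duplicate columns
    y = torch.matmul(y.t(), dh)  # [2w, 2h]: duplicate rows while working on the transpose
    return y.t()                 # back to [2h, 2w]
```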
Problem
Using multiple `tensor.t()` operations (with `torch.matmul`) results in infinite FHE compilation time.
Unfortunately, we were unable to compile the QuantAE model with n_bits less than 8. Therefore, we did not use this parameter in the model comparison.
This research is a good first step toward implementing a QR code generator with FHE support.
Next steps:
- Optimize the model for lower inference time. Lower inference time -> larger images to generate -> support for more sophisticated styles
- Try different models
- Improve the QR code correction algorithm - an 80% success rate is low
- Build the Docker container using the files from /docker
- Clone this repo
- Run model_training.ipynb (change the constants if needed)
- Run main.ipynb (change the constants if needed)
Try out our model on HuggingFace