Slightly different Box down-sampling result #8587

Open
AlttiRi opened this issue Dec 7, 2024 · 10 comments

Comments

@AlttiRi

AlttiRi commented Dec 7, 2024

What did you do?

I have written an image down-scaler in JavaScript using the Box algorithm. I compared its result with the result of resampling with PIL's resize using the Image.Resampling.BOX option. However, the results are slightly different.

Here is my TypeScript code:

type SingleChannelImageData = { data: Uint8Array; width: number; height: number; };

export function scaleDownLinearAverage(from: SingleChannelImageData, to: SingleChannelImageData) {
    const {data: orig, width, height} = from;
    const {data: dest, width: newWidth, height: newHeight} = to;
    const xScale = width  / newWidth;
    const yScale = height / newHeight;
    for (let newY = 0; newY < newHeight; newY++) {
        for (let newX = 0; newX < newWidth; newX++) {
            const fromY = yScale * newY       + 0.5 << 0;
            const fromX = xScale * newX       + 0.5 << 0;
            const toY   = yScale * (newY + 1) + 0.5 << 0;
            const toX   = xScale * (newX + 1) + 0.5 << 0;
            const count = (toY - fromY) * (toX - fromX);
            let value = 0;
            for (let y = fromY; y < toY; y++) {
                for (let x = fromX; x < toX; x++) {
                    value += orig[y * width + x];
                }
            }
            dest[newY * newWidth + newX] = value / count + 0.5 << 0;
        }
    }
}

When PIL works as expected

To make things simpler, let's use 1D (1 pixel height) gray-scaled (1 channel) images.

In most cases PIL produces the expected result, for example:

  • 4 pixels to 2 pixels: [255, 0, 255, 0] -> [(255 + 0) / 2, (0 + 255) / 2] -> [127.5, 127.5] -> rounding (+ 0.5 << 0) -> [128, 128]
  • 5 pixels to 2 pixels: [255, 0, 255, 0, 255] -> [(255 + 0 + 255) / 3, (0 + 255) / 2] -> [170, 127.5] -> [170, 128]

The same holds for the simplest transform, from a 1D gray image to a 1x1 image:

  • 8 pixels to 1 pixel: [1,2,3,4,5,6,7,8] ->[(1+2+3+4+5+6+7+8) / 8] -> [4.5] -> [5]
  • 9 pixels to 1 pixel: [0,0,0,0, 0,0,0,0, 220] ->[220 / 9] -> [24.444444444444443] -> [24]
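The 1D examples above can be reproduced with a minimal pure-Python sketch of the same averaging rule. The function name `box_downscale_1d` is hypothetical, and the rounding is done in integer arithmetic (round half up) rather than with `+ 0.5 << 0`, which is an assumption made to avoid floating-point ties; it is meant for down-scaling only.

```python
def box_downscale_1d(pixels, new_len):
    """Average consecutive groups of pixels, rounding halves up."""
    scale = len(pixels) / new_len
    out = []
    for i in range(new_len):
        # Same window boundaries as the TypeScript code: round(scale * i).
        start = int(scale * i + 0.5)
        end = int(scale * (i + 1) + 0.5)
        total = sum(pixels[start:end])
        count = end - start
        # Integer round-half-up of total / count.
        out.append((2 * total + count) // (2 * count))
    return out


print(box_downscale_1d([255, 0, 255, 0], 2))              # -> [128, 128]
print(box_downscale_1d([255, 0, 255, 0, 255], 2))         # -> [170, 128]
print(box_downscale_1d([1, 2, 3, 4, 5, 6, 7, 8], 1))      # -> [5]
print(box_downscale_1d([0, 0, 0, 0, 0, 0, 0, 0, 220], 1)) # -> [24]
```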

What actually happened?

However, when the group/box/area size is 10, 12, 14, 20, 22, 26, 30, 36, 38, 42, ... pixels, PIL produces an unexpected result: it rounds .5 down.

For example:

  • 10 pixels to 1 pixel: [0,0,0,0, 0,0,0,0, 0,255] -> [25.5] -> [25]
from PIL import Image

pixel_values = [0,0,0,0, 0,0,0,0, 0,255]
image = Image.new("L", (len(pixel_values), 1))
image.putdata(pixel_values)
image = image.resize((1, 1), Image.Resampling.BOX)
pixel = image.getdata()[0]
avg = sum(pixel_values) / len(pixel_values)
avg_round = int(avg + 0.5)
print(avg, avg_round, pixel, avg_round == pixel)

It prints:

25.5 26 25 False

What did you expect to happen?

Rounding 25.5 should give 26, as with Math.round(25.5), 25.5 + 0.5 << 0, or int(25.5 + 0.5).
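The three spellings above all round halves up for non-negative values. A small Python sketch (the helper name `round_half_up` is hypothetical) contrasts that rule with Python's built-in `round`, which rounds halves to even, and with plain truncation:

```python
import math


def round_half_up(x):
    """Round to the nearest integer, with .5 always going up."""
    return math.floor(x + 0.5)


for x in (24.5, 25.5):
    # columns: value, round-half-even, round-half-up, truncation
    print(x, round(x), round_half_up(x), int(x))
# 24.5 24 25 24
# 25.5 26 26 25
```

Since 25.5 rounds to 26 under both half-up and half-to-even, the 25 reported for PIL matches neither rounding rule.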


Here is more complex example:

from PIL import Image

# Note: with this 16-bit range only v3 and v4 vary; v1 and v2 are always 0.
for uint in range(0xFFFF + 1):
    v1 = (uint >> 24) & 0xFF
    v2 = (uint >> 16) & 0xFF
    v3 = (uint >>  8) & 0xFF
    v4 = (uint >>  0) & 0xFF
    pixel_values   = [v1, v2, v3, v4,  0, 0, 0, 0,  0, 0, ] # 10
    # pixel_values = [v1, v2, v3, v4,  0, 0, 0, 0,  0, 0, 0, 0,] # 12
    # pixel_values = [v1, v2, v3, v4,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0,] # 14
    # pixel_values = [v1, v2, v3, v4,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,] # 20
    # pixel_values = [v1, v2, v3, v4,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0,] # 22

    # print(pixel_values)

    width = len(pixel_values)
    image = Image.new("L", (width, 1))
    image.putdata(pixel_values)
    image = image.resize((1, 1), Image.Resampling.BOX)

    pixel = image.getdata()[0]
    avg = sum(pixel_values) / width
    # avg_round = round(avg)   # Python 3's "round half to even" or "banker's rounding"
    avg_round = int(avg + 0.5) # "round" like it is in other languages (0.5 -> 1)

    # "fix"
    # magic_numbers = {10, 12, 14, 20, 22, 26, 30, 36, 38, 42, ... }
    # if width in magic_numbers and avg % 1 == 0.5:
    #     avg_round = avg_round - 1

    if avg_round != pixel:
        print(  "avg_round", avg_round, f"({sum(pixel_values)} / {width} = {avg})",
              "  pil_pixel", pixel,
              "  pixels",    pixel_values,
              )

Try adding or removing zeros in the pixel_values array to change its length; you will see that there are "magic" array sizes for which PIL behaves strangely when rounding .5.

What are your OS, Python and Pillow versions?

  • OS: Windows 10
  • Python: 3.12.4
  • Pillow: 11.0.0
@radarhere
Member

I suspect that rounding has been sacrificed for the sake of performance in Pillow, but I wonder if @homm could confirm?

@AlttiRi
Author

AlttiRi commented Dec 10, 2024

BTW, is the filter-based approach really optimal/needed?

My code above, running on Node.js, takes 19 seconds to scale down a gray-scale image (https://i.imgur.com/DR94LKg.jpeg) 1000 times. PIL's Box filter takes 12 seconds. I expected a bigger difference.

# ...
print(size) # (961, 1266)

# benchmark gray:
image_gray = image.convert("L")
print("bench gray")

start_time = time.time()
for i in range(1000):
    image_box = image_gray.resize(size, Image.Resampling.BOX)
print(time.time() - start_time)  # 12.385019302368164

start_time = time.time()
for i in range(1000):
    image_lanc = image_gray.resize(size, Image.Resampling.LANCZOS)
print(time.time() - start_time)  # 26.357529163360596

And Lanczos is only ~2 times slower than Box resampling.

@AlttiRi
Author

AlttiRi commented Dec 10, 2024

Here is a picture of a table of 51x51 px squares (1 px border); the image has an odd width and height (2755x1837).
PIL's box filter produces a result in which the central vertical line is missing when the image is down-scaled 2x per side (4x fewer pixels, to 1376x918).
It looks like the central vertical 1 px line was left out of the computation.
My code works as expected.

from PIL import Image

image_path = "51-squares/_original.png"

image = Image.open(image_path)
newHeight = image.height // 2
size = (int(image.width / image.height * newHeight), newHeight)
print(image.width, image.height) # 2755 1837
print(image.width  / size[0]) # 2.0021802325581395
print(image.height / size[1]) # 2.0010893246187362
print(size) # (1376, 918)

# image_gray = image.convert("L")

image_box = image.resize(size, Image.Resampling.BOX)
image_box.save("pil-box.png")

image_lanc = image.resize(size, Image.Resampling.LANCZOS)
image_lanc.save("pil-lanczos.png")

51-squares.zip
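To see where the three "extra" source columns of a 2755 px row end up when it is reduced to 1376 px, here is a small sketch that applies the window-boundary rule from the TypeScript code (`round(scale * i)`) in exact integer arithmetic. The helper names `window_start` and `wide_windows` are hypothetical, and exact integers are an assumption chosen to avoid floating-point ties; this only models the simple scaler, not PIL's implementation.

```python
def window_start(i, src, dst):
    """First source index of output pixel i: floor(i * src / dst + 0.5)."""
    return (2 * i * src + dst) // (2 * dst)


def wide_windows(src, dst):
    """Output indices whose source window is wider than floor(src / dst)."""
    base = src // dst
    return [
        i for i in range(dst)
        if window_start(i + 1, src, dst) - window_start(i, src, dst) > base
    ]


print(wide_windows(2755, 1376))                            # -> [229, 687, 1146]
print(window_start(688, 2755, 1376), 688 * 2755 / 1376)    # -> 1378 1377.5
```

Under this rule the middle 3 px window sits at output index 687, and its right boundary (688 * 2755 / 1376 = 1377.5) falls exactly on a half pixel at the image center. Such an exact tie is a place where two implementations could plausibly differ, but the thread does not confirm that this is the cause of the missing line.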

@AlttiRi
Author

AlttiRi commented Dec 10, 2024

One more note about the central lines (the central cross that divides the image into 4 parts) and odd image dimensions:

I compared down-scaling with the BOX and LANCZOS filters. This image https://i.imgur.com/DR94LKg.jpeg is 1923x2533.
After down-scaling each side by a factor of 2, I found that with Lanczos scaling there is a "shift" from the central cross lines toward the edges (scheme.jpg).

You need to toggle between DR94LKg-box.png and DR94LKg-lanczos.png to see it.
The shift of pixels from the central cross toward the edges in the Lanczos image is quite visible when compared with the Box images (both mine and PIL's).

It feels like there are extra 1 px lines at the center cross, even though the image size is the same.
I think the BOX filter puts pixels in the correct places, since my code (which is as simple as possible) puts the pixels in the same places. So it looks like an issue with the LANCZOS filter.


My scaler produces a result visually identical to that of PIL's Box filter, except for the horizontal central 1 px line (you need to zoom in to see it). It is possibly the same bug as in my previous message.

Or look at this:


DR94LKg.zip

image_path = "DR94LKg.jpeg"        # 1923x2533

image = Image.open(image_path)
newHeight = image.height // 2
size = (int(image.width / image.height * newHeight), newHeight)
print(image.width, image.height) # 1923 2533
print(image.width  / size[0])    # 2.001040582726327
print(image.height / size[1])    # 2.000789889415482
print(size)                      # (961, 1266)

image_gray = image.convert("L")

image_box = image_gray.resize(size, Image.Resampling.BOX)
image_box.save("DR94LKg-box.png")

image_lanc = image_gray.resize(size, Image.Resampling.LANCZOS)
image_lanc.save("DR94LKg-lanczos.png")

If I crop the image by 1 px in both dimensions, to 1922x2532, then everything is fine.
Both box-scaled images are the same (except for differences from .5 rounding that are invisible to the eye), and the Box result is very close to the Lanczos result (the Box filter is really good for integer down-scaling, where a constant number of source pixels maps to each result pixel).

DR94LKg-crop.zip

@homm
Member

homm commented Dec 16, 2024

Thank you for the investigation!

As for rounding, I believe there is nothing we can do here, since the convolution implementation has its internal precision limits and this rounding doesn't affect the visual quality of the image. As for the missing line, this looks critical; I'll check the math in the implementation.
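One way to picture such a precision limit is a toy fixed-point model of the box average. The sketch below is not Pillow's code: the 22-bit coefficient precision, the coefficient rounding, and the half-unit rounding bias are assumptions about how an 8-bit convolution path could be implemented, and the function names are hypothetical. Under these assumptions, each weight 1/n is stored as a rounded integer, so the weights may sum to slightly less than 1 and an exact x.5 average can land just below the rounding threshold.

```python
PRECISION_BITS = 22  # assumption: fixed-point precision of the coefficients


def fixed_point_box_average(pixels):
    """Average 8-bit pixels to one value using integer coefficients."""
    n = len(pixels)
    # Each box weight 1/n is stored as a rounded integer coefficient.
    coeff = int(0.5 + (1.0 / n) * (1 << PRECISION_BITS))
    acc = 1 << (PRECISION_BITS - 1)  # rounding bias, equivalent to adding 0.5
    for p in pixels:
        acc += p * coeff
    return min(255, max(0, acc >> PRECISION_BITS))


def rounds_half_down(width):
    """True if an exact .5 average is rounded down for this window width."""
    pixels = [0] * (width - 1) + [width // 2]  # the true average is exactly 0.5
    return fixed_point_box_average(pixels) == 0


print(fixed_point_box_average([0] * 9 + [255]))                # -> 25
print([w for w in range(2, 43, 2) if rounds_half_down(w)])
# -> [10, 12, 14, 20, 22, 26, 30, 36, 38, 42]
```

In this model the affected widths up to 42 coincide with the "magic" sizes listed in the report, which is consistent with the rounding being a side effect of coefficient precision rather than of a deliberate rounding rule.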

My code above on Node.js for scaling down a gray-scaled image (https://i.imgur.com/DR94LKg.jpeg) 1000 times works for 19 seconds. PIL's Box filter for 12 seconds. I expected a bigger difference.

Right, this is because the BOX filter uses the general convolution implementation. Its implementation comes "for free" for us. There is also a "shrink" operation which works much faster, but only with integer scaling.

@AlttiRi
Author

AlttiRi commented Dec 16, 2024

You need to toggle between DR94LKg-box.png and DR94LKg-lanczos.png to see it.

lancos-shift

BTW, here is a gif.

@homm
Member

homm commented Dec 16, 2024

BTW, here is a gif.

Yeah, I see the difference. I'll check.

@AlttiRi
Author

AlttiRi commented Jan 11, 2025

I thought it was some kind of optimization bug, but no: I just wrote a very simple (without any optimization) Lanczos down-scaler and got results that are pixel-identical to PIL's.

function downscaleLanczos1D(pixels: Uint8Array, resultLength: number): Uint8Array {
    const inputLength = pixels.length;
    const scale = resultLength / inputLength;
    const result = new Uint8Array(resultLength);
    for (let resIdx = 0; resIdx < resultLength; resIdx++) {
        let outputValue = 0;
        let outputWeight = 0;
        const xStart = -0.5 + scale / 2 - resIdx;
        for (let inpIdx = 0; inpIdx < inputLength; inpIdx++) {
            const x = xStart + inpIdx * scale;
            const weight = lanczos(x, 3);
            outputValue += pixels[inpIdx] * weight;
            outputWeight += weight;
        }
        const value = outputValue / outputWeight;
        result[resIdx] = Math.round(Math.min(Math.max(value, 0), 255));
    }
    return result;
}
downscaleLanczos2D (also non-optimized, draft version)
type SingleChannelImageData = { data: Uint8Array; width: number; height: number; channels: 1; };
function downscaleLanczos2D(imageData: SingleChannelImageData, resultWidth: number, resultHeight: number): SingleChannelImageData {
    const height = imageData.height;
    const width  = imageData.width;

    const rows: Uint8Array[] = [];
    for (let h = 0; h < height; h++) {
        const row = imageData.data.slice(width * h, width * (h + 1));
        const resizedRow = downscaleLanczos1D(row, resultWidth);
        rows.push(resizedRow);
    }

    const columns: Uint8Array[] = [];
    for (let y = 0; y < rows[0].length; y++) {
        const column = rows.map(row => row[y]);
        const resizedColumn = downscaleLanczos1D(new Uint8Array(column), resultHeight);
        columns.push(resizedColumn);
    }

    const result: number[][] = [];
    for (let x = 0; x < columns[0].length; x++) {
        const row = columns.map(column => column[x]);
        result.push(row);
    }

    return { data: new Uint8Array(result.flat()), width: resultWidth, height: resultHeight, channels: 1, };
}
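The TypeScript code above calls a `lanczos(x, 3)` helper that is not shown in the thread. For reference, here is a hedged Python sketch of the 1D down-scaler together with the textbook Lanczos kernel, sinc(x) * sinc(x / a). The kernel definition is an assumption about what the missing helper computes, and the Python names are hypothetical ports, not the author's code.

```python
import math


def lanczos(x, a=3):
    """Textbook Lanczos kernel: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def downscale_lanczos_1d(pixels, result_len):
    """Direct port of the loop structure of downscaleLanczos1D."""
    scale = result_len / len(pixels)
    result = []
    for res_idx in range(result_len):
        # Distance from each input pixel centre to this output pixel centre,
        # measured in output-pixel units.
        x_start = -0.5 + scale / 2 - res_idx
        value = 0.0
        weight_sum = 0.0
        for inp_idx, p in enumerate(pixels):
            w = lanczos(x_start + inp_idx * scale)
            value += p * w
            weight_sum += w
        # Negative lobes can push the result outside 0..255, so clamp first.
        clamped = min(max(value / weight_sum, 0.0), 255.0)
        result.append(math.floor(clamped + 0.5))
    return result


row = [255] * 8 + [128] * 3 + [255] * 3
print(downscale_lanczos_1d(row, 11))
```

The clamp matters because the negative lobes of the kernel can produce overshoot next to an edge, so a raw value may exceed 255 or fall below the darkest input pixel.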

Here is a visualization of how it works when down-scaling 10 pixels to 6, shown for the first result pixel:

lanczos_from_10_to_6_step_1_xOffset_-2_scaleX_0 80

@AlttiRi
Author

AlttiRi commented Jan 11, 2025

Never mind. I just forgot to clamp values above 255. I did not expect that to be possible.
I edited the message above.


For example, 14 pixels to 11 ones:

console.log(downscaleLanczos1D(new Uint8Array([
    255,255,255,255,255,255,255,255, 128,128,128, 255,255,255,
]), 11));
from PIL import Image
import numpy as np

def resize_with_lanczos(pixels_2d, target_width, target_height):
    img_array = np.array(pixels_2d, dtype=np.uint8)
    img = Image.fromarray(img_array)
    resized_img = img.resize((target_width, target_height), resample=Image.Resampling.LANCZOS)
    resized_pixels = np.array(resized_img).tolist()
    return resized_pixels

pixels2D = [[255,255,255,255,255,255,255,255, 128,128,128, 255,255,255,]]
resized_pixels2D = resize_with_lanczos(pixels2D, 11, 1)
print(resized_pixels2D)

The result

[[255, 255, 255, 255, 252, 255, 164, 110, 173, 255, 252]]

The result pixel with the index 5 has value 266.0674 and should be clamped down to 255.


Now, my code above produces the same result as PIL does.

100 % pixel identical.

@AlttiRi
Author

AlttiRi commented Jan 11, 2025

I think it makes sense to say that PIL's Lanczos down-sampling may be used as a reference, since it 100 % matches the basic, non-optimized Lanczos implementation.


UPD: Almost 100 %.

  • [11, 22, 33, 44] -> [27] My code (27.499999999999996)
  • [11, 22, 33, 44] -> [28] PIL (27.5)

Rounding. Again.

The diff is less than 0.01 %. Only 10 pixels of an 480x633 image (303840 pixels).
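The percentage above reads as the share of pixels whose values differ between the two results. A minimal sketch of that metric follows; the function name `pixel_diff_percent` is hypothetical, and treating any value difference as a mismatch is an assumption about how the figure was computed.

```python
def pixel_diff_percent(a, b):
    """Percentage of positions where two equally sized pixel lists differ."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    differing = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * differing / len(a)


print(pixel_diff_percent([27, 10, 20, 30], [28, 10, 20, 30]))  # -> 25.0
print(100.0 * 10 / (480 * 633))  # 10 of 303840 pixels, roughly 0.0033 %
```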


I tested other libs and they produce pixel-different results.

PIL

cv2

Python's cv2 produces a terrible result, which looks like a nearest-neighbour result, not Lanczos. WTF?

Pica

The JS Pica library produces a result that visually looks like PIL's, but is not pixel-identical. The pixel diff is 7.54 %.

Sharp

The JS Sharp library (based on libvips) produces a slightly less similar result. The difference can be seen when zooming in. The pixel diff is 40.66 %.


It's strange that different libraries do not produce absolutely the same result.
