[red-knot] Property tests #14178

sharkdp · 2024-11-07T22:11:25Z

Summary

This is something I played with, wondering if it could be useful. It has found a few bugs so far.

If we want to merge this, there are a few things to do / to consider:

- The tests were extremely slow at first (> 30 ms per iteration in debug mode; > 3 ms in release mode). The reason was our incredibly naive way of setting up the `TestDb` in every iteration, which looked easy to fix by caching the `TestDb`. Update: the tests are now three orders of magnitude faster.
- Even with this resolved, we probably don't want to run those tests alongside the normal unit tests. First, they can be arbitrarily slow, depending on the number of iterations. Second, they are not deterministic. Even if we fix both of these issues (e.g. make the tests opt-in via a special config + fix random seeds), I'm not sure we would want them to run in CI? I'm afraid they would be a constant source of flakiness because even after fixing seeds, their behavior might change as soon as we make changes to the (test) code. One option could be to only run them offline from time to time, using a special config.
- There are a lot of "spurious" failures that turn out to be caused by:
  - insufficient understanding of equality of types (we currently don't understand that `int | str` is equal to `str | int`; there is an open TODO in the code)
  - no "propagation" of `Never`. For example: `tuple[Never, int]` is equivalent to `Never`, but we currently don't simplify this anywhere.
  - other limitations of our current code, like the fact that `is_disjoint_from`/`is_subtype_of` can produce false-negative answers.
- The current shrinking implementation is very naive, which leads to counterexamples that are very long (`str & Any & ~tuple[Any] & ~tuple[Unknown] & ~Literal[""] & ~Literal["a"] | str & int & ~tuple[Any] & ~tuple[Unknown]`), requiring the developer to simplify them manually.
- The code needs improvements.
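To make the first two "spurious failure" causes concrete, here is a minimal std-only sketch of what a normalization step could look like on a toy type model. The `Ty` enum, `normalize`, and `is_equivalent` are hypothetical names for illustration and are not the red-knot `Type` API; the sketch only shows that sorting/deduplicating union members and collapsing tuples that contain `Never` would make both examples above compare equal.

```rust
use std::collections::BTreeSet;

// Toy stand-in for a type representation (hypothetical, not red-knot's `Type`).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Ty {
    Never,
    Int,
    Str,
    Tuple(Vec<Ty>),
    Union(Vec<Ty>),
}

/// Bring a type into a canonical form:
/// - a tuple with a `Never` element collapses to `Never`
/// - unions are flattened, `Never` members are dropped, members are sorted and deduplicated
fn normalize(ty: &Ty) -> Ty {
    match ty {
        Ty::Tuple(elems) => {
            let elems: Vec<Ty> = elems.iter().map(normalize).collect();
            if elems.contains(&Ty::Never) {
                Ty::Never
            } else {
                Ty::Tuple(elems)
            }
        }
        Ty::Union(members) => {
            // A BTreeSet gives us both ordering and deduplication.
            let mut set = BTreeSet::new();
            for member in members {
                match normalize(member) {
                    Ty::Never => {}
                    Ty::Union(inner) => set.extend(inner),
                    other => {
                        set.insert(other);
                    }
                }
            }
            let mut members: Vec<Ty> = set.into_iter().collect();
            match members.len() {
                0 => Ty::Never,
                1 => members.pop().unwrap(),
                _ => Ty::Union(members),
            }
        }
        other => other.clone(),
    }
}

/// Two toy types are considered equivalent if their canonical forms match.
fn is_equivalent(a: &Ty, b: &Ty) -> bool {
    normalize(a) == normalize(b)
}
```

With something along these lines, a property such as "equivalence is symmetric" would no longer fail just because the union members were generated in a different order.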

Test Plan

QUICKCHECK_TESTS=1000 cargo test --release -p red_knot_python_semantic property_tests
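The command above controls the number of iterations through the `QUICKCHECK_TESTS` environment variable. As a rough illustration of that kind of env-driven configuration (a sketch only, not quickcheck's actual implementation; `iterations_from`, `configured_iterations`, and the fallback of 100 are assumptions made for this example):

```rust
/// Parse an iteration count from an optional raw value,
/// falling back to `default` when the value is missing or not a number.
fn iterations_from(value: Option<&str>, default: u64) -> u64 {
    value
        .and_then(|v| v.trim().parse().ok())
        .unwrap_or(default)
}

/// Read the iteration count from the environment (hypothetical helper).
fn configured_iterations() -> u64 {
    let raw = std::env::var("QUICKCHECK_TESTS").ok();
    iterations_from(raw.as_deref(), 100)
}
```

Keeping the parsing separate from the environment lookup makes the fallback behavior easy to check without touching process state.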

sharkdp · 2024-11-07T22:19:27Z

crates/red_knot_python_semantic/src/types.rs

+        let db = setup_db();
+
+        let t1 = t1.into_type(&db);
+        let t2 = t2.into_type(&db);
+        let t3 = t3.into_type(&db);


If we want to move forward with this, we could probably auto-generate this boilerplate in each test using a macro.
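A minimal sketch of what such a macro could look like, assuming simplified stand-ins: `Db`, `Ty`, `Type`, and the `type_property!` macro are hypothetical and only mimic the shape of the boilerplate quoted above (set up a db, then convert every input with `into_type`).

```rust
// Hypothetical stand-ins for the test database and the generated/real type pair.
struct Db;

fn setup_db() -> Db {
    Db
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Ty(u8);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Type(u8);

impl Ty {
    fn into_type(self, _db: &Db) -> Type {
        Type(self.0)
    }
}

/// Generates a property function that sets up the db and converts
/// each argument before evaluating the property body.
macro_rules! type_property {
    ($name:ident, |$($arg:ident),+| $body:expr) => {
        fn $name($($arg: Ty),+) -> bool {
            let db = setup_db();
            // Shadow every generated input with its converted `Type`.
            $(let $arg = $arg.into_type(&db);)+
            $body
        }
    };
}

// Example property: equality on converted types is symmetric.
type_property!(equality_is_symmetric, |t1, t2| (t1 == t2) == (t2 == t1));
```

Each property then only states its inputs and the boolean expression, and the setup lines live in one place.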

From the referenced fix #14177: Minor fix to `Type::is_subtype_of` to make sure that Boolean literals are subtypes of `int`, to match runtime semantics. Found while doing some property-testing experiments (#14178). Test plan: new unit test.

github-actions · 2024-11-07T22:37:28Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

sharkdp · 2024-11-07T23:11:46Z

crates/red_knot_python_semantic/src/types.rs

+        if size == 0 {
+            arbitrary_core_type(g)
+        } else {
+            match u32::arbitrary(g) % 4 {


There must be a better way to do this. If only we had the author of this library on our team...

CC: @BurntSushi

I think this looks reasonable to me?
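For readers unfamiliar with the pattern in the snippet above, here is a std-only sketch of size-bounded recursive generation. It uses a tiny xorshift generator instead of quickcheck's `Gen`, and the `Rng`, `Ty`, `arbitrary_type`, and `depth` names are assumptions for illustration; only the overall shape (leaf when `size == 0`, otherwise pick one of four cases) follows the diff.

```rust
/// Tiny xorshift PRNG so the sketch needs no external crates.
/// The seed must be non-zero.
struct Rng(u64);

impl Rng {
    fn next_u32(&mut self) -> u32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 32) as u32
    }
}

// Toy type model (hypothetical, not red-knot's `Type`).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int,
    Str,
    Never,
    Union(Vec<Ty>),
    Tuple(Vec<Ty>),
    Not(Box<Ty>),
}

/// Non-recursive leaf types.
fn arbitrary_core_type(rng: &mut Rng) -> Ty {
    match rng.next_u32() % 3 {
        0 => Ty::Int,
        1 => Ty::Str,
        _ => Ty::Never,
    }
}

/// Recursive generation: `size` bounds the nesting depth, so generation always terminates.
fn arbitrary_type(rng: &mut Rng, size: u32) -> Ty {
    if size == 0 {
        arbitrary_core_type(rng)
    } else {
        match rng.next_u32() % 4 {
            0 => arbitrary_core_type(rng),
            1 => Ty::Union(vec![
                arbitrary_type(rng, size - 1),
                arbitrary_type(rng, size - 1),
            ]),
            2 => Ty::Tuple(vec![
                arbitrary_type(rng, size - 1),
                arbitrary_type(rng, size - 1),
            ]),
            _ => Ty::Not(Box::new(arbitrary_type(rng, size - 1))),
        }
    }
}

/// Nesting depth of a generated type; leaves have depth 0.
fn depth(ty: &Ty) -> u32 {
    match ty {
        Ty::Union(items) | Ty::Tuple(items) => {
            1 + items.iter().map(depth).max().unwrap_or(0)
        }
        Ty::Not(inner) => 1 + depth(inner),
        _ => 0,
    }
}
```

Because the generator is seeded explicitly, the same seed always yields the same type, which is the kind of determinism discussed in the summary.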


carljm · 2024-11-08T16:55:40Z

This is awesome! Thank you for doing this, I've been wanting to explore this. I think property testing is very well suited to testing type relation invariants, and I do think we should move forward with actually landing this.
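As a small illustration of what "type relation invariants" can mean here, the following sketch checks reflexivity and transitivity of subtyping exhaustively over a toy hierarchy. The `Ty` enum and this `is_subtype_of` are hypothetical simplifications, not the red-knot implementation; the `LiteralTrue <: Int` case mirrors the bug fixed in #14177.

```rust
// Toy hierarchy: Never <: LiteralTrue <: Bool <: Int <: Object, and Never <: Str <: Object.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Ty {
    Never,
    LiteralTrue,
    Bool,
    Int,
    Str,
    Object,
}

const ALL: [Ty; 6] = [
    Ty::Never,
    Ty::LiteralTrue,
    Ty::Bool,
    Ty::Int,
    Ty::Str,
    Ty::Object,
];

fn is_subtype_of(a: Ty, b: Ty) -> bool {
    use Ty::*;
    match (a, b) {
        _ if a == b => true,
        // `Never` is a subtype of everything; everything is a subtype of `Object`.
        (Never, _) | (_, Object) => true,
        (LiteralTrue, Bool) | (LiteralTrue, Int) | (Bool, Int) => true,
        _ => false,
    }
}

/// Invariant: every type is a subtype of itself.
fn subtyping_is_reflexive() -> bool {
    ALL.iter().all(|&t| is_subtype_of(t, t))
}

/// Invariant: a <: b and b <: c implies a <: c, checked over all triples.
fn subtyping_is_transitive() -> bool {
    ALL.iter().all(|&a| {
        ALL.iter().all(|&b| {
            ALL.iter().all(|&c| {
                !(is_subtype_of(a, b) && is_subtype_of(b, c)) || is_subtype_of(a, c)
            })
        })
    })
}
```

On a real type system the input space is far too large to enumerate, which is where randomly generated inputs come in.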

sharkdp mentioned this pull request Nov 7, 2024

[red-knot] Minor: fix Literal[True] <: int #14177

Merged

AlexWaygood added the red-knot Multi-file analysis & type inference label Nov 7, 2024

sharkdp force-pushed the david/property-tests branch 2 times, most recently from c1f05ab to 634b4f6 Compare November 7, 2024 22:17


This was referenced Nov 8, 2024

[red-knot] Fix intersection simplification for ~Any/~Unknown #14195

Merged

[red-knot] Fix is_assignable_to for unions #14196

Merged

sharkdp added a commit that referenced this pull request Nov 8, 2024

[red-knot] Fix intersection simplification for ~Any/~Unknown (#14195)

670f958

Summary: Another bug found using [property testing](#14178). Test Plan: New unit test.

sharkdp added 5 commits November 9, 2024 20:21

[red-knot] Add property tests for Type

402625f

recursive generation of types

7790222

Add Never again

bd4d3f8

Implement shrinking

7efc024

Add double-negation test

87c67df

sharkdp force-pushed the david/property-tests branch from c1086b8 to 87c67df Compare November 9, 2024 19:21

sharkdp added 2 commits November 9, 2024 22:09

Ignore double negation test for now

1b9e588

Speed up tests by three orders of magnitude

806da67


MichaReiser Nov 8, 2024

BurntSushi Nov 8, 2024
