[red-knot] Property tests #14178

Draft: wants to merge 7 commits into main

sharkdp (Contributor) commented Nov 7, 2024

Summary

This is something I played with, wondering if it could be useful. It has found a few bugs so far (see the commits that reference this pull request).

If we want to merge this, there are a few things to do / to consider:

  • The tests were extremely slow at first (> 30 ms per iteration in debug mode; > 3 ms in release mode). The reason was our incredibly naive way of setting up the TestDb in every iteration, which looked easy to fix by caching the TestDb. Update: the tests are now three orders of magnitude faster.
  • Even with this resolved, we probably don't want to run those tests alongside the normal unit tests. First, they can be arbitrarily slow, depending on the number of iterations. Second, they are not deterministic. Even if we fix both of these issues (e.g. make the tests opt-in via a special config + fix random seeds), I'm not sure we would want them to run in CI. I'm afraid they would be a constant source of flakiness because, even after fixing seeds, their behavior might change as soon as we make changes to the (test) code. One option could be to only run them offline from time to time, using a special config.
  • There are a lot of "spurious" failures that turn out to be caused by:
    • insufficient understanding of equality of types (we currently don't understand that int | str is equal to str | int; there is an open TODO in the code)
    • no "propagation" of Never. For example: tuple[Never, int] is equivalent to Never, but we currently don't simplify this anywhere.
    • other limitations of our current code, like the fact that is_disjoint_from/is_subtype_of can produce false-negative answers.
  • The current shrinking implementation is very naive, which leads to counterexamples that are very long (str & Any & ~tuple[Any] & ~tuple[Unknown] & ~Literal[""] & ~Literal["a"] | str & int & ~tuple[Any] & ~tuple[Unknown]) and that the developer has to simplify manually.
  • The code needs improvements
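To make the first two "spurious failure" causes concrete, here is a minimal sketch of a canonical form in which union order does not matter and Never propagates out of tuples. The `Ty` enum and the `union`/`normalize` helpers are hypothetical stand-ins for illustration only, not red-knot's actual `Type` API.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for a type representation (hypothetical, not red-knot's `Type`).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Ty {
    Never,
    Int,
    Str,
    Tuple(Vec<Ty>),
    Union(BTreeSet<Ty>),
}

/// Build a union in canonical form: nested unions are flattened, `Never`
/// members are dropped, and a single-member union collapses to that member.
fn union(members: impl IntoIterator<Item = Ty>) -> Ty {
    let mut set = BTreeSet::new();
    for member in members {
        match normalize(member) {
            Ty::Never => {}
            Ty::Union(inner) => set.extend(inner),
            other => {
                set.insert(other);
            }
        }
    }
    if set.len() > 1 {
        return Ty::Union(set);
    }
    // Zero members means `Never`; one member is returned as-is.
    set.into_iter().next().unwrap_or(Ty::Never)
}

/// Recursively normalize a type: a tuple with a `Never` element is `Never`.
fn normalize(ty: Ty) -> Ty {
    match ty {
        Ty::Tuple(elements) => {
            let elements: Vec<Ty> = elements.into_iter().map(normalize).collect();
            if elements.contains(&Ty::Never) {
                Ty::Never
            } else {
                Ty::Tuple(elements)
            }
        }
        Ty::Union(members) => union(members),
        other => other,
    }
}
```

Because the members live in an ordered set, `union(vec![Ty::Int, Ty::Str])` and `union(vec![Ty::Str, Ty::Int])` compare equal structurally, so an equivalence property would not need a separate notion of "equal up to reordering".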

Test Plan

QUICKCHECK_TESTS=1000 cargo test --release -p red_knot_python_semantic property_tests
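The command above sets the iteration count through an environment variable. As a rough sketch of how an opt-in, reproducible run could be wired up by hand, the snippet below parses such a count and drives a property from a fixed seed. `parse_iterations`, `XorShift64`, and `check` are hypothetical names, and the tiny xorshift generator only stands in for a real seedable RNG; none of this is quickcheck's own implementation.

```rust
/// Parse an iteration count as it might arrive from an environment variable
/// such as `QUICKCHECK_TESTS`; fall back to `default` if unset or malformed.
fn parse_iterations(raw: Option<&str>, default: u64) -> u64 {
    raw.and_then(|s| s.trim().parse().ok()).unwrap_or(default)
}

/// Minimal xorshift64 generator, standing in for a seedable RNG.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from the all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Run `property` on `iterations` pseudo-random inputs derived from `seed`
/// and return the first failing input, if any.
fn check(seed: u64, iterations: u64, property: impl Fn(u64) -> bool) -> Option<u64> {
    let mut rng = XorShift64::new(seed);
    (0..iterations)
        .map(|_| rng.next_u64())
        .find(|&input| !property(input))
}
```

In a test, the count could be obtained with `parse_iterations(std::env::var("QUICKCHECK_TESTS").ok().as_deref(), 100)`. With the seed fixed, two runs visit the same inputs, although (as noted in the summary) any change to the generators would still change which inputs those are.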

AlexWaygood added the red-knot (Multi-file analysis & type inference) label on Nov 7, 2024
sharkdp force-pushed the david/property-tests branch 2 times, most recently from c1f05ab to 634b4f6, on November 7, 2024 22:17
Comment on lines 3325 to 3524
let db = setup_db();

let t1 = t1.into_type(&db);
let t2 = t2.into_type(&db);
let t3 = t3.into_type(&db);
Contributor Author


If we want to move forward with this, we could probably auto-generate this boilerplate in each test using a macro.
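A minimal sketch of what such a macro could look like, combined with a database that is built only once. Everything here is hypothetical and simplified for illustration: `TestDb` is a stand-in struct, `Ty` is a placeholder for the generated input type, and `type_property!` and `cached_db` are made-up names rather than anything in the PR.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for the real `TestDb`; assume building it is expensive.
struct TestDb {
    builtins: Vec<&'static str>,
}

/// Build the database on first use and share it across all later iterations.
fn cached_db() -> &'static TestDb {
    static DB: OnceLock<TestDb> = OnceLock::new();
    DB.get_or_init(|| TestDb {
        builtins: vec!["int", "str", "bool"],
    })
}

/// Hypothetical generated input that still has to be resolved against a database.
#[derive(Clone, Copy)]
struct Ty(usize);

impl Ty {
    /// Resolve the raw input to a "type" (here simply a builtin name).
    fn into_type(self, db: &TestDb) -> &'static str {
        db.builtins[self.0 % db.builtins.len()]
    }
}

/// Generate a property function that fetches the cached database and converts
/// every argument before evaluating the body.
macro_rules! type_property {
    ($name:ident, |$db:ident, $($arg:ident),+| $body:expr) => {
        fn $name($($arg: Ty),+) -> bool {
            let $db = cached_db();
            $(let $arg = $arg.into_type($db);)+
            $body
        }
    };
}

type_property!(resolves_to_known_builtin, |db, t| db.builtins.contains(&t));
type_property!(equality_is_symmetric, |db, a, b| {
    db.builtins.contains(&a) && (a == b) == (b == a)
});
```

Each generated function takes the raw inputs and returns a `bool`, which is the shape a property-testing harness typically expects; the per-test `setup_db()` and `into_type` lines disappear from the test bodies.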

sharkdp added a commit that referenced this pull request Nov 7, 2024
## Summary

Minor fix to `Type::is_subtype_of` to make sure that Boolean literals
are subtypes of `int`, to match runtime semantics.

Found this while doing some property-testing experiments [1].

[1] #14178

## Test Plan

New unit test.
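
As an illustration of how a property test could surface this kind of gap, the sketch below checks transitivity of a toy subtype relation. The `Ty` enum and both functions are hypothetical and far smaller than the real `Type::is_subtype_of`; the commit does not say which property exposed the bug, so transitivity is only one plausible candidate.

```rust
/// Toy subset of a type lattice (hypothetical, not red-knot's `Type`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Ty {
    Int,
    Bool,
    IntLiteral(i64),
    BooleanLiteral(bool),
}

/// Subtype relation matching Python's runtime semantics, where `bool` is a
/// subclass of `int`, so Boolean literals are subtypes of both.
fn is_subtype_of(a: Ty, b: Ty) -> bool {
    if a == b {
        return true;
    }
    matches!(
        (a, b),
        (Ty::Bool, Ty::Int)
            | (Ty::IntLiteral(_), Ty::Int)
            | (Ty::BooleanLiteral(_), Ty::Bool)
            | (Ty::BooleanLiteral(_), Ty::Int)
    )
}

/// Return the first triple that violates transitivity, if there is one.
fn transitivity_counterexample(types: &[Ty]) -> Option<(Ty, Ty, Ty)> {
    for &a in types {
        for &b in types {
            for &c in types {
                if is_subtype_of(a, b) && is_subtype_of(b, c) && !is_subtype_of(a, c) {
                    return Some((a, b, c));
                }
            }
        }
    }
    None
}
```

If the `(Ty::BooleanLiteral(_), Ty::Int)` rule were missing, the chain from a Boolean literal through `Bool` to `Int` would no longer be closed, and the search would report a violating triple.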
github-actions bot (Contributor) commented Nov 7, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

if size == 0 {
    arbitrary_core_type(g)
} else {
    match u32::arbitrary(g) % 4 {
Contributor Author


There must be a better way to do this. If only we had the author of this library on our team...

Member


I think this looks reasonable to me?
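
For comparison, one alternative to `% 4` is to pick the variant from a slice of named shapes, which keeps the match exhaustive when a variant is added. The sketch below is self-contained and uses a tiny hand-written `Gen` with a `choose` helper; it is a hypothetical stand-in loosely modelled on the idea of choosing from a slice, not the quickcheck API itself, and the `Shape` and `Ty` variants are assumptions rather than the four cases in the PR.

```rust
/// Tiny deterministic generator standing in for a real one (hypothetical).
struct Gen(u64);

impl Gen {
    fn next_u64(&mut self) -> u64 {
        // Linear congruential step; keep the high bits, which are better mixed.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0 >> 33
    }

    /// Pick one element of `options`, or `None` if the slice is empty.
    fn choose<'a, T>(&mut self, options: &'a [T]) -> Option<&'a T> {
        if options.is_empty() {
            return None;
        }
        let index = (self.next_u64() % options.len() as u64) as usize;
        options.get(index)
    }
}

#[derive(Debug, Clone, Copy)]
enum Shape {
    Core,
    Union,
    Intersection,
    Tuple,
}

const SHAPES: [Shape; 4] = [Shape::Core, Shape::Union, Shape::Intersection, Shape::Tuple];

#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int,
    Str,
    Union(Box<Ty>, Box<Ty>),
    Intersection(Box<Ty>, Box<Ty>),
    Tuple(Vec<Ty>),
}

fn arbitrary_core_type(g: &mut Gen) -> Ty {
    let picked = g.choose(&[Ty::Int, Ty::Str]).cloned();
    picked.unwrap_or(Ty::Int)
}

/// Generate a type whose nesting depth is at most `size`.
fn arbitrary_type(g: &mut Gen, size: u32) -> Ty {
    if size == 0 {
        return arbitrary_core_type(g);
    }
    match g.choose(&SHAPES) {
        Some(Shape::Union) => Ty::Union(
            Box::new(arbitrary_type(g, size - 1)),
            Box::new(arbitrary_type(g, size - 1)),
        ),
        Some(Shape::Intersection) => Ty::Intersection(
            Box::new(arbitrary_type(g, size - 1)),
            Box::new(arbitrary_type(g, size - 1)),
        ),
        Some(Shape::Tuple) => Ty::Tuple(vec![arbitrary_type(g, size - 1)]),
        Some(Shape::Core) | None => arbitrary_core_type(g),
    }
}

/// Nesting depth of a generated type; core types have depth 0.
fn depth(ty: &Ty) -> u32 {
    match ty {
        Ty::Int | Ty::Str => 0,
        Ty::Union(a, b) | Ty::Intersection(a, b) => 1 + depth(a).max(depth(b)),
        Ty::Tuple(items) => 1 + items.iter().map(depth).max().unwrap_or(0),
    }
}
```

The `size` parameter acts as a recursion budget, as in the snippet under review: every recursive call passes `size - 1`, so generation always terminates at a core type.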

sharkdp added a commit that referenced this pull request Nov 8, 2024
)

## Summary

Another bug found using [property
testing](#14178).

## Test Plan

New unit test

carljm (Contributor) commented Nov 8, 2024

This is awesome! Thank you for doing this, I've been wanting to explore this. I think property testing is very well suited to testing type relation invariants, and I do think we should move forward with actually landing this.

5 participants