Option to disable "Physical input schema should be the same as the one converted from logical schema" error #13065
Comments
I should clarify: ideally this check should be enabled by default, and that is a goal we should shoot for. However, there are clearly bugs in the code (which were there before the check was added) that currently prevent it from passing cleanly in all cases, so I think it is better to relax the check and sort out the errors rather than hard-failing plans.
Some additional context: we have definitely not isolated all the bugs this check uncovered. Even with the known bug (#13010) patched for us, we are still hitting this failed check every few minutes. Changing this check to a warning, not an error, has been necessary for us. Assuming that we are not the only ones, having the feature @alamb proposed here (to also convert it to a warning based on configuration) would help unblock others from upgrading DataFusion, IMO.
We (InfluxData) will likely contribute fixes back upstream into DataFusion as we find issues as well.
This probably happens when users use only the physical planner, while some schema alignment happens in the logical planner?
I remember a bunch of issues with schema comparison where a different null flag for a column caused a problem. Since DF reworked the
Is your feature request related to a problem or challenge?
This bug, released in DataFusion 42.0.0 (AggregateUDFImpl::is_null, #11989), added a new check in the DefaultPhysicalPlanner that the schema of the output plan is the same as the input plan:
datafusion/datafusion/core/src/physical_planner.rs, lines 660 to 662 in 818ce3f
While @jayzhan211's heroic efforts have this passing in all the DataFusion tests, it turned out that this check fails in many downstream implementations:
Downstream in InfluxDB 3.0 we turned the check into a warning in our fork to unblock our upgrade
We even made a patch release to try and get the delta-rs upgrade working:
But it is still failing as I write this (see delta-io/delta-rs#2886 (comment))
Describe the solution you'd like
Note there is at least one open outstanding bug: #13010
I would like some way to disable this check to unblock upgrades in downstream crates.
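To make the request concrete, here is a minimal, self-contained sketch of what an opt-out could look like. It does not use DataFusion's real types or APIs: `Field`, `Schema`, `SchemaCheck`, `check_schema`, and the `validate_schema` flag are all hypothetical stand-ins, and the actual check in the planner may be structured quite differently.

```rust
/// Simplified stand-in for an Arrow field (hypothetical, not the real type).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
    nullable: bool,
}

/// Simplified stand-in for a schema: an ordered list of fields.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<Field>,
}

/// Hypothetical result of the check when it does not hard-fail.
#[derive(Debug, PartialEq)]
enum SchemaCheck {
    /// The two schemas are identical.
    Match,
    /// The schemas differ, but validation is disabled; carries the warning text.
    Warned(String),
}

/// Convenience constructor for the example.
fn field(name: &str, data_type: &str, nullable: bool) -> Field {
    Field {
        name: name.to_string(),
        data_type: data_type.to_string(),
        nullable,
    }
}

/// Hypothetical helper: compare the schema derived from the logical plan with
/// the physical input schema. With `validate_schema == true` a mismatch is an
/// error; otherwise it is downgraded to a warning and planning continues.
fn check_schema(
    logical: &Schema,
    physical: &Schema,
    validate_schema: bool,
) -> Result<SchemaCheck, String> {
    if logical == physical {
        return Ok(SchemaCheck::Match);
    }
    let msg = format!(
        "Physical input schema should be the same as the one converted from \
         logical schema: expected {:?}, got {:?}",
        logical, physical
    );
    if validate_schema {
        Err(msg)
    } else {
        eprintln!("warning: {msg}");
        Ok(SchemaCheck::Warned(msg))
    }
}

fn main() {
    // The two schemas differ only in the nullability flag of column `v`.
    let logical = Schema { fields: vec![field("v", "Int64", true)] };
    let physical = Schema { fields: vec![field("v", "Int64", false)] };

    // Strict mode: the mismatch fails planning.
    assert!(check_schema(&logical, &physical, true).is_err());

    // Relaxed mode: the mismatch is reported, but planning can proceed.
    let relaxed = check_schema(&logical, &physical, false);
    assert!(matches!(relaxed, Ok(SchemaCheck::Warned(_))));
}
```

Keeping strict mode as the default would preserve the current behavior, while downstream crates that hit the error could switch the flag off until the underlying mismatches are fixed.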
Describe alternatives you've considered
I propose we add a new config value that lets downstream crates opt in / out of this check, similarly to datafusion.optimizer.skip_failed_rules (see Config Docs).
Something like:
datafusion.execution.validate_schema: If true, the DefaultPhysicalPlanner will error if the input plan's schema does not exactly match the output plan.
Additional context
No response