-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support array types #191
Support array types #191
Conversation
crates/query-engine/translation/src/translation/query/values.rs
Outdated
Show resolved
Hide resolved
5c8cfb9
to
1fa5019
Compare
1fa5019
to
e258b81
Compare
RuntimeConfiguration { | ||
metadata: &self.config.metadata, | ||
metadata: metadata_to_current(&self.config.metadata), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do this mapping on every request don't we? I wonder if in future we do it whenever RawConfiguration
becomes Configuration
(one of Connector
trait functions run by the config server IIRC), and keep RuntimeConfiguration
inside Configuration
.
Not in this PR, mind. Probably worth benchmarking it too to check a speculative speed up becomes a real one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this makes sense.
IIUC, what the user will consider as being "their configuration metadata" is the json-serialized Configuration
, not just the RawConfiguration
, meaning that the whole thing is what needs to be able to exist in different versions.
It's still a bit nebulous to me though, so I could be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All I mean is, currently we take the whole Configuration
and derive RuntimeConfiguration
from it on each request, surely we could just do this once?
It's not really relevant to this PR, mind, just something I noticed because you made a change near the code that does this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all seems sensible. Left a couple of questions, but they could probably both be answered in a couple of follow-up tidy ups or optimisation passes at some later point.
### What This PR adds a new version (`"2"`) of the deployment configuration data format. This version of the configuration is capable of expressing array types in collections and arguments. Since this is the first time a new version is introduced there are a lot of changes the only purpose of which is to distinguish between versions. Only the infrastructure-related shell of the connector is aware of different versions of deployment configurations existing. The core of the connector only works with a single internal version. This PR is also the one to introduce tests of array types. In hindsight this ought to have been possible in the previous PR that introduced the internal types and transformations (#191). Note that there is not yet any automated way to upgrade a configuration to a newer version, but this will be introduced shortly. This PR also adds a changelog entry. ### How The file `version2.rs` is a duplicate of `version1.rs`, which has been adapted to use the new data types (incidentally these are just the ones of the internal model). `configuration.sql` now exists as `version1.sql` and `version2.sql` respectively, since these have different capabilities. This is because `version2.sql` introduces the ability to introspect array types. `configuration.rs` now exposes `RawConfiguration` and `Configuration` types which are enums of all the supported versions (currently 1 and "2"). One big wart on the implementation is that serde and schemars are unable to derive trait implementations for these types correctly, since they only support strings as enum tags, and we used a number literal for version 1. Once we drop support of version 1 completely we can remove the manually implemented instances. The various `Connector` trait implementations now explicitly work on the internal representation of a configuration, `RuntimeConfiguration`. --------- Co-authored-by: Daniel Harvey <[email protected]>
What
We now recognize array types in several places:
Notable limitations, due in part to the modest expressiveness of the ndc-spec in its current state:
How
The introspection query has had its
types
sub-query split intoscalar_types
andarray_types
subqueries. We benefit from the fact that postgres only supports one-dimensional array types (i.e., array dimensionality is a dynamic aspect of an array value), so we don't have to recurse to construct the json-representation of array types.Most of the rust-language changes of this PR follow mechanically from changing
metadata::ScalarType
to the newmetadata::Type
, which also models array types.However, since our field types now include arrays we need to observe that certain parts (e.g. ComparisonOperators) only accept base scalar types (e.g. in
version1.rs
,filtering.rs
).In defining the new type for column types we also take care to maintain backwards compatibility, which means deserialization gets a bit more verbose than just the automatically derived implementations we used to get by with (see
metadata/database.rs
).Since we also support array-typed arguments we have to be able to output SQL array constructors (see
values.rs
,sql/convert.rs
).We introduce two new tests:
Update
Altering the representation of column types in the configuration constitues a breaking change. Even if we are still able to ingest the historic configuration formats these are not described by the published json schema, and we would only ever output in the newest format.
In order to address the issue of versioning the original contents of this PR is being split in two:
An interesting consequence of introducing the distinction between api transport types and internal business types is that the various
tables.json
files of the query-engine translation tests contain json-serialized data of business types.Worth highlighting is this means we cannot simply copy a subsection of a deployment configuration into a
tables.json
file like we used to, as the business types do not necessarily correspond to any api-version.While this may seem confusing it also has some benefits:
Updating test files can become a chore too. There is no tool that can upgrade them automatically like we do for deployments, and it would probably not be worth the effort to try and make one.
What I ended up doing in the particular case of this PR was a small shell script:
The
jq
script matches any object that has atype
key and wraps its value with a{"scalarType": ..}
object. Happily this covered the full breadth of my change.sponge is a tool I just discovered which buffers all the stdin it receives and writes everything into the file it's given. Essentially this endows our command with the ability to make in-place edits with shell pipe filters.