-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(rust/drivers): adbc driver for datafusion #2267
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool! I'll split my review over multiple chunks.
The failing test is known to be flaky.
rust/Cargo.toml
Outdated
@@ -34,6 +34,7 @@ categories = ["database"] | |||
|
|||
[workspace.dependencies] | |||
adbc_core = { path = "./core" } | |||
arrow = { version = "53.1.0", default-features = false } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding arrow
we can add the arrow
subcrates to reduce the size of our dependency tree.
It looks like we need to add:
arrow-cast
for https://docs.rs/arrow-cast/latest/arrow_cast/pretty/fn.print_batches.html (dev-dependency
)arrow-select
for https://docs.rs/arrow-select/latest/arrow_select/concat/fn.concat_batches.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks. makes sense, didn't realize most of these functions could be imported from multiple crates. Added both as a dev-dependency.
To fix the failing CI job we need to install You could add - name: Install Protoc
uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }} as a step after arrow-adbc/.github/workflows/rust.yml Lines 68 to 72 in 4639ca8
|
@mbrobbel thanks, got everything green. I had to temporarily add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good as an initial commit for this driver. The todo!()
s and unwrap
s can be handled in follow-up PRs.
.github/workflows/rust.yml
Outdated
- name: Install Protoc | ||
uses: arduino/setup-protoc@v3 | ||
with: | ||
repo-token: ${{ secrets.GITHUB_TOKEN }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...Why does installing protoc need the repo token?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, supposedly to query Github for the latest version...
Is there some other source for protoc that doesn't rely on a third party action?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- We can just do curl+unzip but it will require some logic to choose the right binary in order to work cross-platform.
- Another option is to install
grpcio-tools
from pypi and then aliasprotoc
topython -m grpc_tools.protoc
, but that obviously requires python to be installed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose system protoc would be too old? (And Conda too heavy?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pixi should be fine, I think. I can add pixi.toml in rust directory, we can even install rust from there instead of using rustup iirc. Not sure if just protoc justifies pixi adoption, but it would be the simplest alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(1) substrait crate depends on protoc
(2) I guess, but if we still need to write some logic to install the system protoc (check whether to use apt or brew dependending on the os), we might as well download the binaries instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we can use the action minus the token? (Even if it's from Arduino, I'm a bit hesitant about passing even a read-only token to an external action.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, that works as well.. I will drop the token then, if we see that the step becomes flaky as a result, we can always make the changes afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is a no-go 😆. judging from the action code, even if I pin an exact version, the action still tries to fetch all versions, presumably to validate.. seems like there's no way to make it work w/o a token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lidavidm Finally got it working with pre-compiled binaries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about these?
Fixes #2263
An initial driver implementation for datafusion:
P.S. I'm not much of a rust person, so I'd appreciate any kind of pointers