feat: add s3tables catalog #807
Hi @flaneur2020, I suggest splitting this PR into multiple ones to make it easier to review and accelerate the iteration speed.
@Xuanwo I believe the missing part of this PR is tests. I've created a real s3tables bucket to test it and it seems to work fine. Can you give some suggestions about the test part? I found the glue catalog is using a mock service from
Thanks a lot for this PR, really great! Only some small suggestions.
use crate::utils::{create_metadata_location, create_sdk_config};

#[derive(Debug)]
pub struct S3TablesCatalogConfig {
Please add comments for all public structs.
}

impl S3TablesCatalog {
    pub async fn new(config: S3TablesCatalogConfig) -> Result<Self> {
Same here: please add comments for all public APIs. It would be better to include an example if it's simple.
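A documented constructor with a short example might look roughly like the sketch below. It is a simplified, self-contained stand-in rather than the PR's code: the constructor in the PR is `async` and returns the crate's `Result<Self>`, whereas this version is synchronous, returns `Result<Self, String>`, and only checks the ARN prefix. The `table_bucket_arn()` accessor, the public field, the prefix check, and the example ARN value are all assumptions made for illustration.

```rust
/// Configuration for [`S3TablesCatalog`] (simplified stand-in for illustration).
#[derive(Debug, Clone)]
pub struct S3TablesCatalogConfig {
    /// ARN of the table bucket that the catalog operates on.
    pub table_bucket_arn: String,
}

/// Catalog backed by an S3 Tables table bucket (simplified stand-in).
#[derive(Debug)]
pub struct S3TablesCatalog {
    config: S3TablesCatalogConfig,
}

impl S3TablesCatalog {
    /// Creates a catalog from `config`.
    ///
    /// Assumption: the real constructor is `async` and builds an AWS client;
    /// this sketch is synchronous and only validates the ARN prefix.
    ///
    /// # Errors
    ///
    /// Returns an error message if `table_bucket_arn` does not start with `arn:`.
    ///
    /// # Example
    ///
    ///     let config = S3TablesCatalogConfig {
    ///         // Placeholder ARN, not a real bucket.
    ///         table_bucket_arn: "arn:aws:s3tables:us-east-1:123456789012:bucket/demo".to_string(),
    ///     };
    ///     let catalog = S3TablesCatalog::new(config).unwrap();
    ///     assert!(catalog.table_bucket_arn().ends_with("bucket/demo"));
    pub fn new(config: S3TablesCatalogConfig) -> Result<Self, String> {
        // Hypothetical sanity check so that the example has an error path.
        if !config.table_bucket_arn.starts_with("arn:") {
            return Err(format!("not an ARN: {}", config.table_bucket_arn));
        }
        Ok(Self { config })
    }

    /// Returns the ARN of the table bucket this catalog points at.
    pub fn table_bucket_arn(&self) -> &str {
        &self.config.table_bucket_arn
    }
}
```

The example inside the doc comment uses an indented code block, which rustdoc also treats as a doctest, so the documented usage is checked whenever the crate's doc tests run.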
#[derive(Debug)]
pub struct S3TablesCatalogConfig {
    table_bucket_arn: String,
Hi, at first glance it's a bit confusing to me that `table_bucket_arn` is strictly required. Would you like to add a comment here explaining that all operations need `table_bucket_arn` instead of a bucket?
An s3tables bucket is quite strange: no files are stored under `s3://{s3table_bucket}/`, but every table has a special "bucket name" like `s3://{xxxxxxxx}_s3_table`, and when we access the files they are always under paths like `s3://{xxxxxxxx}_s3_table`. In the S3 admin console, you cannot browse any of the files in this s3tables bucket either.
In my understanding, this ARN is the identifier of the abstract s3tables bucket, and it's used everywhere in the s3tables SDK; the plain bucket path is almost useless to us.
Let me add it in the comments.
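The explanation above could be captured as a doc comment directly on the field, roughly as in this sketch. Only the struct name and the `table_bucket_arn: String` field come from the diff; the `new` constructor, the `table_bucket_arn()` accessor, and the wording of the comment are hypothetical.

```rust
/// Configuration for the S3 Tables catalog (simplified stand-in for illustration).
#[derive(Debug)]
pub struct S3TablesCatalogConfig {
    /// ARN of the S3 Tables table bucket.
    ///
    /// This is required because, as understood in this discussion, the
    /// S3 Tables SDK identifies a table bucket by its ARN rather than by a
    /// plain bucket name. No files live directly under the table bucket's
    /// own path; each table gets a separate storage location, so a plain
    /// bucket path is of little use to the catalog.
    table_bucket_arn: String,
}

impl S3TablesCatalogConfig {
    /// Builds a config from a table bucket ARN.
    ///
    /// Hypothetical constructor added for this sketch; not taken from the PR.
    pub fn new(table_bucket_arn: impl Into<String>) -> Self {
        Self {
            table_bucket_arn: table_bucket_arn.into(),
        }
    }

    /// Returns the table bucket ARN (hypothetical accessor).
    pub fn table_bucket_arn(&self) -> &str {
        &self.table_bucket_arn
    }
}
```

Keeping the field private and documenting it in one place means the reasoning for the hard requirement travels with the type instead of living only in the review thread.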
fixes #754
This PR adds the implementation of the s3tables catalog. I've tested the CRUD of namespaces/tables on my local laptop with a real s3tables bucket.
A mocked test suite for CI still needs to be added in another PR.