Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add region kw argument to login. #581

Conversation

NullHypothesis
Copy link
Contributor

@NullHypothesis NullHypothesis commented Jun 6, 2024

As mentioned in this story, there are several ways to allow users to specify the region:

  1. Modify tiledb.cloud.client.config.config.host directly, which is the status quo.
  2. Add a region keyword argument to classes like DAG, Delayed, etc.
  3. Specify the region as part of logging in (this PR implements a PoC).

The first option is confusing to customers, so we need a better approach. I've been playing with the second approach and find it confusing too: It's easy to forget to pass the region argument. Besides, if a user wants to run a task graph in region A, odds are that they want all their code to run in region A. The third approach strikes me as the safest and least confusing but it feels odd having to specify the region as part of the login function, like so:

tiledb.cloud.login(..., region=tiledb.cloud.AwsRegion.AP_SOUTHEAST_1)

Or perhaps we should instead expose a new function, like tiledb.cloud.set_region()? Any thoughts on this?

region_url = urllib3.util.Url(
scheme=orig_url.scheme,
# TODO: Validate region.
host=region + ".aws." + orig_url.host,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make the region be <region>.<cloud_provider> so its future proofed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point. I turned the region into an enum, which also helps with validating the given region.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don’t want to make this AWS-specific; since this is just going to replace the host parameter in the endpoint, would it make sense to have a set of constants that the user can pass in?

AWS_US_EAST_1 = "https://us-east-1.aws.api.tiledb.com"
# etc.

# Or namespace it with classes:
class AWS:
  US_EAST_1 = "https://us-east-1.aws.api.tiledb.com"

class OtherCloud:
  SOME_REGION: "https://some-region.other.api.tiledb.com"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking with the enum was that we simply add another one as we support new cloud providers. That said, if y'all prefer constants, then that's fine with me.

Also, in de2950f, I revised the code to expose a separate function instead, tiledb.cloud.login_to_region(). This feels more natural to use.

@NullHypothesis NullHypothesis force-pushed the phw/sc-48037/design-improve-support-for-running-task-graphs branch 2 times, most recently from 3ca5333 to 52252f1 Compare June 10, 2024 22:06
@NullHypothesis NullHypothesis changed the title PoC: Add region kw argument to login. Add region kw argument to login. Jun 10, 2024
@NullHypothesis NullHypothesis marked this pull request as ready for review June 10, 2024 22:57
:param region: the cloud region to run code in
:return:
"""
if not region in AWS:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this should be if region not in AWS instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're semantically identical but your suggestions is more readable, so I addressed this in 5af2601.

AP_SOUTHEAST_1 = Asia, Singapore
"""

US_EAST_1 = "https://us-east-1.aws.api.tiledb.com"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can expose a method to the clients to just print all the available regions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My expectation was that users turn to the docs to see what regions are supported. That said, if you see value in such a function then I'm happy to add it.

@@ -63,6 +63,36 @@ def Ctx(config=None):
return tiledb.Ctx(Config(config))


class AWS(enum.Enum):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call this AWSRegion or namespace it inside a Region class (so it shows up as Region.AWS.whatever)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 445c41d.

:return:
"""
if region not in AWS:
raise Exception("Cloud region is invalid")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise Exception("Cloud region is invalid")
raise ValueError(f"region {region} is invalid")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 25bccf9.

Copy link
Contributor

@thetorpedodog thetorpedodog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m still skeptical of this being its own function which takes enums rather than just a set of string constants that you can pass to host. Among other things, as we add new regions, existing installs would have a divide where existing regions can use the region type and the login_to_region function, but new regions would use login with host=.

In short, to me,

tiledb.cloud.login(host=tiledb.cloud.Region.AWS.US_EAST_1)

(or maybe make region into a module, tiledb.cloud.regions)

seems like a perfectly reasonable thing to write, and doesn’t require any new functions.

@NullHypothesis NullHypothesis force-pushed the phw/sc-48037/design-improve-support-for-running-task-graphs branch from 9aa1b76 to e5401a6 Compare June 13, 2024 13:32
@NullHypothesis
Copy link
Contributor Author

In short, to me,

tiledb.cloud.login(host=tiledb.cloud.Region.AWS.US_EAST_1)

(or maybe make region into a module, tiledb.cloud.regions)

seems like a perfectly reasonable thing to write, and doesn’t require any new functions.

See e5401a6.

Copy link
Contributor

@thetorpedodog thetorpedodog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one suggestion but looks good overall.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe put this directly into region/__init__.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, addressed in the squashed commit that I just force-pushed. Thanks for all your feedback!

Some users need to log in to specific regions only.  This PR exposes
host constants that make this easier, like so:

    import tiledb.cloud
    from tiledb.cloud.region import AWS
    tiledb.cloud.login(host=AWS.US_EAST_1)

As of today, we support five AWS regions.
@NullHypothesis NullHypothesis force-pushed the phw/sc-48037/design-improve-support-for-running-task-graphs branch from 3121ee2 to 53ce099 Compare June 14, 2024 15:06
@NullHypothesis NullHypothesis merged commit b1ff828 into main Jun 16, 2024
18 checks passed
@NullHypothesis NullHypothesis deleted the phw/sc-48037/design-improve-support-for-running-task-graphs branch June 16, 2024 22:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants