Starburst materialization documentation #1363

Merged: 1 commit, Feb 7, 2024

@pajaks pajaks commented Feb 2, 2024

Description:

Add starburst documentation



To get HOST and PORT, go to your Cluster -> Connection info.

You also need to grant access to temporary storage (Roles and privileges -> Select specific role -> Privileges -> Add privilege -> Location). Select "Create schema and table in location".

Add a link to https://docs.starburst.io/starburst-galaxy/cluster-administration/manage-cluster-access/manage-users-roles-and-tags/account-and-cluster-privileges-and-entities.html#location-privileges-

Also mention that the location privileges should correspond to the location of the schema configured for this connector.

Galaxy has a list of reserved words that must be quoted in order to be used as an identifier. Flow automatically quotes fields that are in the reserved words list. You can find this list in Trino's documentation [here](https://trino.io/docs/current/language/reserved.html) and in the table below.

:::caution
In Galaxy, objects created with quoted identifiers must always be referenced exactly as created, including the quotes. Otherwise, SQL statements and queries can result in errors. See the [Trino docs](https://trino.io/docs/current/language/reserved.html#language-identifiers).

"Quoted identifiers" is rather abstract for a newcomer to the Trino ecosystem.

Maybe provide an example.

Example is in Trino docs.
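
For readers who want something concrete in the thread itself, here is a minimal sketch of what quoting a reserved word looks like. It is an illustration only, not how Flow implements its quoting: the helper names, the `orders` table, and the column names are hypothetical, and the reserved-word set is a small subset of the list in the Trino documentation. It relies on Trino's rule that identifiers are quoted with double quotes and that an embedded double quote is escaped by doubling it.

```python
# Illustrative subset of Trino's reserved words; the full list is in the
# Trino documentation linked in the doc text above.
RESERVED_WORDS = {"SELECT", "TABLE", "ORDER", "GROUP", "VALUES"}


def quote_identifier(name: str) -> str:
    """Wrap name in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def render_identifier(name: str) -> str:
    """Quote name only if it is in the (illustrative) reserved-word set."""
    if name.upper() in RESERVED_WORDS:
        return quote_identifier(name)
    return name


# Hypothetical field names; "order" and "group" are reserved words.
columns = ["id", "order", "group"]
statement = "SELECT {} FROM orders".format(
    ", ".join(render_identifier(c) for c in columns)
)
print(statement)  # SELECT id, "order", "group" FROM orders
```

Without the quotes around `order` and `group`, the statement would fail to parse, which is the situation the caution is warning about.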

@pajaks pajaks force-pushed the starburst_doc branch 2 times, most recently from ad52abc to ff0874b Compare February 2, 2024 13:01
* A Starburst Galaxy account (to create one, see [Starburst Galaxy start](https://www.starburst.io/platform/starburst-galaxy/start/)) that includes:
* A running cluster containing an [Amazon S3](https://docs.starburst.io/starburst-galaxy/working-with-data/create-catalogs/object-storage/s3.html) catalog
* A [schema](https://docs.starburst.io/starburst-galaxy/data-engineering/working-with-data-lakes/table-formats/index.html#create-schema), which is a logical grouping of tables
* Storage on S3 for temporary data, with an `awsAccessKeyId` and `awsSecretAccessKey` that correspond to the catalog in use

Is there a reason why we didn't use a cross-account IAM role?


To get the host, go to your Cluster -> Connection info -> Other clients (see [Connect clients](https://docs.starburst.io/starburst-galaxy/working-with-data/query-data/connect-clients.html)).

You also need to grant access to temporary storage (Roles and privileges -> Select specific role -> Privileges -> Add privilege -> Location). Select "Create schema and table in location". [Doc](https://docs.starburst.io/starburst-galaxy/cluster-administration/manage-cluster-access/manage-users-roles-and-tags/account-and-cluster-privileges-and-entities.html#location-privileges-)

What location would one specify?

* A user with an assigned role that grants access to create, modify, and drop tables in the specified Amazon S3 catalog
* At least one Flow collection

### Setup

Should specify that this is the setup for Starburst Galaxy.


You also need to grant access to temporary storage (Roles and privileges -> Select specific role -> Privileges -> Add privilege -> Location). Select "Create schema and table in location". [Doc](https://docs.starburst.io/starburst-galaxy/cluster-administration/manage-cluster-access/manage-users-roles-and-tags/account-and-cluster-privileges-and-entities.html#location-privileges-)

## Configuration

Should call this setup for Estuary, or call the other one configuration. Trying to be consistent in terminology for the setup steps for both products.

To use this connector, begin with data in one or more Flow collections.
Use the properties below to configure a Starburst materialization, which will direct one or more of your Flow collections to new Starburst tables.

### Properties

Should we put an example?

nm. I see it below

| **`/awsAccessKeyId`** | AWS Access Key ID | | string | Required |
| **`/awsSecretAccessKey`** | AWS Secret Access Key | | string | Required |
| **`/region`** | AWS Region | Region of AWS storage | string | Required |
| **`/bucket`** | Bucket name | | string | Required |

Are these for the temporary storage?
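
As a rough illustration of the four required rows quoted above, the sketch below checks that a configuration supplies each of them. The prerequisites list `awsAccessKeyId` and `awsSecretAccessKey` under the S3 storage for temporary data; treating `region` and `bucket` as describing the same storage is an assumption here. The helper name and all values are hypothetical placeholders, and this is not the connector's own validation.

```python
# Required storage properties, taken from the table rows quoted above
# (the leading "/" in the table is the JSON-pointer prefix).
REQUIRED_STORAGE_KEYS = ("awsAccessKeyId", "awsSecretAccessKey", "region", "bucket")


def missing_storage_keys(config: dict) -> list:
    """Return the required keys that are absent or empty in config."""
    return [key for key in REQUIRED_STORAGE_KEYS if not config.get(key)]


# Placeholder values only; substitute real credentials and names.
example_config = {
    "awsAccessKeyId": "<access_key_id>",
    "awsSecretAccessKey": "<secret_access_key>",
    "region": "us-east-1",
    "bucket": "<bucket_name>",
}

print(missing_storage_keys(example_config))           # []
print(missing_storage_keys({"region": "us-east-1"}))  # the three other keys
```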

dyaffe

This comment was marked as outdated.

@dyaffe dyaffe marked this pull request as ready for review February 7, 2024 14:58
dyaffe commented Feb 7, 2024

LGTM

@dyaffe dyaffe merged commit 88f45ef into estuary:master Feb 7, 2024
2 of 4 checks passed
4 participants