Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release/web/snowflake/1.0.1 #126

Merged
merged 8 commits into from
Feb 24, 2022
Merged

Conversation

agnessnowplow
Copy link
Contributor

New Snowflake web data model release to fix the following issues:

Snowflake Web: Update column check stored procedure (Close #125)
Snowflake Web: Fix varchar length for pseudonymized fields (Close #122)
Snowflake Web: Remove start_date variable from users module (Close #123)
Snowflake Web: Update copyright notices (Close #124)
Snowflake Web: Fix logic in users_sessions_this_run to account for sparse data (Close #120)
Snowflake Web: Fix varchar length for yauaa columns (Close #97)
Snowflake Web: Fix se_label column length in events_staged (Close #109)

pr_check has been run, which passed.

The prepare for release still needs to be added and the release notes should contain the DDL for users to accommodate the new version with increased column lengths.

Copy link
Contributor

@bill-warner bill-warner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job on this. All the fixes look good. The only points I have around the stored procedure, which I think we could potentially clean up. We can catch up on this if you want

@@ -89,12 +89,18 @@ CREATE OR REPLACE PROCEDURE {{.output_schema}}.column_check(SOURCE_SCHEMA VARCHA
var delim = '~';
var sourceColumns = list_cols_with_type(SOURCE_SCHEMA,SOURCE_TABLE,delim).split(delim);
var targetColumns = list_cols_with_type(TARGET_SCHEMA,TARGET_TABLE,delim).split(delim);
var sourceColumnLengths = list_cols_with_length(SOURCE_SCHEMA,SOURCE_TABLE,delim).split(delim);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a huge fan of needing these extra helper functions and ultimately queries. I am wondering would be easier to return all the columns as an array of objects:

with prep as (       
select
    column_name,
    ordinal_position,
    data_type,
    character_maximum_length,
    numeric_precision,
    numeric_scale,
    case 
      when isc.data_type='TEXT' then CONCAT(isc.column_name, ' VARCHAR(',isc.character_maximum_length, ')')
      when isc.data_type='NUMBER' then CONCAT(isc.column_name, ' NUMBER(', isc.numeric_precision, ',',isc.numeric_scale, ')')
      else CONCAT(isc.column_name, ' ', isc.data_type)
    end as column_definition

from information_schema.columns isc

where table_schema = upper('scratch')
and table_name = upper('page_views_staged_test_0')
order by ordinal_position
)
  
select arrayagg(object_construct(*)) within group (order by ordinal_position asc) from prep

This would mean we can remove list_cols_with_length altogether. You can then extract the info you need from each object without having to run more queries. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the valuable feedback, I agree that it could be cleaner. Just added a reworked stored procedure commit, similar to the mobile model, without helper functions.

Copy link
Contributor

@bill-warner bill-warner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the thinking. I still have a few outstanding questions with the stored procedure and one potential means to simplify a bit. interested to get your thoughts. It may be that I glossing over some of the finer details.

function add_columns_to(sch, tbl, cols) {
var alter_stmt = `ALTER TABLE ` + sch + `.` + tbl + ` ADD COLUMN ` + cols;
snowflake.createStatement({sqlText: alter_stmt}).execute();
if (cols_to_add !== '') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be a bit clearer to use missing_in_target > 0 here. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, this has been taken into consideration for the final commit.

ORDER BY ordinal_position
),

varchar_check AS (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we simply take the script from here

And add after here:

, ',') WITHIN GROUP (ORDER BY sc.ordinal_position) AS cols_to_add

LISTAGG(CASE WHEN tc.column_name IS NOT NULL AND sc.character_maximum_length > tc.character_maximum_length THEN sc.column_name END, ',') as cols_w_incompatible_char_limits

Does that get us the details you need?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it looks like it does. As agreed, to keep it in synch with the mobile model I have changed it according to the above.

Copy link
Contributor

@bill-warner bill-warner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Well done. Assuming the pr_check passed validation with the latest changes then I think we are good. Last things to do:

  • Squash the latter commits where needed.
  • Fix the typo in this commit message: Snowfalke web: Update column check stored procedure
  • Add prepare for release commit. You can see an example of what needs to be changed here: a41c728. In short, update the model version in the GE expectation files, update the playbook's version number, update both CHANGELOGS

Copy link
Contributor

@bill-warner bill-warner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one more correction I think

CHANGELOG Outdated Show resolved Hide resolved
@agnessnowplow agnessnowplow merged commit eaeb04a into master Feb 24, 2022
@agnessnowplow agnessnowplow deleted the release/web/snowflake/1.0.1 branch September 1, 2022 08:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment