implements pagination #811

matt-winkler · 2023-10-18T22:02:36Z

resolves #810

Problem

Currently, dbt-snowflake requests ALL schemas in a target database via the show terse objects command. On some networks this becomes a bottleneck when the response is large enough that Snowflake has to return the data via internal stages.

Solution

This PR paginates the requests that list schemas in the target database.

The solution is similar to that within #572, but for paginating the listing of schemas instead of paginating the listing of relations.
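For readers who have not seen #572, here is a minimal Python sketch of the watermark-style pagination idea as I understand it. It is not the macro from this PR: `fetch_page` and `make_fake_fetch` are hypothetical stand-ins for running one paginated query, and only the parameter names `max_iter` and `max_results_per_iter` (with defaults of 10 and 1000) come from this thread.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for one paginated query: returns up to `limit`
# schema names that sort after `watermark` (None means "start from the top").
FetchPage = Callable[[Optional[str], int], List[str]]


def list_schemas_paginated(
    fetch_page: FetchPage,
    max_iter: int = 10,
    max_results_per_iter: int = 1000,
) -> List[str]:
    """Collect schema names page by page, resuming after the last name seen."""
    schemas: List[str] = []
    watermark: Optional[str] = None
    for _ in range(max_iter):
        page = fetch_page(watermark, max_results_per_iter)
        schemas.extend(page)
        # A short page means there is nothing left to fetch.
        if len(page) < max_results_per_iter:
            return schemas
        watermark = page[-1]
    # Every allowed page came back full, so the listing may be incomplete.
    raise RuntimeError(
        f"Too many schemas: reached the limit of "
        f"{max_iter * max_results_per_iter} results"
    )


def make_fake_fetch(all_names: List[str]) -> FetchPage:
    """Build an in-memory fetcher over a sorted list, for illustration only."""
    ordered = sorted(all_names)

    def fetch(watermark: Optional[str], limit: int) -> List[str]:
        remaining = [n for n in ordered if watermark is None or n > watermark]
        return remaining[:limit]

    return fetch
```

Each request stays bounded by `max_results_per_iter`, which is the point of the change: no single response has to carry every schema in the database.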

NOTE: what performance concerns do we have with this?

Checklist

I have read the contributing guide and understand what's expected of me
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

cla-bot · 2023-10-18T22:02:38Z

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: matt-winkler.
This is most likely caused by a git client misconfiguration; please make sure to:

Check if your git client is configured with an email to sign commits: git config --list | grep email
If not, set it up using: git config --global user.email [email protected]
Make sure that the git commit email is configured in your GitHub account settings; see https://github.com/settings/emails


cla-bot · 2023-10-19T14:51:51Z

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: matt-winkler.
This is most likely caused by a git client misconfiguration; please make sure to:

Check if your git client is configured with an email to sign commits: git config --list | grep email
If not, set it up using: git config --global user.email [email protected]
Make sure that the git commit email is configured in your GitHub account settings; see https://github.com/settings/emails

mikealfare · 2023-10-23T16:40:23Z

.changes/unreleased/Under the Hood-20231019-084132.yaml

This looks like a duplicate changie, can this be removed?

mikealfare · 2023-10-23T16:51:34Z

dbt/include/snowflake/macros/adapters.sql

-    {% do exceptions.raise_compiler_error(msg) %}
-  {% endif %}
-  {{ return(result) }}
+{% macro snowflake__get_paginated_schemas_array(max_iter, max_results_per_iter, max_total_results, database, watermark) %}


This code looks familiar, did we use pagination for a different query recently? If so, is there any way to reuse that, or to augment that so that both use cases can use it? The only piece that appears to be use case specific is paginated_sql and the error message. The former could be an argument to this macro and the latter could probably be made more generic (e.g. swap schemas for objects).

dbeatty10 · 2023-10-31

Yep, you are remembering correctly, @mikealfare!

The solution is similar to that within #572, but for paginating the listing of schemas instead of paginating the listing of relations.
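To make the reuse suggestion concrete, here is a rough Python sketch of the two pieces identified above as use case specific: the paginated SQL and the error message. Everything here is an assumption for illustration; the template strings, `render_page_sql`, and `too_many_results_message` are hypothetical names and do not come from the adapter.

```python
from typing import Optional

# Hypothetical templates: the statements the adapter actually issues may differ.
SCHEMAS_SQL = "show terse schemas in database {scope} limit {limit}{from_clause}"
RELATIONS_SQL = "show terse objects in {scope} limit {limit}{from_clause}"


def render_page_sql(
    template: str, scope: str, limit: int, watermark: Optional[str] = None
) -> str:
    """Fill in one page's query; the first page has no watermark to resume from."""
    from_clause = f" from '{watermark}'" if watermark else ""
    return template.format(scope=scope, limit=limit, from_clause=from_clause)


def too_many_results_message(
    object_type: str, scope: str, max_total_results: int
) -> str:
    """Error text that varies only by the kind of object being listed."""
    return (
        f"Too many {object_type} in {scope}: "
        f"the limit is {max_total_results} results."
    )
```

With those two pieces passed in as arguments, a single pagination routine could in principle serve both the schema listing and the relation listing.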

mikealfare · 2023-10-23T17:04:49Z

tests/functional/adapter/test_list_schemas.py

+
+    def test__snowflake__list_schemas_termination(self, project):
+        """
+        validates that we do NOT trigger pagination logic snowflake__list_relations_without_caching


I'm not clear on how this validates that the pagination logic is not triggered. I agree that it shouldn't since we're allowing for 200 schemas per result (line 127) but only creating 100 schemas (line 125 and line 15). However, we're only checking for the correct number of schemas at the end, which I would think is the same regardless of whether pagination was used.
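One possible way to make that distinction observable is to count how many pages were requested, not just how many schemas came back. The Python sketch below illustrates the idea with an in-memory stand-in; `CountingFetcher` and `collect` are hypothetical helpers, not part of dbt or of this PR, and a real test would need some equivalent hook into the queries the adapter runs.

```python
from typing import List, Optional


class CountingFetcher:
    """Hypothetical page fetcher that records how many pages were requested."""

    def __init__(self, names: List[str]) -> None:
        self._names = sorted(names)
        self.calls = 0

    def __call__(self, watermark: Optional[str], limit: int) -> List[str]:
        self.calls += 1
        remaining = [n for n in self._names if watermark is None or n > watermark]
        return remaining[:limit]


def collect(fetch: CountingFetcher, max_iter: int, max_results_per_iter: int) -> List[str]:
    """Minimal pagination loop used only to drive the fetcher in this sketch."""
    results: List[str] = []
    watermark: Optional[str] = None
    for _ in range(max_iter):
        page = fetch(watermark, max_results_per_iter)
        results.extend(page)
        if len(page) < max_results_per_iter:
            break
        watermark = page[-1]
    return results


schema_names = [f"schema_{i:03d}" for i in range(100)]

# 100 schemas with 200 allowed per page: one request, no pagination.
single = CountingFetcher(schema_names)
single_result = collect(single, max_iter=1, max_results_per_iter=200)

# Same schemas with 50 per page: the final count is identical,
# but the number of requests shows that pagination happened.
paged = CountingFetcher(schema_names)
paged_result = collect(paged, max_iter=10, max_results_per_iter=50)
```

Both runs return the same 100 schemas, so only the `calls` counter tells the two cases apart.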

mikealfare · 2023-10-23T17:07:12Z

tests/functional/adapter/test_list_schemas.py

+        }
+
+    def test__snowflake__list_schemas(self, project):
+        """


I don't think providing the arguments max_iter:1 and max_results_per_iter:200 results in different behavior than the default (10 and 1000 respectively) since we're only creating 100 test schemas. If that's the case, is this the same test case as TestListSchemasSingle.test__snowflake__list_schemas_termination?

mikealfare · 2023-10-23T17:11:02Z

tests/functional/adapter/test_list_schemas.py

+        validates pagination logic terminates and raises a compilation error
+        when exceeding the limit of how many results to return.
+        """
+        run_dbt(["run"])


Since there are no models or tests, does this do anything here? It doesn't appear to be needed in the other tests. Also, if this does something, it potentially makes the tests run-order-dependent. If the first method in this class runs first, then it runs before run_dbt(["run"]) has been called. If the second method in this class runs first, then the first method runs after run_dbt(["run"]) has been called. This kind of order dependence creates test failures that are difficult to track down. We just resolved something like this prior to releasing 1.7.0rc1.

mikealfare · 2023-10-23T17:14:34Z

tests/functional/adapter/test_list_schemas.py

I like that you considered positive and negative test cases that cover a few scenarios. You structured the test cases very similarly. I think you could take this one step further and make it a single parameterized test case, making it easy to see the three scenarios (e.g. if <scenario 1> then <expected outcome 1>, etc.). If parameterized tests are a new thing, let me know if you want to pair on it (or if you just want to pair on it).
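As a rough illustration of the single parameterized test idea, the three scenarios could be written as one table of inputs and expected outcomes. The Python sketch below uses plain functions so it runs on its own; with pytest, the same rows could be passed to `pytest.mark.parametrize`. `simulated_list_schemas` is a hypothetical stand-in for the real call, and apart from the first row (100 schemas, `max_iter` of 1, 200 results per iteration) the numbers are assumptions.

```python
from typing import List, Tuple, Union

# Columns: scenario name, schemas created, max_iter, max_results_per_iter,
# and the expected outcome (a count on success, an exception type on failure).
Scenario = Tuple[str, int, int, int, Union[int, type]]

SCENARIOS: List[Scenario] = [
    ("single page", 100, 1, 200, 100),
    ("several pages", 100, 11, 10, 100),
    ("limit exceeded", 100, 2, 10, RuntimeError),
]


def simulated_list_schemas(created: int, max_iter: int, limit: int) -> int:
    """Hypothetical stand-in: pagination stops at the first page that is not full."""
    pages_needed = created // limit + 1
    if pages_needed > max_iter:
        raise RuntimeError("Too many schemas: pagination limit reached")
    return created


def check_scenario(
    name: str, created: int, max_iter: int, limit: int, expected: Union[int, type]
) -> bool:
    """Run one scenario and compare the outcome with the expectation."""
    if isinstance(expected, type) and issubclass(expected, Exception):
        try:
            simulated_list_schemas(created, max_iter, limit)
        except expected:
            return True
        raise AssertionError(f"{name}: expected {expected.__name__}")
    actual = simulated_list_schemas(created, max_iter, limit)
    assert actual == expected, f"{name}: got {actual}, expected {expected}"
    return True
```

Laying the scenarios out as rows makes it easy to see at a glance which inputs lead to which outcome, and adding a fourth case is a one-line change.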

matt-winkler · 2023-11-06T20:57:18Z

@mikealfare Thanks for the review on this. I haven't forgotten about it; I'm tied up with quarter review items but plan to pick it up again soon.

github-actions · 2024-05-05T01:46:15Z

This PR has been marked as Stale because it has been open with no activity as of late. If you would like the PR to remain open, please comment on the PR or else it will be closed in 7 days.

github-actions · 2024-05-12T01:48:06Z

Although we are closing this PR as stale, it can still be reopened to continue development. Just add a comment to notify the maintainers.

dbeatty10 · 2024-05-13T12:50:03Z

@matt-winkler we can re-open this any time you want.

implements pagination

4c72e52

matt-winkler requested a review from a team as a code owner October 18, 2023 22:02

matt-winkler requested a review from colin-rogers-dbt October 18, 2023 22:02

dbeatty10 added the ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering label Oct 19, 2023

McKnight-42 reviewed Oct 19, 2023

dbt/include/snowflake/macros/adapters.sql (outdated, resolved)

remove commented code

5735d80

McKnight-42 added the ok to test label Oct 19, 2023

update changelog

2536a52

cla-bot bot added the cla:yes label Oct 19, 2023

mikealfare reviewed Oct 23, 2023


Merge branch 'main' into feature/paginate-schemas-in-database

1dce46a

github-actions bot added the Stale label May 5, 2024

github-actions bot closed this May 12, 2024

mikealfare deleted the feature/paginate-schemas-in-database branch July 17, 2024 23:58
