Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filesystem improvement and fixes #1274

Merged
merged 11 commits into from
Apr 24, 2024
Merged

filesystem improvement and fixes #1274

merged 11 commits into from
Apr 24, 2024

Conversation

sh-rp
Copy link
Collaborator

@sh-rp sh-rp commented Apr 24, 2024

Description

  • Add a fs base client that can be accessed from the pipeline with a few useful methods to inspect the loaded files and in the process refactors the filesystem destination a bit for more clarity and easier testing.
  • Cleans up some tests utils a bit, still ongoing
  • Adds better tests for state and truncating tables with different path layouts and fixes 2 tests along the way

Fixes two bugs:

  • State restore not working for certain layout settings
  • Truncating tables in subfolders for certain layout settings did not work

Copy link

netlify bot commented Apr 24, 2024

Deploy Preview for dlt-hub-docs ready!

Name Link
🔨 Latest commit b93b825
🔍 Latest deploy log https://app.netlify.com/sites/dlt-hub-docs/deploys/66294b310b70fe000892c2ca
😎 Deploy Preview https://deploy-preview-1274--dlt-hub-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@sh-rp sh-rp marked this pull request as ready for review April 24, 2024 14:38
from fsspec import AbstractFileSystem


class FSClientBase(ABC):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is to be debated, I find it very useful, but we can also make it private for now

@@ -956,6 +958,21 @@ def sql_client(self, schema_name: str = None, credentials: Any = None) -> SqlCli
schema = self._get_schema_or_create(schema_name)
return self._sql_job_client(schema, credentials).sql_client

def fs_client(self, schema_name: str = None, credentials: Any = None) -> FSClientBase:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more or less analog to sql_client

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename it to _fs_client until we discuss the exact interface.

@sh-rp sh-rp changed the title [wip] filesystem improvement and fixes filesystem improvement and fixes Apr 24, 2024
Copy link
Collaborator

@rudolfix rudolfix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd hide the _fs_client. otherwise LGTM.
the FsClient interface: we need to take a look what we need to expose to end user and then make it public

@@ -956,6 +958,21 @@ def sql_client(self, schema_name: str = None, credentials: Any = None) -> SqlCli
schema = self._get_schema_or_create(schema_name)
return self._sql_job_client(schema, credentials).sql_client

def fs_client(self, schema_name: str = None, credentials: Any = None) -> FSClientBase:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename it to _fs_client until we discuss the exact interface.

@rudolfix rudolfix merged commit 0587f36 into devel Apr 24, 2024
47 of 48 checks passed
@rudolfix rudolfix deleted the d#/load_utils_cleanup branch April 24, 2024 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants