Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make merge write-disposition fall back to append if no primary or merge keys are specified #1225

Merged
merged 11 commits into from
Apr 24, 2024

Conversation

sh-rp
Copy link
Collaborator

@sh-rp sh-rp commented Apr 16, 2024

Description

Fall back to append if no merge keys are available

@sh-rp sh-rp self-assigned this Apr 16, 2024
Copy link

netlify bot commented Apr 16, 2024

Deploy Preview for dlt-hub-docs ready!

Name Link
🔨 Latest commit 1214db0
🔍 Latest deploy log https://app.netlify.com/sites/dlt-hub-docs/deploys/6627cb486bde580008f18715
😎 Deploy Preview https://deploy-preview-1225--dlt-hub-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@sh-rp
Copy link
Collaborator Author

sh-rp commented Apr 16, 2024

@rudolfix comments from the last pr (which somehow got automatically closed when I reset this branch)

===
it is a good start :)

there's already test for it (test_merge_no_merge_keys)
we should also test what happens if we have hard delete column
filesystem also fallbacks to replace when there are no keys (and I think there's a test for it)
what about vector databases? they support limited merge (on primary key). I think they do not fallback to replace but make sure

===

@@ -3,6 +3,7 @@
import pytest
import random
from os import environ
import io
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i pulled this over from my filesystem state branch because i needed those changes here.

@@ -226,6 +226,10 @@ def destinations_configs(
DestinationTestConfiguration(destination="synapse", supports_dbt=False),
]

# sanity check that when selecting default destinations, one of each sql destination is actually
# provided
assert set(SQL_DESTINATIONS) == {d.destination for d in destination_configs}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has happened a few times already... not really related to this pr


assert local_path.read_bytes() == client.fs_client.read_bytes(destination_path)


def test_pipeline_merge_write_disposition(default_buckets_env: str) -> None:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test shows that filesystem always falls back to append when "merge" is set. this is also what it says in the docs. I'm not sure why this test says it would replace when there is a primary key, this did not work before and it does not work that way now and this test was somehow very strange (not sure if I wrote that) but now it runs correctly. I we want to replace in certain cases, I have to add it and specify that in the docs.

@sh-rp sh-rp changed the title D#/merge fallback Make merge write-disposition fall back to append if no primary or merge keys are specified Apr 17, 2024
@sh-rp sh-rp added the ci full run the full load tests on pr label Apr 17, 2024
@sh-rp sh-rp marked this pull request as ready for review April 17, 2024 14:19
# Conflicts:
#	tests/load/pipeline/test_filesystem_pipeline.py
@sh-rp sh-rp requested a review from rudolfix April 19, 2024 11:47
Copy link
Collaborator

@rudolfix rudolfix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@rudolfix rudolfix merged commit 6879a09 into devel Apr 24, 2024
48 checks passed
@rudolfix rudolfix deleted the d#/merge_fallback branch April 24, 2024 11:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci full run the full load tests on pr
Projects
None yet
Development

Successfully merging this pull request may close these issues.

change the merge behavior when no primary key and merge key are specified to append
2 participants