When I run the attached code, I get an error:
Job for a_test__sub_record.1d2b10ead9.insert_values failed terminally in load 1726687518.354422 with message null value in column "_dlt_parent_id" of relation "a_test__sub_record" violates not-null constraint
DETAIL: Failing row contains (a, PztIkGSe+oI/8Q, null, null, 2KkBaSEHNC3a/A).
. The package is aborted and cannot be retried.
The reason is that the "parent:" field in the schema yaml is missing.
Part of dlt knows that "sub_record" is a nested table (because it generates a table name a_test__sub_record with two underscores), but another part (I think it's schema.utils.is_nested_table) doesn't know it's a nested table, because the "parent" clause is missing in the schema.
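To make the mismatch concrete, here is a minimal sketch of the two views described above. It is not dlt's code: the dictionary layout is a simplified assumption about what a schema file contains, and `nested_by_name` and `nested_by_hint` are hypothetical helpers written only for illustration.

```python
# Hypothetical, simplified view of two table entries from a schema file.
# Only the "parent" hint and the double-underscore naming come from the
# report above; the rest of the layout is an assumption for illustration.
tables = {
    "a_test": {
        "columns": {"_dlt_id": {"data_type": "text"}},
    },
    "a_test__sub_record": {
        # "parent": "a_test",  # the one line that was not copied over
        "columns": {
            "_dlt_parent_id": {"data_type": "text", "nullable": False},
            "_dlt_id": {"data_type": "text"},
        },
    },
}


def nested_by_name(table_name: str) -> bool:
    """Name-based view: a double underscore suggests a nested table."""
    return "__" in table_name


def nested_by_hint(table: dict) -> bool:
    """Hint-based view: only an explicit "parent" entry marks a nested table."""
    return "parent" in table


# Tables on which the two views disagree.
disagreements = [
    name
    for name, table in tables.items()
    if nested_by_name(name) != nested_by_hint(table)
]
print(disagreements)  # ['a_test__sub_record']
```

With the commented-out `parent` line restored, both views agree and the list is empty.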
My use case: I copy schema from the export schema to the import schema to control types, but I forgot to copy one line.
It took me a couple of hours to track this down, by stepping through the loading code in a debugger. It could be easier.
This is a "quality of life" issue. I've found dlt is working well, but if I do anything wrong, it can be hard to figure out what's happening.
Proposed solution
The behavior I'd like is: dlt stops earlier (say, during schema parsing, or extract, instead of load), and says "a_test__sub_record is a nested table, but missing a parent declaration in the schema".
Related issues
No response
@boxydog in 1.0 you can dispatch data from nested structures to "root" tables in such a way that dlt will not add its standard linking. This is what happens above. We are still working to make this behavior easy to declare (#1713 and #1647).
What we could do: if we see a table with "parent_key" (_dlt_parent_id by default) but without a "parent" table hint, we fail the normalization (a good idea, actually).
Did you manipulate the import schema yourself to get rid of parent? If not, we have a bug somewhere, and that is way more serious. Also, please note that the lack of "parent" will merge a_test__sub_record in a separate job, probably switching to append since the table definition lacks a primary key.
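A rough sketch of the check suggested in this comment (fail early when a table carries the parent key but has no "parent" hint). This is not dlt's implementation: `find_tables_missing_parent` and `validate_parent_hints` are hypothetical names, the schema is assumed to be a plain dictionary of tables, and the parent key is assumed to be detectable by its default column name `_dlt_parent_id`.

```python
DEFAULT_PARENT_KEY = "_dlt_parent_id"  # default name mentioned in this thread


def find_tables_missing_parent(tables: dict, parent_key: str = DEFAULT_PARENT_KEY) -> list:
    """Return names of tables that have the parent key column but no "parent" hint."""
    missing = []
    for name, table in tables.items():
        has_parent_key = parent_key in table.get("columns", {})
        if has_parent_key and not table.get("parent"):
            missing.append(name)
    return missing


def validate_parent_hints(tables: dict) -> None:
    """Raise before any data is loaded if a nested table lacks its parent."""
    missing = find_tables_missing_parent(tables)
    if missing:
        raise ValueError(
            ", ".join(missing)
            + " has a parent key column but is missing a parent declaration in the schema"
        )


# Hypothetical schema fragment reproducing the situation from the report.
tables = {
    "a_test": {"columns": {"_dlt_id": {}}},
    "a_test__sub_record": {"columns": {"_dlt_parent_id": {}, "_dlt_id": {}}},
}

try:
    validate_parent_hints(tables)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Keying the check on the parent key column rather than on the table name avoids flagging root tables whose names merely contain a double underscore.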
> what we could do is that if we see a table with "parent_key" (_dlt_parent_id by default) but without "parent" table hint we fail the normalization (a good idea actually)
Yes please.
> did you manipulate import schema yourself to get rid of parent?
Yes.
I copy schema from the export schema to the import schema to control types, but I forgot to copy one line.
Attached code: test_files.tgz (unpack with "tar xvfz test_files.tgz").
Are you a dlt user?
Yes, I'm already a dlt user.