Is this a new bug in dbt-core?
I have searched the existing issues, and I could not find an existing issue for this bug.
Current Behavior
Starting with dbt version 1.7, when a bucket retention policy is set, the Python model build throws the error: `Unhandled error while executing target/run/restream_bi/models/python/python_model_test.py: Google Cloud Dataproc Agent reports job failure.`
Expected Behavior
In dbt version 1.6, the Python model builds without error even when a bucket retention policy is set.
Steps To Reproduce
Change the dbt version to 1.7 or later in the dev environment.
Set a value for the bucket retention policy in the bucket details in Google Cloud.
Run `dbt run -s <python model>`.
Relevant log output
Google Cloud Dataproc Agent reports job failure.
Using the default container image
Waiting for container log creation
PYSPARK_PYTHON=/opt/dataproc/conda/bin/python
JAVA_HOME=/usr/lib/jvm/temurin-11-jdk-amd64
SPARK_EXTRA_CLASSPATH=
:: loading settings :: file = /etc/spark/conf/ivysettings.xml
/usr/lib/spark/python/lib/pyspark.zip/pyspark/pandas/__init__.py:49: UserWarning: 'PYARROW_IGNORE_TIMEZONE' environment variable was not set. It is required to set this environment variable to '1' in both driver and executor sides if you use pyarrow>=2.0.0. pandas-on-Spark will set it for you but it does not work if there is a Spark context already launched.
Environment
- OS: macOS
- Python/dbt: error in dbt Cloud version 1.7 and "versionless"; OK in dbt Cloud version 1.6
Which database adapter are you using with dbt?
bigquery
Additional Context
No response
sherminsb changed the title from "[Bug] Python models failed when Google bucket retention policy is set (starting from dbt version 1.7)" to "[Bug] Python model build failed when Google bucket retention policy is set (starting from dbt version 1.7)" on Aug 5, 2024.
dbeatty10 changed the title from "[Bug] Python model build failed when Google bucket retention policy is set (starting from dbt version 1.7)" to "[Regression] Python model build failed when Google bucket retention policy is set (starting from dbt version 1.7)" on Aug 5, 2024.
The root cause here is a change we made in 1.7 to use "indirect" writes instead of writing directly to BigQuery. This allows us to support a wider range of BigQuery functionality (namely, writing to partitioned tables).
See the BQ connector docs for more info. Note that this means using a bucket retention policy is incompatible with partitioned/incremental materialization strategies, because indirect writes stage data in a GCS bucket before loading it into BigQuery.
I think the right behavior here is to try to use direct writes for basic table materializations and only use indirect writes if the user sets a partition config.
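The decision proposed in that last sentence could be sketched roughly as follows. This is a hypothetical illustration, not dbt-bigquery's actual API: the helper name `choose_write_method` and its `partition_by` argument are made up for clarity; the `"direct"`/`"indirect"` values correspond to the spark-bigquery-connector's two write methods.

```python
def choose_write_method(partition_by=None):
    """Pick a write method for a Python model (illustrative sketch only).

    "indirect" writes stage results in a GCS bucket before loading them
    into BigQuery, which fails when that bucket has a retention policy,
    so the idea is to fall back to it only when a partition config
    actually requires it; plain table materializations use "direct".
    """
    return "indirect" if partition_by is not None else "direct"
```

For example, a model with no partition config would use direct writes (`choose_write_method()` returns `"direct"`), while one configured with `partition_by` would keep using indirect writes.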