This is my project following the Airbyte tutorial "Building a seamless and efficient data pipeline for e-commerce analytics".
The tutorial walks through the practical implementation of a data workflow using Airbyte for data integration, dbt for transformation, Dagster for orchestration, and BigQuery for data warehousing.
- Python
- Terraform (Infrastructure as Code)
- Airbyte (Data ingestion)
- dbt (Data transformation)
- Dagster (Pipeline orchestration)
- BigQuery (Data warehouse)
Use Python 3.10 or 3.11 (avoid 3.12, since some of the libraries below are incompatible with it).
pip install dbt-core dbt-bigquery
pip install dagster dagster-webserver dagster-dbt dagster-airbyte
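As a quick post-install sanity check (this snippet is my own, not part of the tutorial), you can confirm that every package resolves and print its installed version using only the standard library:

```python
# Hypothetical post-install check: verifies the packages installed above
# are importable by the current interpreter and prints their versions.
from importlib.metadata import version

for pkg in ("dbt-core", "dbt-bigquery", "dagster",
            "dagster-webserver", "dagster-dbt", "dagster-airbyte"):
    print(pkg, version(pkg))
```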
- For dbt to interact with BigQuery (via a GCP service account; a credential check follows this list)
export DBT_BIGQUERY_KEYFILE_PATH=path/to/credentials.json
- For Dagster to interact with dbt (see the dagster-dbt sketch below)
export DAGSTER_DBT_PARSE_PROJECT_ON_LOAD=1
- For Dagster to interact with Airbyte (see the dagster-airbyte sketch below)
export AIRBYTE_PASSWORD=password
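To confirm that the service account key referenced by `DBT_BIGQUERY_KEYFILE_PATH` actually authenticates, here is a minimal check of my own (assuming `google-cloud-bigquery` is available, which `dbt-bigquery` pulls in as a dependency):

```python
# Hypothetical credential check, not part of the tutorial: builds a BigQuery
# client from the same key file dbt will use and prints the bound project.
import os

from google.cloud import bigquery

client = bigquery.Client.from_service_account_json(
    os.environ["DBT_BIGQUERY_KEYFILE_PATH"]
)
print(client.project)  # the GCP project the service account belongs to
```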
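`DAGSTER_DBT_PARSE_PROJECT_ON_LOAD=1` is the flag that dagster-dbt's scaffolded code checks to decide whether to re-run `dbt parse` when Dagster loads the code location, keeping the manifest in sync with the dbt project. Below is a minimal sketch of that wiring; the `dbt_project` path and asset names are assumptions, not the tutorial's actual values:

```python
# Sketch of loading dbt models as Dagster assets with dagster-dbt.
import os
from pathlib import Path

from dagster import AssetExecutionContext, Definitions
from dagster_dbt import DbtCliResource, dbt_assets

DBT_PROJECT_DIR = Path("dbt_project").resolve()  # assumed project location
dbt = DbtCliResource(project_dir=os.fspath(DBT_PROJECT_DIR))

# With DAGSTER_DBT_PARSE_PROJECT_ON_LOAD set, re-parse the project on load;
# otherwise reuse the manifest from the last local `dbt parse` / `dbt build`.
if os.getenv("DAGSTER_DBT_PARSE_PROJECT_ON_LOAD"):
    dbt_manifest_path = (
        dbt.cli(["--quiet", "parse"]).wait().target_path.joinpath("manifest.json")
    )
else:
    dbt_manifest_path = DBT_PROJECT_DIR.joinpath("target", "manifest.json")

@dbt_assets(manifest=dbt_manifest_path)
def ecommerce_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    # Each dbt model becomes a Dagster asset; `dbt build` runs and streams events.
    yield from dbt.cli(["build"], context=context).stream()

defs = Definitions(assets=[ecommerce_dbt_assets], resources={"dbt": dbt})
```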
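`AIRBYTE_PASSWORD` is the basic-auth password for the local Airbyte server (`password` is the Airbyte OSS default). A sketch of how Dagster can pick it up to expose Airbyte connections as assets; the host and port assume a default local deployment:

```python
# Sketch of exposing Airbyte connections as Dagster assets with dagster-airbyte.
import os

from dagster_airbyte import AirbyteResource, load_assets_from_airbyte_instance

airbyte_instance = AirbyteResource(
    host="localhost",  # assumed local Airbyte deployment
    port="8000",       # default Airbyte OSS web/API port
    username="airbyte",  # default Airbyte OSS basic-auth username
    password=os.getenv("AIRBYTE_PASSWORD", "password"),
)

# Discovers every connection on the instance and maps each one to assets.
airbyte_assets = load_assets_from_airbyte_instance(airbyte_instance)
```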