Job name: consume-batch-ma-with-cr-ecd-{dev/preview/live}
Layer: Consume
Tables used:
- kafka.red_red_cleaned
- kafka.red_vd_cleaned
- kafka.red_ecd_raw
- kinesis.contactrequests_daily_cr_per_classified
- kinesis.customeractions_daily_actions_per_classified
Successors: TBD
GitHub link: https://github.com/axel-springer-kugawana/ST_bigdata_consume_batch_ma_with_cr_ecd
The pipeline consists of several components:
- Loading tables: Load the source tables into DynamicFrames using the pushdown predicate option, so only the relevant partitions are read for further processing (see the first sketch below).
- Setting country values: Prepare a list of geo ID, country name, distribution type and data source values; the following steps loop over this list and process the data per entry (second sketch below).
- Custom transformation: Apply transformation logic implemented in PySpark, including renaming fields and casting data types (second sketch below).
- Concat DynamicFrames: Perform union-like operations to combine the results of every loop iteration into a single DynamicFrame (third sketch below).
- Writing to S3: Write the DynamicFrame for each iteration to S3 as compressed JSON and CSV files (fourth sketch below).
- AWS Glue Data Catalog integration: Write the transformed data to an AWS Glue Data Catalog table, enabling querying and analysis (fourth sketch below).
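
A minimal sketch of the table-loading step, assuming the catalog database/table names listed above and a hypothetical partition column and predicate value:

```python
from pyspark.context import SparkContext
from awsglue.context import GlueContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Load a source table into a DynamicFrame; the pushdown predicate limits the
# read to matching partitions instead of scanning the whole table.
# The partition column "dt" and its value are placeholders.
red_cleaned = glue_context.create_dynamic_frame.from_catalog(
    database="kafka",
    table_name="red_red_cleaned",
    push_down_predicate="dt >= '2024-01-01'",
)

# The remaining tables (red_vd_cleaned, red_ecd_raw and the two Kinesis
# tables) are loaded the same way.
```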
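
A sketch of the country configuration and the per-country transformation loop, continuing from the loading sketch. The geo IDs, column names and mappings are placeholders; ApplyMapping is used here to illustrate the field renaming and type casting:

```python
from awsglue.transforms import ApplyMapping

# Hypothetical country configuration; the real geo IDs, country names,
# distribution types and data sources are defined by the job.
COUNTRY_CONFIG = [
    {"geoid": 1276, "country": "Germany", "distribution_type": "ma", "data_source": "kafka"},
    {"geoid": 250, "country": "France", "distribution_type": "ma", "data_source": "kinesis"},
]

transformed_frames = []
for cfg in COUNTRY_CONFIG:
    # Restrict the loaded data to the current country (column name assumed);
    # the default argument binds the geo ID of this iteration.
    country_frame = red_cleaned.filter(
        f=lambda row, geoid=cfg["geoid"]: row["geo_id"] == geoid
    )

    # Rename fields and cast data types in a single mapping step.
    transformed_frames.append(
        ApplyMapping.apply(
            frame=country_frame,
            mappings=[
                ("classified_id", "string", "classified_id", "string"),
                ("cr_count", "long", "contact_requests", "int"),
                ("event_ts", "string", "event_date", "date"),
            ],
        )
    )
```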
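
A sketch of the union-like concatenation of the per-iteration results, done here via Spark DataFrames and assuming all iterations produce the same schema:

```python
from functools import reduce
from awsglue.dynamicframe import DynamicFrame

# Union the per-iteration results by column name, then wrap the Spark
# DataFrame back into a single DynamicFrame.
combined_df = reduce(
    lambda left, right: left.unionByName(right),
    [frame.toDF() for frame in transformed_frames],
)
combined = DynamicFrame.fromDF(combined_df, glue_context, "combined")
```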
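
A sketch of the S3 and Data Catalog writes. In the job the S3 write happens per iteration; here the combined frame stands in for it, and the bucket path, compression setting, database and table names are all assumptions:

```python
# Write the data to S3 as compressed JSON and CSV.
glue_context.write_dynamic_frame.from_options(
    frame=combined,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/ma_with_cr_ecd/json/", "compression": "gzip"},
    format="json",
)
glue_context.write_dynamic_frame.from_options(
    frame=combined,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/ma_with_cr_ecd/csv/", "compression": "gzip"},
    format="csv",
)

# Write the transformed data to an existing Glue Data Catalog table so it can
# be queried, e.g. from Athena.
glue_context.write_dynamic_frame.from_catalog(
    frame=combined,
    database="consume",            # assumed database name
    table_name="ma_with_cr_ecd",   # assumed table name
)
```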