LSQL: relational migration merge - guide (#121)
* relational migration instructions initial

* LMIG -> LDMS, + LSQL naming

* LSQL content additions

* Changes for LSQL to add MySQL EC2 instance

* pulling in Rob's changes to template

* Fixed syntax error in UserData

* Changes alongside my review

* Rob's changes

* instruction updates for clarity

* image and instruction improvements

* updated images

* Final revision before merge

---------

Co-authored-by: Sean Shriver <[email protected]>
robm26 and switch180 authored Nov 16, 2024
1 parent 03f2c12 commit ba4946e
Showing 71 changed files with 1,005 additions and 5 deletions.
6 changes: 5 additions & 1 deletion content/authors.en.md
@@ -14,11 +14,15 @@ weight: 100
1. Daniel Yoder ([danielsyoder](https://github.com/danielsyoder)) - The brains behind amazon-dynamodb-labs.com and the co-creator of the design scenarios

### 2024 additions
-The Generative AI workshop LBED was released in 2024:
+The Generative AI workshop LBED was released in early 2024:
1. John Terhune - ([@terhunej](https://github.com/terhunej)) - Primary author
1. Zhang Xin - ([@SEZ9](https://github.com/SEZ9)) - Content contributor and original author of a lab that John used as the basis of LBED
1. Sean Shriver - ([@switch180](https://github.com/switch180)) - Editor, tech reviewer, and merger

The LSQL relational migration lab was released in late 2024:
1. Robert McCauley - ([robm26](https://github.com/robm26)) - Primary author
1. Sean Shriver - ([@switch180](https://github.com/switch180)) - Editor, tech reviewer, and merger

### 2023 additions
The serverless event driven architecture lab was added in 2023:

2 changes: 1 addition & 1 deletion content/change-data-capture/index.en.md
@@ -2,7 +2,7 @@
title: "LCDC: Change Data Capture for Amazon DynamoDB"
chapter: true
description: "200 level: Hands-on exercises with DynamoDB Streams and Kinesis Data Streams with Kinesis Analytics."
-weight: 40
+weight: 80
---
In this workshop, you will learn how to perform change data capture of item level changes on DynamoDB tables using [Amazon DynamoDB Streams](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html) and [Amazon Kinesis Data Streams](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/kds.html). This technique allows you to develop event-driven solutions that are initiated by alterations made to item-level data stored in DynamoDB.

3 changes: 2 additions & 1 deletion content/index.en.md
@@ -16,7 +16,8 @@ Prior expertise with AWS and NoSQL databases is beneficial but not required to c
If you're brand new to DynamoDB with no experience, you may want to begin with *Hands-on Labs for Amazon DynamoDB*. If you want to learn the design patterns for DynamoDB, check out *Advanced Design Patterns for DynamoDB* and the *Design Challenges* scenarios.

### Looking for a larger challenge?
-The DynamoDB Immersion Day has a series of workshops designed to cover advanced topics. If you want to dig deep into streaming aggregations with AWS Lambda and DynamoDB Streams, consider LEDA. Or if you want an easier introduction CDC you can consider LCDC.
+The DynamoDB Immersion Day has a series of workshops designed to cover advanced topics. If you want to dig deep into streaming aggregations with AWS Lambda and DynamoDB Streams, consider LEDA. Or if you want an easier introduction to CDC, consider LCDC. Do you have a relational database to migrate to DynamoDB? We offer LSQL and an AWS DMS lab, LDMS: we highly recommend LSQL unless you have a need to use DMS.

Do you want to integrate Generative AI to create a context-aware reasoning application? If so, consider LBED, a lab that takes a product catalog from DynamoDB and continuously indexes it into OpenSearch Service for natural language queries supported by Amazon Bedrock.

Dive into the content:
@@ -1,10 +1,10 @@
---
title: "5. LMIG: Relational Modeling & Migration"
title: "LDMS: AWS DMS Migration"
date: 2021-04-25T07:33:04-05:00
weight: 50
---

-In this module, also classified as LMIG, you will learn how to design a target data model in DynamoDB for highly normalized relational data in a relational database.
+In this module, classified as LDMS, you will learn how to design a target data model in DynamoDB for highly normalized relational data in a relational database.
The exercise also guides you through a step-by-step migration of an IMDb dataset from a self-managed MySQL database instance on EC2 to Amazon DynamoDB, a fully managed key-value database.
At the end of this lesson, you should feel confident in your ability to design and migrate an existing relational database to Amazon DynamoDB.

30 changes: 30 additions & 0 deletions content/relational-migration/application refactoring/index.en.md
@@ -0,0 +1,30 @@
---
title : "Application Refactoring"
weight : 40
---

## Updating the Client Application for DynamoDB
After you have chosen your DynamoDB table schema, and migrated any historical data over,
you can consider what code changes are required so a new version of your app can call the DynamoDB
read and write APIs.

The web app we have been using includes forms and buttons to perform standard CRUD (Create, Read, Update, Delete) operations.

The web app makes HTTP calls to the published API using standard GET and POST methods against certain API paths.

1. In Cloud9, open the left nav and locate the file **app.py**.
2. Double-click to open and review this file.

In the bottom half of the file you will see several small handler functions that
pass core read and write requests on to the **db** object's functions.


Notice the file contains a conditional import for the **db** object.

```python
if migration_stage == 'relational':
from chalicelib import mysql_calls as db
else:
from chalicelib import dynamodb_calls as db
```
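
Putting those two pieces together, a pass-through handler might look like the sketch below (the route path and function names are illustrative, not the exact contents of app.py):

```python
# Minimal sketch of the pass-through pattern (illustrative route and function
# names, not the exact app.py). `db` is whichever module the conditional
# import above selected: mysql_calls or dynamodb_calls.
from chalice import Chalice
from chalicelib import dynamodb_calls as db  # stand-in for the conditional import

app = Chalice(app_name='relational-migration')

@app.route('/get_record/{table}/{key}', methods=['GET'])
def get_record(table, key):
    # the handler stays identical for both backends; only the db module differs
    return db.get_record(table, key)
```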

32 changes: 32 additions & 0 deletions content/relational-migration/application refactoring/index2.en.md
@@ -0,0 +1,32 @@
---
title : "DynamoDB-ready middle tier"
weight : 41
---

## Deploy a new DynamoDB-ready API

Recall that we previously ran ```chalice deploy --stage relational```
to create the MySQL-ready middle tier.

We can repeat this to create a new API Gateway and Lambda stack, this time using the DynamoDB stage.

1. Within the Cloud9 terminal window, run:
```bash
chalice deploy --stage dynamodb
```
2. When this completes, find the new REST API URL and copy it.
3. You can paste this into a new browser tab to test it. You should see a status message indicating
the DynamoDB version of the API is working. (If you prefer the terminal, a quick check of this URL is sketched at the end of this page.)

We now need a separate browser to test the full web app experience, since
the original browser has a cookie set to the relational REST API.

4. If you have multiple browsers on your laptop, such as Edge, Firefox, or Safari,
open a different browser and navigate to the web app:

[https://amazon-dynamodb-labs.com/static/relational-migration/web/index.html](https://amazon-dynamodb-labs.com/static/relational-migration/web/index.html).

(You can also open the same browser in Incognito Mode for this step.)

5. Click the Target API button and paste in the new REST API URL.
6. Notice the title of the page has updated to **DynamoDB App** in a blue color. If it isn't blue, you can refresh the page and see the color change.
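
As mentioned in step 3, you can also check the new stage from a terminal instead of a browser. A minimal sketch, assuming you substitute your own REST API URL for the placeholder:

```python
# Minimal sketch: request the root path of the newly deployed DynamoDB stage
# and print whatever status message it returns. The URL below is a placeholder.
import urllib.request

API_URL = "https://<rest-api-id>.execute-api.<region>.amazonaws.com/api/"  # paste your REST API URL

with urllib.request.urlopen(API_URL) as response:
    print(response.status)           # expect 200
    print(response.read().decode())  # expect the DynamoDB-stage status message
```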
36 changes: 36 additions & 0 deletions content/relational-migration/application refactoring/index3.en.md
@@ -0,0 +1,36 @@
---
title : "Testing and reviewing DynamoDB code"
weight : 42
---

## Test drive your DynamoDB application

1. Click Tables to see a list of available tables in the account. You should see the
Customers table, the vCustOrders table, and a few other tables used by separate workshops.

2. Click on the Customers table, then click the SCAN button to see the table's data.
3. Test the CRUD operations, such as get-item and the update and delete buttons in the data grid,
to make sure they work against the DynamoDB table.
4. Click on the Querying tab to display the form with GSIs listed.
5. On the idx_region GSI, enter 'North' and press GO.

![DynamoDB GSI Form](/static/images/relational-migration/ddb_gsi.png)

## Updating DynamoDB functions

Let's make a small code change to demonstrate the process to customize the DynamoDB functions.

6. In the Cloud9 left nav, locate the **chalicelib** folder and open it.
7. Locate and open the file dynamodb_calls.py
8. Search for the text ```get_request['ConsistentRead'] = False```
9. Update this from False to True and click File/Save to save your work.
10. In the terminal prompt, redeploy:

```bash
chalice deploy --stage dynamodb
```

11. Return to the web app, click on the Customers table, enter cust_id value "0001", and click the GET ITEM button.
12. Verify a record was retrieved for you. This record was found using a strongly consistent read.
13. Feel free to extend the DynamoDB code to add new functions or modify existing ones.
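
The flag you flipped in step 9 maps to the ```ConsistentRead``` parameter of the DynamoDB GetItem API. As a standalone sketch of what such a call looks like with boto3 (the key name mirrors this lab's data, but this is an illustration, not the exact code in dynamodb_calls.py):

```python
# Illustrative sketch: a strongly consistent GetItem with boto3.
import boto3

dynamodb = boto3.client('dynamodb')

response = dynamodb.get_item(
    TableName='Customers',
    Key={'cust_id': {'S': '0001'}},   # key attribute name assumed from this lab's data
    ConsistentRead=True               # the flag you changed in dynamodb_calls.py
)
print(response.get('Item'))
```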

20 changes: 20 additions & 0 deletions content/relational-migration/data migration/index.en.md
@@ -0,0 +1,20 @@
---
title : "Data Migration"
weight : 30
---

## Transform, Extract, Convert, Stage Import

Recall that our strategy for migrating table data into DynamoDB via S3 was
summarized in the :link[Workshop Introduction]{href="../introduction/index5" target=_blank}.

For each table or view that we want to migrate, we need a routine that will ```SELECT *``` from it,
and convert the result dataset into DynamoDB JSON before writing it to an S3 bucket.

![Migration Flow](/static/images/relational-migration/migrate_flow.png)

For migrations of very large tables we may choose to use purpose-built data tools like
AWS Glue, Amazon EMR, or AWS DMS. These tools can help you define and coordinate multiple
parallel jobs that extract, transform, and stage data into S3.

In this workshop we can use a Python script to demonstrate this ETL process.
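
As a rough sketch of what such a routine involves, under the assumption of placeholder connection details, column handling, and bucket name (this is not the workshop's actual mysql_s3.py):

```python
# Illustrative sketch: SELECT * from a MySQL table or view, convert each row
# to DynamoDB JSON, and stage the result in S3 for DynamoDB Import from S3.
import json
import boto3
import pymysql  # assumes a MySQL client library is available
from boto3.dynamodb.types import TypeSerializer

def to_serializable(value):
    # TypeSerializer natively handles str, int, Decimal, bool, and bytes;
    # stringify anything else (dates, floats) to keep this sketch simple
    return value if isinstance(value, (str, int, bytes)) else str(value)

serializer = TypeSerializer()
s3 = boto3.client('s3')

table = 'Customers'           # table or view to migrate
bucket = '<staging-bucket>'   # placeholder bucket name

conn = pymysql.connect(host='<mysql-host>', user='<user>',
                       password='<password>', database='<database>')

with conn.cursor(pymysql.cursors.DictCursor) as cursor:
    cursor.execute(f'SELECT * FROM {table}')
    lines = []
    for row in cursor.fetchall():
        # DynamoDB JSON wraps every attribute value in a type descriptor, e.g. {"S": "0001"}
        item = {col: serializer.serialize(to_serializable(val))
                for col, val in row.items() if val is not None}
        lines.append(json.dumps({'Item': item}))

# Import from S3 accepts newline-delimited DynamoDB JSON objects
s3.put_object(Bucket=bucket, Key=f'{table}/data.json', Body='\n'.join(lines))
```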
36 changes: 36 additions & 0 deletions content/relational-migration/data migration/index2.en.md
@@ -0,0 +1,36 @@
---
title : "ETL Scripts"
weight : 31
---


## mysql_s3.py

A script called mysql_s3.py is provided that performs all the work to convert and load a query result
set into S3. We can run this script in preview mode by using the "stdout" parameter.

1. Run:
```bash
python3 mysql_s3.py Customers stdout
```
You should see results in DynamoDB JSON format:

![mysql_s3.py output](/static/images/relational-migration/mysql_s3_output.png)

2. Next, run it for our view:
```bash
python3 mysql_s3.py vCustOrders stdout
```
You should see similar output from the view results.

The script can write these to S3 for us. We just need to omit the "stdout" command line parameter.

3. Now, run the script without preview mode:
```bash
python3 mysql_s3.py Customers
```
You should see confirmation that objects have been written to S3:

![mysql_s3.py output](/static/images/relational-migration/mysql_s3_write_output.png)
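
For reference, each record the script writes is an item in DynamoDB JSON form. Expressed as a Python literal with illustrative attribute names and values:

```python
# One item in DynamoDB JSON: every attribute value carries a type descriptor
# ("S" for string, "N" for number), and the item is wrapped in an "Item" key.
sample_record = {
    "Item": {
        "cust_id": {"S": "0001"},
        "name":    {"S": "Example Customer"},
        "balance": {"N": "125.50"},
    }
}
```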


54 changes: 54 additions & 0 deletions content/relational-migration/data migration/index3.en.md
@@ -0,0 +1,54 @@
---
title : "Full Migration"
weight : 32
---

## DynamoDB Import from S3

The Import from S3 feature is a convenient way to have data loaded into a new DynamoDB table.
Learn more about this feature [here](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataImport.HowItWorks.html).

Import creates a brand new table, and is not able to load data into an existing table.
Therefore, it is most useful during the one-time initial load of data during a migration.

## migrate.sh

A script is provided that performs multiple steps to coordinate a migration:
* Runs **mysql_desc_ddb.py** and stores the result in a table definition JSON file
* Runs **mysql_s3.py** to extract, transform, and load data into an S3 bucket
* Uses the **aws dynamodb import-table** CLI command to request a new table, by providing the bucket name and table definition JSON file
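
The CLI call in the last step has an SDK equivalent. As a rough sketch with placeholder bucket, prefix, and key schema (migrate.sh derives these values for you):

```python
# Illustrative sketch: request an Import from S3 job with boto3, the SDK
# equivalent of `aws dynamodb import-table`. Names below are placeholders.
import boto3

dynamodb = boto3.client('dynamodb')

response = dynamodb.import_table(
    S3BucketSource={'S3Bucket': '<staging-bucket>', 'S3KeyPrefix': 'Customers/'},
    InputFormat='DYNAMODB_JSON',
    TableCreationParameters={
        'TableName': 'Customers',
        'AttributeDefinitions': [{'AttributeName': 'cust_id', 'AttributeType': 'S'}],
        'KeySchema': [{'AttributeName': 'cust_id', 'KeyType': 'HASH'}],
        'BillingMode': 'PAY_PER_REQUEST',
    },
)

# The response describes the import job itself; note the ImportArn, not a table ARN
print(response['ImportTableDescription']['ImportArn'])
```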

1. Run:
```bash
./migrate.sh Customers
```
The script should produce output as shown here:

![Migrate Output](/static/images/relational-migration/migrate_output.png)

Notice the ARN returned. This is the ARN of the Import job, not the new DynamoDB table.

The import will take a few minutes to complete.

2. Optional: You can check the status of an import job with the command below, pasting your Import ARN on line two.

```bash
aws dynamodb describe-import \
--import-arn '<paste ARN here>' \
--output json --query '{"Status": ImportTableDescription.ImportStatus, "FailureCode": ImportTableDescription.FailureCode, "FailureMessage": ImportTableDescription.FailureMessage}'
```

We can also check the import status within the AWS Console.

3. Click into the separate browser tab titled "AWS Cloud9" to open the AWS Console.
4. In the search box, type DynamoDB to visit the DynamoDB console.
5. From the left nav, click Imports from S3.
6. Notice your import is listed along with the current status.
![Import from S3](/static/images/relational-migration/import-from-s3.png)
7. Once the import has completed, you can click it to see a summary including item count and the size of the import.
8. On the left nav, click Tables.
9. In the list of tables, click on the Customers table.
10. On the top right, click on Explore Table Items.
11. Scroll down until you see a grid with your imported data.

Congratulations! You have completed a relational-to-DynamoDB migration.
36 changes: 36 additions & 0 deletions content/relational-migration/data migration/index4.en.md
@@ -0,0 +1,36 @@
---
title : "VIEW migration"
weight : 33
---

## Migrating from a VIEW

In the previous step, you simply ran ```./migrate.sh Customers``` to migrate this table
and its data to DynamoDB.

You can repeat this process to migrate the custom view vCustOrders.

1. Run:
```bash
./migrate.sh vCustOrders
```

The script assumes you want a two-part primary key of Partition Key and Sort Key, found in the two leading columns.

If you want a Partition Key-only table, you could specify this like so:

```bash
./migrate.sh vCustOrders 1
```

But don't run this command: if you do, the Import will fail, because you already have a table called vCustOrders
from step 1 and Import from S3 cannot replace an existing table. You could create another view with a different name
and import that, or delete the vCustOrders table from the DynamoDB console before attempting another migration of vCustOrders.

![View output](/static/images/relational-migration/view_result.png)

::alert[Import will write all the records it finds in the bucket to the table. If a duplicate record is encountered, it will simply overwrite it. Please be sure that your S3 data does not contain any duplicates based on the Key(s) of the new table you define.]{header="Note:"}

A Partition Key-only import is also not advisable for this particular dataset: vCustOrders is not unique by its first column alone, so records sharing a key value would overwrite one another during the import, as noted above.
30 changes: 30 additions & 0 deletions content/relational-migration/data migration/index5.en.md
@@ -0,0 +1,30 @@
---
title : "SQL Transformation Patterns for DynamoDB"
weight : 34
---

## Shaping Data with SQL

Let's return to the web app and explore some techniques you can use to shape and enrich your relational
data before importing it to DynamoDB.

1. Within the web app, refresh the browser page.
2. Click on the Querying tab.
3. Notice the set of SQL Sample buttons below the SQL editor.
4. Click button one.
The OrderLines table has a two-part primary key, as is common with DynamoDB. We can think of the returned dataset as an Item Collection.
5. Repeat by clicking each of the other sample buttons. Check the comment at the top of each query, which summarizes the technique being shown.

![SQL Samples](/static/images/relational-migration/sparse.png)

Notice the final two sample buttons. These demonstrate alternate ways to combine data from multiple tables.
We already saw how to combine tables with a JOIN operator, resulting in a denormalized data set.

The final button shows a different approach to combining tables, without using JOIN.
You can use a UNION ALL between multiple SQL queries to stack datasets together as one.
When we arrange table data like this, we describe each source table as an entity and so the single DynamoDB
table will be overloaded with multiple entities. Because of this, we can set the partition key and sort key
names to generic values of PK and SK, and add some decoration to the key values so that it's clear what type
of entity a given record represents.

![Stacked entities](/static/images/relational-migration/stacked.png)
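
As a rough sketch of the stacked-entity pattern, expressed as a query string like the ones this workshop's Python scripts run (table and column names here are hypothetical, not the actual sample query):

```python
# Illustrative sketch: UNION ALL stacks two entity types into one result set
# with generic PK/SK columns and decorated key values. Names are hypothetical.
STACKED_ENTITIES_QUERY = """
SELECT CONCAT('CUST#', cust_id)  AS PK,
       'PROFILE'                 AS SK,
       name,
       NULL                      AS order_date
  FROM Customers
UNION ALL
SELECT CONCAT('CUST#', cust_id)  AS PK,
       CONCAT('ORDER#', ord_id)  AS SK,
       NULL                      AS name,
       order_date
  FROM Orders
"""
```

Migrating the result of a query like this (for example, through a view) produces a single table overloaded with both entity types, which is exactly the shape described above.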
20 changes: 20 additions & 0 deletions content/relational-migration/data migration/index6.en.md
@@ -0,0 +1,20 @@
---
title : "Custom VIEWs"
weight : 35
---

## Challenge: Create New Views

The SQL editor window is provided so that you have an easy way to run queries and
experiment with data transformation techniques.

Using the sample queries as a guide, see how many techniques you can combine in a single query.
Look for opportunities to align attributes across the table so that they can be queried by a GSI.
Consider using date fields in column two, so that they become Sort Key values and can be queried with
DynamoDB range queries. A possible starting point is sketched below.
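
For example, a query that puts a date in column two so it becomes the Sort Key might look like the following (table and column names are hypothetical); the CREATE VIEW button described below would wrap it into a view for you:

```python
# Hypothetical starting point for the challenge: cust_id becomes the Partition
# Key and the order date (column two) becomes the Sort Key, enabling range queries.
EXAMPLE_CHALLENGE_QUERY = """
SELECT cust_id,
       DATE_FORMAT(order_date, '%Y-%m-%d') AS order_date,
       ord_id,
       total_amount
  FROM Orders
"""
```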

1. When you have a SQL statement you like, click the CREATE VIEW button.
2. In the prompt, enter a name for your new view. This will add a CREATE VIEW statement to the top of your query.
3. Click RUN SQL to create the new view.
4. Refresh the page, and your view should appear as a button next to the vCustOrders button.

