LSQL: relational migration merge - guide (#121)
* relational migration instructions initial

* LMIG -> LDMS, + LSQL naming

* LSQL content additions

* Changes for LSQL to add MySQL EC2 instance

* pulling in Rob's changes to template

* Fixed syntax error in UserData

* Changes alongside my review

* Rob's changes

* instruction updates for clarity

* image and instruction improvements

* updated images

* Final revision before merge

---------

Co-authored-by: Sean Shriver <[email protected]>
robm26 and switch180 authored Nov 16, 2024
1 parent 03f2c12 commit ba4946e
Showing 71 changed files with 1,005 additions and 5 deletions.
6 changes: 5 additions & 1 deletion content/authors.en.md
@@ -14,11 +14,15 @@ weight: 100
1. Daniel Yoder ([danielsyoder](https://github.com/danielsyoder)) - The brains behind amazon-dynamodb-labs.com and the co-creator of the design scenarios

### 2024 additions
-The Generative AI workshop LBED was released in 2024:
+The Generative AI workshop LBED was released in early 2024:
1. John Terhune - ([@terhunej](https://github.com/terhunej)) - Primary author
1. Zhang Xin - ([@SEZ9](https://github.com/SEZ9)) - Content contributor and original author of a lab that John used as the basis of LBED
1. Sean Shriver - ([@switch180](https://github.com/switch180)) - Editor, tech reviewer, and merger

The LSQL relational migration lab was released in late 2024:
1. Robert McCauley - ([robm26](https://github.com/robm26)) - Primary author
1. Sean Shriver - ([@switch180](https://github.com/switch180)) - Editor, tech reviewer, and merger

### 2023 additions
The serverless event driven architecture lab was added in 2023:

2 changes: 1 addition & 1 deletion content/change-data-capture/index.en.md
@@ -2,7 +2,7 @@
title: "LCDC: Change Data Capture for Amazon DynamoDB"
chapter: true
description: "200 level: Hands-on exercises with DynamoDB Streams and Kinesis Data Streams with Kinesis Analytics."
-weight: 40
+weight: 80
---
In this workshop, you will learn how to perform change data capture of item level changes on DynamoDB tables using [Amazon DynamoDB Streams](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html) and [Amazon Kinesis Data Streams](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/kds.html). This technique allows you to develop event-driven solutions that are initiated by alterations made to item-level data stored in DynamoDB.

3 changes: 2 additions & 1 deletion content/index.en.md
@@ -16,7 +16,8 @@ Prior expertise with AWS and NoSQL databases is beneficial but not required to c
If you're brand new to DynamoDB with no experience, you may want to begin with *Hands-on Labs for Amazon DynamoDB*. If you want to learn the design patterns for DynamoDB, check out *Advanced Design Patterns for DynamoDB* and the *Design Challenges* scenarios.

### Looking for a larger challenge?
-The DynamoDB Immersion Day has a series of workshops designed to cover advanced topics. If you want to dig deep into streaming aggregations with AWS Lambda and DynamoDB Streams, consider LEDA. Or if you want an easier introduction CDC you can consider LCDC.
+The DynamoDB Immersion Day has a series of workshops designed to cover advanced topics. If you want to dig deep into streaming aggregations with AWS Lambda and DynamoDB Streams, consider LEDA. Or if you want an easier introduction to CDC, consider LCDC. Do you have a relational database to migrate to DynamoDB? We offer LSQL and an AWS DMS lab, LDMS: we highly recommend LSQL unless you have a need to use DMS.

Do you want to integrate Generative AI to create a context-aware reasoning application? If so, consider LBED, a lab that takes a product catalog from DynamoDB and continuously indexes it into OpenSearch Service for natural language queries supported by Amazon Bedrock.

Dive into the content:
@@ -1,10 +1,10 @@
---
title: "5. LMIG: Relational Modeling & Migration"
title: "LDMS: AWS DMS Migration"
date: 2021-04-25T07:33:04-05:00
weight: 50
---

-In this module, also classified as LMIG, you will learn how to design a target data model in DynamoDB for highly normalized relational data in a relational database.
+In this module, classified as LDMS, you will learn how to design a target data model in DynamoDB for highly normalized relational data in a relational database.
The exercise also guides you through a step-by-step migration of an IMDb dataset from a self-managed MySQL database instance on EC2 to Amazon DynamoDB, a fully managed key-value database.
At the end of this lesson, you should feel confident in your ability to design and migrate an existing relational database to Amazon DynamoDB.

30 changes: 30 additions & 0 deletions content/relational-migration/application refactoring/index.en.md
@@ -0,0 +1,30 @@
---
title : "Application Refactoring"
weight : 40
---

## Updating the Client Application for DynamoDB
After you have chosen your DynamoDB table schema, and migrated any historical data over,
you can consider what code changes are required so a new version of your app can call the DynamoDB
read and write APIs.

The web app we have been using includes forms and buttons to perform standard CRUD (Create, Read, Update, Delete) operations.

The web app makes HTTP calls to the published API using standard GET and POST methods against certain API paths.

1. In Cloud9, open the left nav and locate the file **app.py**.
2. Double-click to open and review this file.

In the bottom half of the file you will see several small handler functions that
pass core read and write requests on to the **db** object's functions.


Notice the file contains a conditional import for the **db** object.

```python
if migration_stage == 'relational':
from chalicelib import mysql_calls as db
else:
from chalicelib import dynamodb_calls as db
```
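
Putting those two pieces together, a pass-through handler might look like the sketch below (the route path and function names are illustrative, not the exact contents of app.py):

```python
# Minimal sketch of the pass-through pattern (illustrative route and function
# names, not the exact app.py). `db` is whichever module the conditional
# import above selected: mysql_calls or dynamodb_calls.
from chalice import Chalice
from chalicelib import dynamodb_calls as db  # stand-in for the conditional import

app = Chalice(app_name='relational-migration')

@app.route('/get_record/{table}/{key}', methods=['GET'])
def get_record(table, key):
    # the handler stays identical for both backends; only the db module differs
    return db.get_record(table, key)
```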

32 changes: 32 additions & 0 deletions content/relational-migration/application refactoring/index2.en.md
@@ -0,0 +1,32 @@
---
title : "DynamoDB-ready middle tier"
weight : 41
---

## Deploy a new DynamoDB-ready API

Recall that we previously ran ```chalice deploy --stage relational```
to create the MySQL-ready middle tier.

We can repeat this to create a new API Gateway and Lambda stack, this time using the DynamoDB stage.

1. Within the Cloud9 terminal window, run:
```bash
chalice deploy --stage dynamodb
```
2. When this completes, find the new REST API URL and copy it.
3. You can paste this into a new browser tab to test it. You should see a status message indicating
the DynamoDB version of the API is working. (If you prefer the terminal, a quick check of this URL is sketched at the end of this page.)

We now need a separate browser to test the full web app experience, since
the original browser has a cookie set to the relational REST API.

4. If you have multiple browsers on your laptop, such as Edge, Firefox, or Safari,
open a different browser and navigate to the web app:

[https://amazon-dynamodb-labs.com/static/relational-migration/web/index.html](https://amazon-dynamodb-labs.com/static/relational-migration/web/index.html).

(You can also open the same browser in Incognito Mode for this step.)

5. Click the Target API button and paste in the new REST API URL.
6. Notice the title of the page has updated to **DynamoDB App** in a blue color. If it isn't blue, you can refresh the page and see the color change.
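
As mentioned in step 3, you can also check the new stage from a terminal instead of a browser. A minimal sketch, assuming you substitute your own REST API URL for the placeholder:

```python
# Minimal sketch: request the root path of the newly deployed DynamoDB stage
# and print whatever status message it returns. The URL below is a placeholder.
import urllib.request

API_URL = "https://<rest-api-id>.execute-api.<region>.amazonaws.com/api/"  # paste your REST API URL

with urllib.request.urlopen(API_URL) as response:
    print(response.status)           # expect 200
    print(response.read().decode())  # expect the DynamoDB-stage status message
```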
36 changes: 36 additions & 0 deletions content/relational-migration/application refactoring/index3.en.md
@@ -0,0 +1,36 @@
---
title : "Testing and reviewing DynamoDB code"
weight : 42
---

## Test drive your DynamoDB application

1. Click Tables to see a list of available tables in the account. You should see the
Customers table, the vCustOrders table, and a few other tables used by separate workshops.

2. Click on the Customers table, then click the SCAN button to see the table's data.
3. Test the CRUD operations, such as get-item and the update and delete buttons in the data grid,
to make sure they work against the DynamoDB table.
4. Click on the Querying tab to display the form with GSIs listed.
5. On the idx_region GSI, enter 'North' and press GO.

![DynamoDB GSI Form](/static/images/relational-migration/ddb_gsi.png)

## Updating DynamoDB functions

Let's make a small code change to demonstrate the process to customize the DynamoDB functions.

6. In the Cloud9 left nav, locate the **chalicelib** folder and open it.
7. Locate and open the file dynamodb_calls.py
8. Search for the text ```get_request['ConsistentRead'] = False```
9. Update this from False to True and click File/Save to save your work.
10. In the terminal prompt, redeploy:

```bash
chalice deploy --stage dynamodb
```

11. Return to the web app, click on the Customers table, enter cust_id value "0001", and click the GET ITEM button.
12. Verify a record was retrieved for you. This record was found using a strongly consistent read.
13. Feel free to extend the DynamoDB code to add new functions or modify existing ones.
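
The flag you flipped in step 9 maps to the ```ConsistentRead``` parameter of the DynamoDB GetItem API. As a standalone sketch of what such a call looks like with boto3 (the key name mirrors this lab's data, but this is an illustration, not the exact code in dynamodb_calls.py):

```python
# Illustrative sketch: a strongly consistent GetItem with boto3.
import boto3

dynamodb = boto3.client('dynamodb')

response = dynamodb.get_item(
    TableName='Customers',
    Key={'cust_id': {'S': '0001'}},   # key attribute name assumed from this lab's data
    ConsistentRead=True               # the flag you changed in dynamodb_calls.py
)
print(response.get('Item'))
```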

20 changes: 20 additions & 0 deletions content/relational-migration/data migration/index.en.md
@@ -0,0 +1,20 @@
---
title : "Data Migration"
weight : 30
---

## Transform, Extract, Convert, Stage Import

Recall that our strategy for migrating table data into DynamoDB via S3 was
summarized in the :link[Workshop Introduction]{href="../introduction/index5" target=_blank}.

For each table or view that we want to migrate, we need a routine that will ```SELECT *``` from it,
and convert the result dataset into DynamoDB JSON before writing it to an S3 bucket.

![Migration Flow](/static/images/relational-migration/migrate_flow.png)

For migrations of very large tables we may choose to use purpose-built data tools like
AWS Glue, Amazon EMR, or AWS DMS. These tools can help you define and coordinate multiple
parallel jobs that extract, transform, and stage data into S3.

In this workshop we can use a Python script to demonstrate this ETL process.
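
As a rough sketch of what such a routine involves, under the assumption of placeholder connection details, column handling, and bucket name (this is not the workshop's actual mysql_s3.py):

```python
# Illustrative sketch: SELECT * from a MySQL table or view, convert each row
# to DynamoDB JSON, and stage the result in S3 for DynamoDB Import from S3.
import json
import boto3
import pymysql  # assumes a MySQL client library is available
from boto3.dynamodb.types import TypeSerializer

def to_serializable(value):
    # TypeSerializer natively handles str, int, Decimal, bool, and bytes;
    # stringify anything else (dates, floats) to keep this sketch simple
    return value if isinstance(value, (str, int, bytes)) else str(value)

serializer = TypeSerializer()
s3 = boto3.client('s3')

table = 'Customers'           # table or view to migrate
bucket = '<staging-bucket>'   # placeholder bucket name

conn = pymysql.connect(host='<mysql-host>', user='<user>',
                       password='<password>', database='<database>')

with conn.cursor(pymysql.cursors.DictCursor) as cursor:
    cursor.execute(f'SELECT * FROM {table}')
    lines = []
    for row in cursor.fetchall():
        # DynamoDB JSON wraps every attribute value in a type descriptor, e.g. {"S": "0001"}
        item = {col: serializer.serialize(to_serializable(val))
                for col, val in row.items() if val is not None}
        lines.append(json.dumps({'Item': item}))

# Import from S3 accepts newline-delimited DynamoDB JSON objects
s3.put_object(Bucket=bucket, Key=f'{table}/data.json', Body='\n'.join(lines))
```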
36 changes: 36 additions & 0 deletions content/relational-migration/data migration/index2.en.md
@@ -0,0 +1,36 @@
---
title : "ETL Scripts"
weight : 31
---


## mysql_s3.py

A script called mysql_s3.py is provided that performs all the work to convert and load a query result
set into S3. We can run this script in preview mode by using the "stdout" parameter.

1. Run:
```bash
python3 mysql_s3.py Customers stdout
```
You should see results in DynamoDB JSON format:

![mysql_s3.py output](/static/images/relational-migration/mysql_s3_output.png)

2. Next, run it for our view:
```bash
python3 mysql_s3.py vCustOrders stdout
```
You should see similar output from the view results.

The script can write these to S3 for us. We just need to omit the "stdout" command line parameter.

3. Now, run the script without preview mode:
```bash
python3 mysql_s3.py Customers
```
You should see confirmation that objects have been written to S3:

![mysql_s3.py output](/static/images/relational-migration/mysql_s3_write_output.png)
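
For reference, each record the script writes is an item in DynamoDB JSON form. Expressed as a Python literal with illustrative attribute names and values:

```python
# One item in DynamoDB JSON: every attribute value carries a type descriptor
# ("S" for string, "N" for number), and the item is wrapped in an "Item" key.
sample_record = {
    "Item": {
        "cust_id": {"S": "0001"},
        "name":    {"S": "Example Customer"},
        "balance": {"N": "125.50"},
    }
}
```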


54 changes: 54 additions & 0 deletions content/relational-migration/data migration/index3.en.md
@@ -0,0 +1,54 @@
---
title : "Full Migration"
weight : 32
---

## DynamoDB Import from S3

The Import from S3 feature is a convenient way to have data loaded into a new DynamoDB table.
Learn more about this feature [here](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataImport.HowItWorks.html).

Import creates a brand new table, and is not able to load data into an existing table.
Therefore, it is most useful during the one-time initial load of data during a migration.

## migrate.sh

A script is provided that performs multiple steps to coordinate a migration:
* Runs **mysql_desc_ddb.py** and stores the result in a table definition JSON file
* Runs **mysql_s3.py** to extract, transform, and load data into an S3 bucket
* Uses the **aws dynamodb import-table** CLI command to request a new table, by providing the bucket name and table definition JSON file
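
The CLI call in the last step has an SDK equivalent. As a rough sketch with placeholder bucket, prefix, and key schema (migrate.sh derives these values for you):

```python
# Illustrative sketch: request an Import from S3 job with boto3, the SDK
# equivalent of `aws dynamodb import-table`. Names below are placeholders.
import boto3

dynamodb = boto3.client('dynamodb')

response = dynamodb.import_table(
    S3BucketSource={'S3Bucket': '<staging-bucket>', 'S3KeyPrefix': 'Customers/'},
    InputFormat='DYNAMODB_JSON',
    TableCreationParameters={
        'TableName': 'Customers',
        'AttributeDefinitions': [{'AttributeName': 'cust_id', 'AttributeType': 'S'}],
        'KeySchema': [{'AttributeName': 'cust_id', 'KeyType': 'HASH'}],
        'BillingMode': 'PAY_PER_REQUEST',
    },
)

# The response describes the import job itself; note the ImportArn, not a table ARN
print(response['ImportTableDescription']['ImportArn'])
```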

1. Run:
```bash
./migrate.sh Customers
```
The script should produce output as shown here:

![Migrate Output](/static/images/relational-migration/migrate_output.png)

Notice the ARN returned. This is the ARN of the Import job, not the new DynamoDB table.

The import will take a few minutes to complete.

2. Optional: You can check the status of an import job with the command below, pasting your Import ARN on line two.

```bash
aws dynamodb describe-import \
--import-arn '<paste ARN here>' \
--output json --query '{"Status": ImportTableDescription.ImportStatus, "FailureCode": ImportTableDescription.FailureCode, "FailureMessage": ImportTableDescription.FailureMessage}'
```

We can also check the import status within the AWS Console.

3. Click into the separate browser tab titled "AWS Cloud9" to open the AWS Console.
4. In the search box, type DynamoDB to visit the DynamoDB console.
5. From the left nav, click Imports from S3.
6. Notice your import is listed along with the current status.
![Import from S3](/static/images/relational-migration/import-from-s3.png)
7. Once the import has completed, you can click it to see a summary including item count and the size of the import.
8. On the left nav, click Tables.
9. In the list of tables, click on the Customers table.
10. On the top right, click on Explore Table Items.
11. Scroll down until you see a grid with your imported data.

Congratulations! You have completed a relational-to-DynamoDB migration.
36 changes: 36 additions & 0 deletions content/relational-migration/data migration/index4.en.md
@@ -0,0 +1,36 @@
---
title : "VIEW migration"
weight : 33
---

## Migrating from a VIEW

In the previous step, you simply ran ```./migrate.sh Customers``` to migrate this table
and its data to DynamoDB.

You can repeat this process to migrate the custom view vCustOrders.

1. Run:
```bash
./migrate.sh vCustOrders
```

The script assumes you want a two-part primary key of Partition Key and Sort Key, found in the two leading columns.

If you want a Partition Key-only table, you could specify this like so:

```bash
./migrate.sh vCustOrders 1
```

But don't run this command: if you do, the Import will fail, because you already have a table called vCustOrders
from step 1 and Import from S3 cannot replace an existing table. You could create another view with a different name
and import that, or delete the vCustOrders table from the DynamoDB console before attempting another migration of vCustOrders.

![View output](/static/images/relational-migration/view_result.png)

::alert[Import will write all the records it finds in the bucket to the table. If a duplicate record is encountered, it will simply overwrite it. Please be sure that your S3 data does not contain any duplicates based on the Key(s) of the new table you define.]{header="Note:"}

A Partition Key-only import is also not advisable for this particular dataset: vCustOrders is not unique by its first column alone, so records sharing a key value would overwrite one another during the import, as noted above.
30 changes: 30 additions & 0 deletions content/relational-migration/data migration/index5.en.md
@@ -0,0 +1,30 @@
---
title : "SQL Transformation Patterns for DynamoDB"
weight : 34
---

## Shaping Data with SQL

Let's return to the web app and explore some techniques you can use to shape and enrich your relational
data before importing it to DynamoDB.

1. Within the web app, refresh the browser page.
2. Click on the Querying tab.
3. Notice the set of SQL Sample buttons below the SQL editor.
4. Click button one.
The OrderLines table has a two-part primary key, as is common with DynamoDB. We can think of the returned dataset as an Item Collection.
5. Repeat by clicking each of the other sample buttons. Check the comment at the top of each query, which summarizes the technique being shown.

![SQL Samples](/static/images/relational-migration/sparse.png)

Notice the final two sample buttons. These demonstrate alternate ways to combine data from multiple tables.
We already saw how to combine tables with a JOIN operator, resulting in a denormalized data set.

The final button shows a different approach to combining tables, without using JOIN.
You can use a UNION ALL between multiple SQL queries to stack datasets together as one.
When we arrange table data like this, we describe each source table as an entity and so the single DynamoDB
table will be overloaded with multiple entities. Because of this, we can set the partition key and sort key
names to generic values of PK and SK, and add some decoration to the key values so that it's clear what type
of entity a given record represents.

![Stacked entities](/static/images/relational-migration/stacked.png)
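
As a rough sketch of the stacked-entity pattern, expressed as a query string like the ones this workshop's Python scripts run (table and column names here are hypothetical, not the actual sample query):

```python
# Illustrative sketch: UNION ALL stacks two entity types into one result set
# with generic PK/SK columns and decorated key values. Names are hypothetical.
STACKED_ENTITIES_QUERY = """
SELECT CONCAT('CUST#', cust_id)  AS PK,
       'PROFILE'                 AS SK,
       name,
       NULL                      AS order_date
  FROM Customers
UNION ALL
SELECT CONCAT('CUST#', cust_id)  AS PK,
       CONCAT('ORDER#', ord_id)  AS SK,
       NULL                      AS name,
       order_date
  FROM Orders
"""
```

Migrating the result of a query like this (for example, through a view) produces a single table overloaded with both entity types, which is exactly the shape described above.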
20 changes: 20 additions & 0 deletions content/relational-migration/data migration/index6.en.md
@@ -0,0 +1,20 @@
---
title : "Custom VIEWs"
weight : 35
---

## Challenge: Create New Views

The SQL editor window is provided so that you have an easy way to run queries and
experiment with data transformation techniques.

Using the sample queries as a guide, see how many techniques you can combine in a single query.
Look for opportunities to align attributes across the table so that they can be queried by a GSI.
Consider using date fields in column two, so that they become Sort Key values and can be queried with
DynamoDB range queries. A possible starting point is sketched below.
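
For example, a query that puts a date in column two so it becomes the Sort Key might look like the following (table and column names are hypothetical); the CREATE VIEW button described below would wrap it into a view for you:

```python
# Hypothetical starting point for the challenge: cust_id becomes the Partition
# Key and the order date (column two) becomes the Sort Key, enabling range queries.
EXAMPLE_CHALLENGE_QUERY = """
SELECT cust_id,
       DATE_FORMAT(order_date, '%Y-%m-%d') AS order_date,
       ord_id,
       total_amount
  FROM Orders
"""
```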

1. When you have a SQL statement you like, click the CREATE VIEW button.
2. In the prompt, enter a name for your new view. This will add a CREATE VIEW statement to the top of your query.
3. Click RUN SQL to create the new view.
4. Refresh the page, and your view should appear as a button next to the vCustOrders button.

