In this demo, you'll get an overview of the Azure Load Testing service, a managed service that can be used to simulate load on your application's UI and API endpoints.
You'll also get an insight into:
- how to identify an application's breaking point under incrementally increasing load.
- how to leverage server-side metrics and Azure AppInsights to identify the performance bottleneck.
- how to guard your application against performance regressions leveraging load testing in CI/CD pipelines.
All of these are especially crucial for an e-commerce application like Contoso Traders, which is expected to handle a large, sudden spike in the number of users with low latency and no downtime.
Please execute the steps outlined in the deployment instructions to provision the infrastructure in your own Azure subscription.
- In the Azure portal, navigate to the Azure Container App in the `contoso-traders-rg{SUFFIX}` resource group. This is the application that hosts the `Carts API`.
- You can get the URL of the `Carts API` from the Container App's Overview page.
- In a separate browser tab, enter the following URL in the address bar to load the API's Swagger page: `<ACA url>/swagger/index.html`
- You can now identify the API that you want to load test. In this case, we'll be load testing the `Carts API`'s `GET <ACA url>/v1/ShoppingCart/loadtest` endpoint. Note down this endpoint for later use.
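As a small aid for the step above, here is a minimal sketch that derives the Swagger page URL and the load test endpoint from the Container App's base URL. The helper name `carts_api_urls` and the example host name are hypothetical; only the two paths come from this walkthrough.

```python
from urllib.parse import urljoin


def carts_api_urls(aca_url: str) -> dict:
    """Build the Swagger page and load test endpoint URLs from the ACA base URL."""
    # Normalize to exactly one trailing slash so urljoin appends the path.
    base = aca_url.rstrip("/") + "/"
    return {
        "swagger": urljoin(base, "swagger/index.html"),
        "loadtest": urljoin(base, "v1/ShoppingCart/loadtest"),
    }


# Hypothetical base URL; substitute the URL shown for your own Container App.
urls = carts_api_urls("https://example-carts.azurecontainerapps.io")
print(urls["swagger"])
print(urls["loadtest"])
```

The `loadtest` value is the target URL to note down for the load test.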
- In the Azure portal, navigate to the Azure Load Testing service in the `contoso-traders-rg{SUFFIX}` resource group.
- Create a new load test as follows: navigate to the `Tests` section, and then click `Create` > `Create a URL-based Test`.
- In the `Basic` blade, specify the target URL. You can also specify the number of concurrent users and the duration of the test. In this demo, the initial test uses 5 concurrent users, a test duration of 120 seconds, and a ramp-up time of 120 seconds.
  Note: The target URL is the `Carts API` endpoint that you identified in the previous section.
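To make the two main parameters concrete, the following is a rough sketch of what "concurrent users" and "duration" mean for a load generator. It is not how Azure Load Testing is implemented; `run_load` and `fake_request` are hypothetical names, and `fake_request` stands in for a real HTTP GET against the target URL.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(send_request, concurrent_users: int, duration_s: float) -> list:
    """Have each simulated user call send_request repeatedly until the duration elapses."""
    deadline = time.monotonic() + duration_s

    def user_loop() -> list:
        samples = []
        while time.monotonic() < deadline:
            start = time.monotonic()
            ok = send_request()
            # Record (elapsed seconds, success flag) for each request.
            samples.append((time.monotonic() - start, ok))
        return samples

    # One worker thread per simulated user.
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        futures = [pool.submit(user_loop) for _ in range(concurrent_users)]
        return [sample for future in futures for sample in future.result()]


def fake_request() -> bool:
    # Stand-in for an HTTP GET against the target URL (assumption for this sketch).
    time.sleep(0.01)
    return True


samples = run_load(fake_request, concurrent_users=5, duration_s=0.2)
print(len(samples), "requests sent")
```

Each simulated user sends requests back to back, so the total request count grows with both the number of users and the test duration.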
- Once you've entered the load test specifications above, run the test by clicking the `Run` button.
- The load test will take about 2 minutes to complete. Once done, it'll display the summary and client-side metrics.
- Click on the `App Components` button. Then, from the flyout, select the `contoso-traders-carts{SUFFIX}` CosmosDB component. This adds relevant CosmosDB metrics to the load test dashboard.
- Re-run the load test, and you'll see the impact of the synthetic load on the DB in real time.
  Note: Unfortunately, ACA metrics are not yet supported in Azure Load Testing's server-side metrics. This feature will be coming soon.
- In the Azure portal, navigate to the Azure Container App in the `contoso-traders-rg{SUFFIX}` resource group.
- For demo purposes, we have configured an `HTTP Scaling` rule that horizontally scales out additional replicas when the number of concurrent requests exceeds a threshold (`3` in this case). ACA also supports automatic scale-in to zero when traffic dips below the threshold.
- In the metrics tab, you can see the various metrics measured and published by the ACA infrastructure. You can create a metric chart that combines two metrics: `replica count` vs `requests`. The chart will now reflect the latest data from the load test. Of particular interest is the replica count chart of the `Carts API`, which shows that the instances auto-scaled out under increasing load. After the load subsided, the instances auto-scaled back in to zero.
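The relationship between concurrent requests and replica count can be approximated with a simplified model: roughly one replica per `3` concurrent requests, clamped between a minimum and a maximum. This is only an illustration of the scaling rule's intent, not ACA's actual algorithm (which also involves polling intervals and cool-down periods). The function name and the maximum of 10 replicas are assumptions.

```python
import math


def desired_replicas(concurrent_requests: int, threshold: int = 3,
                     min_replicas: int = 0, max_replicas: int = 10) -> int:
    """Simplified model: one replica per `threshold` concurrent requests, clamped."""
    wanted = math.ceil(concurrent_requests / threshold)
    return max(min_replicas, min(max_replicas, wanted))


for load in (0, 3, 4, 12, 100):
    print(load, "concurrent requests ->", desired_replicas(load), "replicas")
```

With no traffic the model returns zero replicas, which matches the scale-in-to-zero behavior visible in the replica count chart.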
- Navigate back to the Load Testing service, and click on the recently concluded test run. From there, click `Download` > `Input File`. This downloads the JMX file in a zip archive.
- You can review the JMX file by opening it in Notepad or VS Code.
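Since a JMX file is XML, it can also be inspected programmatically. The sketch below pulls the thread group settings out of a JMX document. The embedded sample is a hand-written, trimmed-down illustration that uses JMeter's standard `ThreadGroup.*` property names; the file generated by Azure Load Testing may be structured differently.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; not the file downloaded from the service.
SAMPLE_JMX = """<jmeterTestPlan version="1.2">
  <hashTree>
    <ThreadGroup testname="Thread Group">
      <stringProp name="ThreadGroup.num_threads">5</stringProp>
      <stringProp name="ThreadGroup.ramp_time">120</stringProp>
      <stringProp name="ThreadGroup.duration">120</stringProp>
    </ThreadGroup>
  </hashTree>
</jmeterTestPlan>
"""


def thread_group_settings(jmx_text: str) -> dict:
    """Collect the simple ThreadGroup.* properties (users, ramp-up, duration)."""
    root = ET.fromstring(jmx_text)
    settings = {}
    for group in root.iter("ThreadGroup"):
        for prop in group:
            name = prop.get("name", "")
            text = (prop.text or "").strip()
            # Skip nested elements, which carry no direct text value.
            if name.startswith("ThreadGroup.") and text:
                settings[name.split(".", 1)[1]] = text
    return settings


print(thread_group_settings(SAMPLE_JMX))
```

For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring`.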
- The load test results can also be downloaded via the `Download` > `Results` button. This downloads a CSV file (inside a zip archive).
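Once extracted, the results CSV can be summarized with a few lines of Python. This sketch assumes the file follows JMeter's default CSV layout, in which `elapsed` holds the response time in milliseconds and `success` is `true` or `false`; the sample rows are made up for illustration.

```python
import csv
import io

# Made-up rows in JMeter's default CSV layout (subset of columns).
SAMPLE_CSV = """timeStamp,elapsed,label,responseCode,success
1700000000000,120,HTTP Request,200,true
1700000000500,300,HTTP Request,200,true
1700000001000,5100,HTTP Request,500,false
1700000001500,480,HTTP Request,200,true
"""


def summarize(csv_text: str) -> dict:
    """Compute request count, average response time, and error percentage."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    elapsed = [int(row["elapsed"]) for row in rows]
    errors = sum(1 for row in rows if row["success"].lower() != "true")
    return {
        "requests": len(rows),
        "avg_response_time_ms": sum(elapsed) / len(rows),
        "error_percentage": 100.0 * errors / len(rows),
    }


summary = summarize(SAMPLE_CSV)
print(summary)
```

To process a downloaded file, read its contents with `open(path, newline="").read()` and pass the text to `summarize`.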
- Let us now modify the existing load test. We'll use it to put the application under increasing load, ultimately leading to failure. The goal is to identify the application's breaking point (performance bottleneck).
- Modify the existing test configuration as follows:
  - Increase the number of concurrent users to `250` (from the original `5`).
  - Change the test duration to `300` seconds (from the original `120` seconds).
  - Change the ramp-up time to `300` seconds (from the original `120` seconds).
- Increase the number of engine instances to `2` (from the original `1`).
- Run the modified load test. You'll notice that the application eventually starts to fail under the increased load.
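Because the ramp-up time equals the test duration here, the load keeps rising for the entire run. The sketch below estimates how many users are active at a given moment, assuming user start times are spread evenly across the ramp-up period (as JMeter does for a thread group). How the users are divided between the two engine instances is not modeled, and `active_users` is a hypothetical helper.

```python
def active_users(elapsed_s: float, total_users: int = 250, ramp_up_s: float = 300) -> int:
    """Users started after elapsed_s seconds, assuming a linear ramp-up."""
    if ramp_up_s <= 0 or elapsed_s >= ramp_up_s:
        return total_users
    return int(total_users * max(elapsed_s, 0) / ramp_up_s)


for t in (0, 60, 150, 300):
    print(f"t={t:>3}s -> {active_users(t)} users")
```

Reading the time at which errors first appear, and mapping it to a user count this way, gives a rough estimate of the breaking point.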
- If you add the server-side metrics for the `contoso-traders-cartsct{SUFFIX}` CosmosDB, you'll notice that the DB's normalized RU consumption eventually pegs at 100% under load.
- App Insights can help us narrow down the root cause of the error. Navigate to the `contoso-traders-rg{SUFFIX}` resource group, and click on the `contoso-traders-aict{SUFFIX}` resource.
- In the App Insights blade, click on the `Failures` tab. Narrow down the time range to (say) the last 30 minutes. You'll see the listed failures (sampled by App Insights) that occurred during the load test.
- Clicking on any one sample gives you a detailed view of the error (including the stack trace in the case of an exception). In this case, the error is a `500` error caused by a `TaskCanceledException` (due to a gateway timeout in CosmosDB). Together with the saturated RU consumption, this is a good indication that the application is failing because of a performance bottleneck, most likely CosmosDB throughput.
- We have a GitHub workflow that executes load tests on the application's APIs. This workflow is automatically triggered on every check-in to the `main` branch. Specifically, the load tests are run on the `Product API` and `Carts API` immediately after they're deployed to the AKS cluster and ACA respectively. This helps identify whether any code (or infra) change causes the application's performance to degrade under (simulated) load.
- The workflow uses a GitHub Action to invoke the Azure Load Testing service and simulate load on the application's `Product API` and `Carts API`.
- The workflow file references a load test configuration file (YAML), which specifies the following:
  - The load test parameters.
  - The JMX/JMeter script to be used.
  - The pass/fail criteria for the test.

See an example of a load test configuration file below.

    testName: contoso-traders-carts
    testPlan: contoso-traders-carts.jmx
    engineInstances: 1
    failureCriteria:
      - avg(response_time_ms) > 5000
      - percentage(error) > 20
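To illustrate how the failure criteria read, here is a small sketch that evaluates criteria strings of this shape against measured values. It is not the service's own evaluator: the parser handles only the simple `aggregate(metric) operator value` form, and the measurements are hypothetical.

```python
import operator
import re

# Matches strings such as "avg(response_time_ms) > 5000".
CRITERION = re.compile(r"^\s*(\w+)\((\w+)\)\s*(>=|<=|>|<)\s*([\d.]+)\s*$")
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}


def failed_criteria(criteria: list, metrics: dict) -> list:
    """Return the criteria whose condition holds, i.e. the ones that fail the test."""
    failed = []
    for text in criteria:
        match = CRITERION.match(text)
        if match is None:
            raise ValueError(f"unsupported criterion: {text!r}")
        aggregate, metric, op, threshold = match.groups()
        value = metrics[(aggregate, metric)]
        if OPS[op](value, float(threshold)):
            failed.append(text)
    return failed


criteria = ["avg(response_time_ms) > 5000", "percentage(error) > 20"]
# Hypothetical measurements from a test run.
metrics = {("avg", "response_time_ms"): 6200.0, ("percentage", "error"): 3.5}
result = failed_criteria(criteria, metrics)
print(result)
```

A criterion describes a failure condition, so the run passes only when none of the conditions hold.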
- The load test takes about 3 minutes to execute. In this specific example, you can see that the load test failed because the average response time exceeded the specified threshold of 5000 ms.
- Once done, you can navigate to the Azure portal to get more in-depth details about the test.
In this demo, you got an overview of the Azure Load Testing service, including how to create a load test, run it, and review the results. You also saw how to incorporate server-side metrics from Azure services, and how to export the JMX file and results. Finally, you saw how to create a new load test from the JMX file, and how to use a GitHub workflow to execute load tests on the application's APIs.