JMeter Testing for Custos Deployment
- We performed stress testing and load testing on the Custos microservices using JMeter. Stress testing aims to find the failure point of the Custos services: we keep increasing the number of concurrent user threads until the service fails. Load testing inspects how the system performs when it is subjected to a significant load of requests over a period of time.
- We first identified the point of failure for each microservice/management client by increasing the concurrent load on it. All microservices/endpoints handled a load of 100 concurrent threads well.
- We then increased the concurrent thread count in increments of 50 and noted the point at which requests started failing and the error rate rose significantly.
- After noting the maximum concurrent thread count each service could handle, we increased the loop count in JMeter for stress testing. We stress tested the services with between 2500 and 5000 requests (ramp-up period of 1 second), with 100-150 concurrent threads running.
- Find Users by Limit
- Create User
- Update User
- Create Group
- Add User to Group
- Create Entity
- Share Entity with Users
- Share Entity with Groups
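The step-wise search for a failure point described in the approach above can be sketched as follows. This is a minimal illustration, not the procedure the team scripted: `find_failure_point`, `run_load`, `error_threshold`, and the stand-in `fake_run` are hypothetical names, and the 5% threshold is an assumption. In practice `run_load` would launch a JMeter run at the given thread count and return the observed error rate.

```python
from typing import Callable, Optional


def find_failure_point(
    run_load: Callable[[int], float],
    start: int = 100,            # thread count all endpoints handled well
    step: int = 50,              # increment used between runs
    max_threads: int = 1000,     # assumed upper bound for the search
    error_threshold: float = 0.05,  # assumed cut-off for "significant" errors
) -> Optional[int]:
    """Return the first thread count whose error rate exceeds the threshold.

    Returns None if no tested thread count crosses the threshold.
    """
    threads = start
    while threads <= max_threads:
        error_rate = run_load(threads)
        if error_rate > error_threshold:
            return threads
        threads += step
    return None


def fake_run(threads: int) -> float:
    """Synthetic stand-in for a real load run (made-up numbers, not measurements)."""
    return 0.0 if threads <= 200 else 0.8


if __name__ == "__main__":
    print(find_failure_point(fake_run))
```

With the synthetic runner, the search stops at the first step above 200 threads. A real runner would replace `fake_run` and return the fraction of failed samples for that run.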
Tested Functionality: find_users() by limit
- The find_users() gRPC servicer in the user management client was able to handle a load of up to 150 concurrent threads with 0% error and 200 concurrent threads with a low error rate of 1.16%.
- However, when we increased the load to 300 concurrent threads, the error rate went up to 82% as the pods started restarting after processing just 30 requests.
- Hence, the failure point for this endpoint lies between 200 and 300 concurrent requests.
Figure: with a load of 300 concurrent requests, pods started restarting after processing just 30 requests (~82% error rate).
Tested Functionality: register_user(), enable_user(), add_user_to_group(), share_entity_with_users()
- The gRPC servicer endpoints for register_user(), enable_user(), add_user_to_group(), and share_entity_with_users() performed well under concurrent request loads of 100, 150, and 200, but started failing after 250 requests.
- The error rate spiked from 2% at 200 to 78% at 300.
Error rate increased from 2% at 200 requests to 78% at 300.
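When several operations are exercised in one test plan, a per-operation error rate is easier to compare than a single aggregate. The sketch below assumes the JMeter results were saved in CSV format with a header row containing the `label` and `success` columns; the function name `error_rate_by_label`, the sample rows, and the sampler labels are illustrative and not taken from the actual test plan.

```python
import csv
import io
from collections import defaultdict
from typing import Dict


def error_rate_by_label(csv_text: str) -> Dict[str, float]:
    """Compute the error rate (in percent) for each sampler label."""
    totals: Dict[str, int] = defaultdict(int)
    failures: Dict[str, int] = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row["label"]
        totals[label] += 1
        # JMeter writes "true"/"false" in the success column
        if row["success"].strip().lower() != "true":
            failures[label] += 1
    return {label: 100.0 * failures[label] / totals[label] for label in totals}


# Synthetic rows for illustration only (not measured data)
SAMPLE = """timeStamp,elapsed,label,responseCode,success
1,120,register_user,200,true
2,130,register_user,500,false
3,110,enable_user,200,true
4,115,enable_user,200,true
"""

if __name__ == "__main__":
    for label, rate in error_rate_by_label(SAMPLE).items():
        print(f"{label}: {rate:.2f}%")
```

For a real run, the contents of the results file would be read and passed in place of `SAMPLE`.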
Tested Functionality: update_user_profile()
With a load of 300 concurrent requests, failures started appearing after 100 requests had been processed.
Tested Functionality: create_group()
The failure point was seen at a load of 300 concurrent requests, after 62 requests had been processed successfully.
Tested Functionality: add_user_to_group()
- With 150 concurrent requests, the error rate was around 0.67%.
- With 300 concurrent requests, the error rate was around 6.67%.
- With 1000 concurrent requests, the error rate spiked up to ~70%.
Figures: JMeter results for 150, 300, and 1000 concurrent requests.
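To put the percentages for add_user_to_group() in perspective, they can be converted back to approximate failure counts. The sketch below assumes each concurrent thread issued exactly one request (loop count of 1), which the source does not state; `implied_failures` is a hypothetical helper.

```python
def implied_failures(total_requests: int, error_rate_percent: float) -> int:
    """Approximate number of failed requests implied by an error rate."""
    return round(total_requests * error_rate_percent / 100)


if __name__ == "__main__":
    # (total requests, reported error rate in percent)
    for total, rate in [(150, 0.67), (300, 6.67), (1000, 70.0)]:
        print(total, rate, implied_failures(total, rate))
```

Under that assumption, 0.67% of 150 requests corresponds to a single failed request, 6.67% of 300 to 20 failures, and about 70% of 1000 to roughly 700 failures.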
Tested Functionality: create_entity()
- With 150 concurrent requests, the error rate was around 1.33%.
- With 300 concurrent requests, the error rate spiked up to ~44%.
Figures: JMeter results for 150 and 300 concurrent requests.
Tested Functionality: share_entity_with_users()
With 300 concurrent requests, a breaking point was observed after 72 requests were processed.
A large batch of requests failed after 72.
Tested Functionality: share_entity_with_groups()
With a load of 300 concurrent threads, requests started failing after 172 had been processed.
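Several results above are stated as "failures started after N requests were processed". A minimal sketch of how such a number could be derived from an ordered list of per-request outcomes is shown below. The helper `successes_before_first_failure` is hypothetical, and the ordering (for example by completion timestamp) is an assumption; with concurrent threads the exact order is only approximate.

```python
from typing import Iterable, Optional


def successes_before_first_failure(outcomes: Iterable[bool]) -> Optional[int]:
    """Count the requests that succeeded before the first failure.

    `outcomes` holds one boolean per request (True = success), in the
    order the requests completed. Returns None if nothing failed.
    """
    for index, ok in enumerate(outcomes):
        if not ok:
            return index
    return None


if __name__ == "__main__":
    # Synthetic outcomes for illustration only
    print(successes_before_first_failure([True, True, True, False, True]))
```

Later successes are ignored on purpose: the value marks where the first failure occurred, not the total number of successful requests.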