An example that showcases the benefit of running AI inside Redis.
This repository is derived from the Chat Bot Demo. I recommend checking out the parent repo for an overview of RedisAI.
Instead of invoking models directly in the APIs, our solution routes each task to the most efficient model across all RedisAI instances, and allows locking if needed. This makes the system horizontally scalable. The flow can be described by the steps below:
- Step 1: The API receives a user request, pushes it to the 'Resource Manager Queue', and waits for the response on a Redis Pub/Sub channel keyed by task_id (see the API-side sketch after this list).
- Step 2: A worker picks up the task and finds the best host/model. If a model is found, it creates a new invoke task and pushes it to the invoke queue along with the model and host information. If not, the task is retried until a model becomes available.
- Step 3: Workers pick up the invoke task and process it. The invocation consists of multiple tensor-feed and model-run calls to RedisAI, depending on the AI model that needs to run (see the worker sketch after this list). When finished, the result is pushed back to the API via Redis Pub/Sub.
- Step 4: The API returns the response to the user if it is the final response.
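Steps 1 and 4 (the API side) can be sketched roughly as below. This is a minimal illustration, not the repository's actual code: the queue name `resource_manager_queue`, the `task:<task_id>` channel naming, and the direct redis-py usage are assumptions (the repo itself dispatches work through Celery).

```python
# Minimal sketch of Steps 1 and 4 (API side), assuming redis-py.
# Queue name and channel naming scheme are illustrative, not the repo's exact keys.
import json
import uuid

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def handle_chat(message: str, timeout: float = 30.0) -> dict:
    task_id = str(uuid.uuid4())
    channel = f"task:{task_id}"          # hypothetical per-task Pub/Sub channel

    # Subscribe before enqueueing so the worker's publish cannot be missed.
    pubsub = r.pubsub()
    pubsub.subscribe(channel)

    # Step 1: push the task onto the resource-manager queue (hypothetical key).
    r.lpush("resource_manager_queue",
            json.dumps({"task_id": task_id, "message": message}))

    # Step 4: wait for the worker to publish the (final) response.
    try:
        while True:
            msg = pubsub.get_message(ignore_subscribe_messages=True, timeout=timeout)
            if msg is None:
                return {"task_id": task_id, "error": "timed out waiting for response"}
            payload = json.loads(msg["data"])
            if payload.get("final"):     # only return once the final response arrives
                return payload
    finally:
        pubsub.unsubscribe(channel)
        pubsub.close()
```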
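Step 3's model invocation could look like the following sketch using the redisai-py client. The model key, tensor keys, input payload, and host/port fields are placeholders, not the repo's real preprocessing or models.

```python
# Minimal sketch of Step 3 (invoke worker), assuming the redisai-py client.
# Model key, tensor keys, and input payload are placeholders for illustration only.
import json

import numpy as np
import redis
import redisai as rai

def run_invoke_task(task: dict) -> None:
    # Step 2 is assumed to have attached the chosen host/model to the task.
    ai = rai.Client(host=task["host"], port=task.get("port", 6379))
    broker = redis.Redis(host="localhost", port=6379)  # hypothetical broker for Pub/Sub replies

    # Feed the input tensor, then run the selected model on that RedisAI instance.
    input_key = f"{task['task_id']}:in"
    output_key = f"{task['task_id']}:out"
    ai.tensorset(input_key, np.array(task["tokens"], dtype=np.float32))
    ai.modelrun(task["model_key"], inputs=[input_key], outputs=[output_key])
    result = ai.tensorget(output_key)

    # Step 4: push the result back to the waiting API over Pub/Sub.
    broker.publish(f"task:{task['task_id']}",
                   json.dumps({"final": True, "result": result.tolist()}))
```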
- Docker
- Docker-compose
$ git clone https://github.com/phamvanvuhb1999/workhorse.git
$ cd workhorse
$ docker-compose up
Try out the API (we have only one API endpoint -> `/chat`, which accepts `message` as the JSON key with your message as the value) using `curl` as shown below.
curl http://localhost:5000/chat -H "Content-Type: application/json" -d '{"message": "I am crazy"}'
After that, you can try the `/orc` API to see the resource manager implemented with Celery workers.
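For example (assuming `/orc` accepts the same JSON body as `/chat`; check the route definitions in the repo for the exact payload):

curl http://localhost:5000/orc -H "Content-Type: application/json" -d '{"message": "I am crazy"}'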
Open a browser and point it to http://localhost:5000; you'll see the chat view kept from the parent repo.