
HIL (redis) #52

Closed
wants to merge 14 commits into from

Conversation

@jankrepl (Collaborator) commented Nov 19, 2024:

How to test?

First, spin up Redis:

$ docker run -p 6380:6379 -d redis

Then add the following entry to your .env:

NEUROAGENT_HIL__REDIS_URI=redis://localhost:6380

I added a get-now tool; it is the only tool with hil: True.
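A minimal sketch of what such a flag could look like (the class and attribute names here are illustrative, not the actual definitions in this repo):

```python
# Hypothetical sketch of a HIL-flagged tool; the real tool classes in
# this codebase may look different.
from datetime import datetime, timezone


class GetNowTool:
    """Toy tool that returns the current UTC time."""

    name = "get-now"
    hil = True  # execution pauses until a human approves or declines

    async def arun(self) -> str:
        # Return the current time as an ISO-8601 string.
        return datetime.now(timezone.utc).isoformat()
```

Then trigger it: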

$ curl -X POST -H "content-type: application/json" -H "x-project-id: $PROJECT_ID" -H "x-virtual-lab-id: $VIRTUAL_LAB_ID" -H "Authorization: Bearer $TOKEN" -d '{"query": "Could you please call the get-now tool three times at the same time"}' 'localhost:8001/qa/run/' | jq

You can then use the endpoints of the approvals router to view and approve/decline outstanding requests.
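For example (the route shapes below are assumptions, not verified against the code; check the approvals router for the exact paths):

$ curl -H "Authorization: Bearer $TOKEN" 'localhost:8001/approvals/' | jq
$ curl -X PATCH -H "content-type: application/json" -H "Authorization: Bearer $TOKEN" -d '{"decision": "approved"}' 'localhost:8001/approvals/<approval_id>'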

@jankrepl marked this pull request as draft on November 19, 2024, 09:20.
```diff
@@ -230,7 +232,7 @@ async def get_vlab_and_project(
 def get_starting_agent(
-    _: Annotated[None, Depends(get_vlab_and_project)],
+    # _: Annotated[None, Depends(get_vlab_and_project)],
```
@jankrepl (Collaborator, Author):

I only did this because the token-generating service was down at the time of writing this PR.

We should not merge this change.

```diff
@@ -21,7 +21,7 @@ async def stream_agent_response(
 ) -> AsyncIterator[str]:
     """Redefine fastAPI connections to enable streaming."""
     if isinstance(agents_routine.client, AsyncOpenAI):
-        connected_agents_routine = AgentsRoutine(
+        connected_agents_routine = AgentsRoutine(  # type: ignore[call-arg]
```
@jankrepl (Collaborator, Author):

I did not implement the logic here. Nothing would really change; I just did not want to overcomplicate this POC.

@jankrepl marked this pull request as ready for review on November 19, 2024, 12:34.
@WonderPG (Contributor) commented:

Similar to Phantasm in terms of the idea (besides the approval dashboard); however, there is still the issue of the execution hanging. The hanging time is bounded by the TTL, which for now is set to 10 minutes by default. In my opinion, issues will arise once we start having more users: with more traffic we will have more and more concurrent hanging executions. To limit that we would have to reduce the TTL, which still needs a lower bound so that a user has enough time to read and modify their configuration.

Also, thinking about the future: if we implement the queue mechanism to pick up requests, a request will block the queue for as long as the user wants, up to the TTL limit. This is both an advantage and a disadvantage: if your request is picked up, you should be able to execute it to the end without going back to the queue, but you also block other requests that could run while you do your approval. This matters mostly for larger simulations, since I assume you will want to double-check/do some research before clicking the "Yes, I want to spend a ton of money" button. Speaking of which, a larger TTL might also be necessary for bigger simulations, since you will want to quadruple-check before ruining your vlab.

On the plus side, this solution is definitely cleaner than ours. I'm pretty sure we could refactor ours big time, but it would remain some sort of mess regardless.

Also, a couple of side questions:

  • Is it okay for the frontend to wait for the answer to the user's request while sending approval requests on the side?
  • What will happen if the user clicks on another conversation while polling for approval and then goes back to the original one?
  • How does the frontend know when there is something awaiting approval? Does it have to regularly query the GET-all endpoint to find out?
  • What if the user wants to list the tools used to generate an AI message and one of the tool calls was modified by HIL? Will the user see the wrong inputs? We have also seen that this can be an issue for the model, e.g. ask to resolve the Thalamus id -> HIL -> change Thalamus to auditory cortex -> the model says "I found the id of the thalamus" while showing the id of the auditory cortex. Modifying the inputs of the AI_TOOL message apparently solved the problem.
  • Aren't we hanging resources on the frontend as well?

Anyway, I tried to be super critical, but it is definitely a great solution and I like it. Simple and efficient. Thank you for that!

@WonderPG (Contributor) commented:

Ah, I see that modifying the input is not possible for now. I guess the patch endpoint will handle it somehow?

@jankrepl (Collaborator, Author) commented Nov 19, 2024:

> In my opinion, issues will arise once we start having more users: with more traffic we will have more and more concurrent hanging executions.

Possibly. Honestly, I am not able to say at this point what is going to happen when there are a lot of requests. We could alternatively go for the pub/sub model (Redis supports it), but I guess there might be other downsides.
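For reference, a rough sketch of what the pub/sub variant could look like with redis-py's asyncio client (the channel naming and decision strings are assumptions):

```python
# Sketch only: block on a Redis pub/sub channel instead of polling a key.
import redis.asyncio as redis


async def wait_for_decision(user_id: str, approval_id: str) -> str:
    client = redis.Redis.from_url("redis://localhost:6380")
    pubsub = client.pubsub()
    # Channel name mirrors the approval_key scheme used elsewhere in this PR.
    await pubsub.subscribe(f"approval/{user_id}/{approval_id}")
    # Blocks until the approvals endpoint publishes a decision.
    async for message in pubsub.listen():
        if message["type"] == "message":
            return message["data"].decode()  # e.g. "approved" or "declined"
    raise RuntimeError("pub/sub stream closed without a decision")
```

The obvious trade-off: the request handler still hangs while waiting; it just avoids the poll interval.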

> A larger TTL might also be necessary for bigger simulations, since you will want to quadruple-check before ruining your vlab.

Yeah. Very likely. Agreed.

> Is it okay for the frontend to wait for the answer to the user's request while sending approval requests on the side?

Sure. Frontends do everything async, so this isn't a problem.

> What will happen if the user clicks on another conversation while polling for approval and then goes back to the original one?

Currently the approvals are identified via approval_key=approval/<userid>/<approvalid>, so they are "global" for a given user and there is no concept of threads (= conversations). Of course, this could be customized.
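A minimal sketch of that key scheme, assuming redis-py and the 10-minute default TTL mentioned above (the payload fields are illustrative):

```python
# Sketch of the approval key scheme; the payload fields are illustrative.
import json
import uuid

import redis.asyncio as redis


async def create_approval(client: redis.Redis, user_id: str, tool_call: dict) -> str:
    approval_id = uuid.uuid4().hex
    key = f"approval/{user_id}/{approval_id}"
    # Expire after 10 minutes, the default TTL discussed above.
    await client.set(key, json.dumps(tool_call), ex=600)
    return key
```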

> How does the frontend know when there is something awaiting approval? Does it have to regularly query the GET-all endpoint to find out?

Exactly. The frontend would indeed have to poll the approvals/ endpoint every X milliseconds: https://developer.mozilla.org/en-US/docs/Web/API/Window/setInterval
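Sketched in Python for illustration (a browser frontend would do the equivalent with setInterval and fetch; the endpoint path is assumed from this thread):

```python
# Illustrative polling loop; a real frontend would use setInterval + fetch.
import time

import requests  # third-party: pip install requests


def poll_approvals(token: str, interval_s: float = 2.0) -> list:
    """Poll the approvals endpoint until something awaits approval."""
    while True:
        resp = requests.get(
            "http://localhost:8001/approvals/",  # path assumed from this thread
            headers={"Authorization": f"Bearer {token}"},
        )
        resp.raise_for_status()
        pending = resp.json()
        if pending:
            return pending
        time.sleep(interval_s)
```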

> What if the user wants to list the tools used to generate an AI message and one of the tool calls was modified by HIL? Will the user see the wrong inputs? We have also seen that this can be an issue for the model, e.g. ask to resolve the Thalamus id -> HIL -> change Thalamus to auditory cortex -> the model says "I found the id of the thalamus" while showing the id of the auditory cortex. Modifying the inputs of the AI_TOOL message apparently solved the problem.

This POC does not allow for modifications. I guess they are useful, and Phantasm does support them.

> Aren't we hanging resources on the frontend as well?

I think the frontend -> backend polling is not a problem. Backend -> Redis might be more prone to issues, but I am not sure.

> Anyway, I tried to be super critical, but it is definitely a great solution and I like it. Simple and efficient. Thank you for that!

:)) Thanks. Again, it was just a suggestion. Happy to change it / not use it.


@jankrepl (Collaborator, Author) commented Jan 9, 2025:

Closing this one since it is not the approach we decided to go for.

@jankrepl closed this on Jan 9, 2025.