
Add ADR 005: Publishing API debouncing #230

Open · csutter wants to merge 1 commit into main
Conversation

csutter (Contributor) commented Mar 5, 2024

No description provided.

leenagupte (Contributor) left a comment
I've left one inline comment.

It's annoying that we have to do this, but this does seem like the simplest option to stop hammering Discovery Engine.

docs/adr/005-publishing-api-debounce.md (outdated)
Comment on lines +86 to +95
Upon receiving a message, a worker would:
- use the [`redlock-rb` gem][redlock-gem] to acquire a lock of `lock:<content_id>` with a sensible timeout
- if the lock currently exists, [wait for it to be available again](#addendum-how-to-wait-for-lock)
- check `latest:<content_id>` to see if a newer message has recently been successfully processed
  - if so, do nothing and go straight to releasing the lock
- process the message (update or delete the document in Discovery Engine)
  - if processing fails, handle as usual (requeue or discard)
  - otherwise use `SET latest:<content_id> <version>` to update the latest processed version
- release the lock
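
A minimal Ruby sketch of that flow, assuming the `redis` and `redlock` (redlock-rb) gems; `process_message`, the connection setup, and the 10-second TTL are illustrative stand-ins rather than details from the ADR:

```ruby
require "redis"
require "redlock"

REDIS = Redis.new(url: ENV.fetch("REDIS_URL", "redis://localhost:6379"))
LOCK_MANAGER = Redlock::Client.new([ENV.fetch("REDIS_URL", "redis://localhost:6379")])
LOCK_TTL_MS = 10_000 # a "sensible timeout"; the actual value isn't specified here

def handle_message(content_id, version)
  # Acquire lock:<content_id>; if another worker holds it, wait until it's free
  lock_info = nil
  loop do
    lock_info = LOCK_MANAGER.lock("lock:#{content_id}", LOCK_TTL_MS)
    break if lock_info
    sleep 0.1
  end

  begin
    # If a newer (or equal) version has already been processed, skip straight
    # to releasing the lock
    latest = REDIS.get("latest:#{content_id}")
    return if latest && latest.to_i >= version

    # Stand-in for the Discovery Engine update/delete; if it raises, the
    # message is requeued or discarded upstream as usual
    process_message(content_id, version)

    REDIS.set("latest:#{content_id}", version)
  ensure
    # Always release the lock, whether we processed, skipped or raised
    LOCK_MANAGER.unlock(lock_info)
  end
end
```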
leenagupte (Contributor) commented:

Is there still a scenario here where we could exceed Discovery Engine's rate
limit if the messages come in quickly in the correct order? Say:

| Time (ms) | Payload version | Content ID |
| --- | --- | --- |
| 0 | 0 | cafebabe |
| 10 | 1 | cafebabe |
| 20 | 2 | cafebabe |
| 30 | 3 | cafebabe |
| 40 | 4 | cafebabe |
| 50 | 5 | cafebabe |

If each of those messages gets processed quickly enough, we'll hit the rate limit.

Do we want to put some kind of deliberate delay in to handle that case?

For example, if we made the locks time out after 1 second, and then just didn't
release them explicitly, that would throttle processing to one message per
second.

If we do release the locks explicitly, then this won't necessarily slow down
the processing of sequential messages by much.
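
A sketch of that throttled variant, reusing the names from the sketch above (again illustrative, not the actual implementation):

```ruby
THROTTLE_TTL_MS = 1_000 # the lock expires on its own after 1 second

def handle_message_throttled(content_id, version)
  lock_info = nil
  loop do
    lock_info = LOCK_MANAGER.lock("lock:#{content_id}", THROTTLE_TTL_MS)
    break if lock_info
    sleep 0.1
  end

  latest = REDIS.get("latest:#{content_id}")
  return if latest && latest.to_i >= version

  process_message(content_id, version)
  REDIS.set("latest:#{content_id}", version)
  # No explicit unlock: the next message for this content_id can only proceed
  # once the 1-second TTL expires, capping throughput at one message per
  # content ID per second.
end
```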

csutter (Contributor, Author) replied:

Good shout - with the caveat that during peak times, the latency of an upsert API call to Vertex is about 1s anyway, and only parallel requests are a problem. But an enforced delay of sorts would probably still help us during off-peak times, when it's 0.1s.
