highly-recursive-do

Prompt:

I want to create a Cloudflare Worker with a RecursiveFetcherDO:

The worker fetch handler takes a GET request with an ?amount param and generates that many URLs pointing to https://test.github-backup.com?random=${Math.random()}

It sends those to the RecursiveFetcherDO, which should ultimately return a mapped object {[status:number]:number} that counts the statuses of the requests (or 500 if the DO creation failed)

The DO will do 2 things:

1. If it takes in more than 1 URL, it will chunk the URL array into up to 5 chunks (this number is a configurable constant we want to experiment with), create a new instance of RecursiveFetcherDO for each chunk, and finally aggregate the mapped status objects.
2. If it takes just 1 URL, it will fetch that URL and return { [status:number]: 1 }, or { 500: 1 } if it crashes.

Please implement this in Cloudflare Workers, in TypeScript.
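The chunk-and-aggregate logic the prompt asks for can be sketched as two plain helpers (hypothetical names and shapes, outside the actual DO wiring):

```typescript
// Configurable fan-out: each DO splits its URL list into at most this many chunks.
const BRANCHES_PER_LAYER = 5;

type StatusCounts = { [status: string]: number };

// Split `urls` into up to `branches` roughly equal chunks.
function chunkUrls(urls: string[], branches = BRANCHES_PER_LAYER): string[][] {
  const size = Math.ceil(urls.length / branches);
  const chunks: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    chunks.push(urls.slice(i, i + size));
  }
  return chunks;
}

// Merge the { status: count } maps returned by child DOs into one total.
function aggregate(results: StatusCounts[]): StatusCounts {
  const total: StatusCounts = {};
  for (const counts of results) {
    for (const [status, count] of Object.entries(counts)) {
      total[status] = (total[status] ?? 0) + count;
    }
  }
  return total;
}
```

Each DO would call `chunkUrls` on its input, spawn one child DO per chunk, and `aggregate` the children's responses on the way back up.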

First I did a naive implementation as described above, without exponential backoff.

Results with BRANCHES_PER_LAYER of 2, meaning it is highly recursive:

Results with BRANCHES_PER_LAYER of 3 (We won't hit a depth of 15):

  • 50000 requests: {"result":{"200":49270,"500 - Failed to fetch self - Your account is generating too much load on Durable Objects. Please back off and try again later.":474,"503:error code: 1200":253,"500 - Failed to fetch self":3},"duration":17462}
  • 3^10 requests (59049): {"result":{"200":58138,"500 - Failed to fetch self - Your account is generating too much load on Durable Objects. Please back off and try again later.":911},"duration":10784}.
  • 250000 requests: {"result":{"200":115023,"503:error code: 1200":3915,"500 - Failed to fetch self - Your account is generating too much load on Durable Objects. Please back off and try again later.":131061,"500 - Failed to fetch self":1},"duration":16153}

After this, I implemented exponential backoff, as can be seen in the current implementation. The results show it's very stable for 100k requests:
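The backoff itself is a standard retry loop; a minimal sketch (the attempt count, base delay, and retry predicate here are illustrative, not the exact values used in the implementation):

```typescript
// Retry an async operation with exponential backoff plus jitter.
// `shouldRetry` decides whether a given failure (e.g. the Durable Objects
// "too much load" error) is worth retrying at all.
async function withBackoff<T>(
  op: () => Promise<T>,
  shouldRetry: (err: unknown) => boolean,
  maxAttempts = 6,
  baseDelayMs = 100,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastErr = err;
      if (!shouldRetry(err) || attempt === maxAttempts - 1) break;
      // Double the delay each attempt; random jitter desynchronizes the
      // thousands of DOs that would otherwise all retry at the same moment.
      const delay = baseDelayMs * 2 ** attempt + Math.random() * baseDelayMs;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastErr;
}
```

Wrapping the child-DO fetch in something like this is what turns the "back off and try again later" errors from the naive runs into eventual successes.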

{"result":{"200":100000},"duration":13895}
{"result":{"200":100000},"duration":11600}
{"result":{"200":100000},"duration":12396}
{"result":{"200":100000},"duration":12901}
{"result":{"200":100000},"duration":56135}
{"result":{"200":100000},"duration":13297}
{"result":{"200":100000},"duration":14078}
{"result":{"200":100000},"duration":37302}
{"result":{"200":100000},"duration":12484}
{"result":{"200":100000},"duration":65307}

100k fetch responses in as little as 11.6 seconds: that's an impressive feat!

With 1M requests it takes forever to respond, so there must be better things we can do; if we could handle concurrency better, it might work. Nevertheless, that's not the purpose of this experiment. My goal was to do 100k requests as fast as possible, and that succeeds with decent-looking reliability.

If we wanted to control max concurrency, we could have just used queues and carefully ramped it up. This shows we can also immediately instantiate 100k DOs.
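For comparison, a queue-free way to cap concurrency from the worker side would be a simple promise pool (a hypothetical helper, not part of this experiment's code):

```typescript
// Run `tasks` with at most `limit` promises in flight at once.
async function promisePool<T>(
  tasks: (() => Promise<T>)[],
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;
  // Each worker pulls the next unclaimed task index until none remain.
  // `next++` is safe here because JS task scheduling is single-threaded.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++;
      results[i] = await tasks[i]();
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, tasks.length) }, () => worker()),
  );
  return results;
}
```

Ramping `limit` up gradually would give the same controlled load as a queue, at the cost of keeping the coordinating invocation alive for the whole run.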