An earlier iteration used min/max numeric values for memory size, with a separate string for the units.

    memory:
      min: 21
      max: 24
      units: "GiB"

The client set the minimum value, and the server set the maximum value. That worked, but it kind of mangled the meaning of min and max.
The current design represents memory size as string values for requested and offered, with units in the string.

    memory:
      requested: "21GiB"
      offered: "24GiB"

That's nice for a human to write, but crappy for a machine to read, because the service has to parse the number and the units back out of the string. Having seen this in action, it is probably better to split this back into a numeric value and a separate string for the units.
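To illustrate the extra work the string form creates, here is a minimal sketch of the parsing step a service would need. The function name `parse_size` and the accepted format (an integer followed by a unit name) are assumptions for illustration, not part of the current design.

```python
import re

# Hypothetical pattern: an integer followed by a unit name, e.g. "21GiB".
_SIZE_PATTERN = re.compile(r"^\s*(\d+)\s*([A-Za-z]+)\s*$")


def parse_size(text: str) -> tuple[int, str]:
    """Split a size string such as "21GiB" into (21, "GiB")."""
    match = _SIZE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a valid memory size: {text!r}")
    return int(match.group(1)), match.group(2)
```

With a numeric value and a separate units field, this step (and its error handling) is not needed at all.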
The meaning of requested and offered is clearer than the original min and max, but again, having seen this in action, I'm questioning why we need the separate values.
So it might be better to simplify it to a single value.

    memory:
      size: 21
      units: "GiB"

However, while working on the allocation algorithm we have found a use case that uses the max and min fields as actual maximum and minimum values.
If the user requests multiple compute resources, the resource allocation algorithm doesn't do a separate evaluation for each compute resource. The service combines the memory and cpu core requirements for all of the compute resources into total values for the whole set. It then runs the allocation algorithm based on those totals, looking for blocks that have at least that amount of resources available. If it finds a block that has more resources available, it will create an offer distributing them across the compute nodes based on the ratio of how much they requested.
For example, suppose the user asks for two compute nodes: a head-node with 4GiB of memory and a compute-node with 16GiB of memory.
The allocation algorithm will add these together and look for a block of resources that has at least 20GiB of memory.
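The totalling and search step can be sketched as follows. This is a simplified illustration that only considers memory (the real algorithm also combines CPU cores), and the names `total_memory` and `find_block`, as well as the representation of a block as a plain number of GiB, are assumptions.

```python
from typing import Optional


def total_memory(requests: dict[str, int]) -> int:
    """Sum the requested memory (in GiB) across all compute nodes."""
    return sum(requests.values())


def find_block(requests: dict[str, int], blocks: list[int]) -> Optional[int]:
    """Return the first block with at least the total requested memory."""
    needed = total_memory(requests)
    for block in blocks:
        if block >= needed:
            return block
    return None  # no block is large enough


requests = {"head-node": 4, "compute-node": 16}
block = find_block(requests, [8, 16, 64])  # 20GiB needed, so 64 is chosen
```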
If the search algorithm finds a block with 64GiB of memory, it will distribute this across the nodes based on the ratio of the requested sizes, rounding down to whole GiB. That allocates 4 * (64 / 20) = 12.8, so 12GiB, to the head-node and 16 * (64 / 20) = 51.2, so 51GiB, to the compute-node.
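A minimal sketch of this proportional split, assuming sizes are whole GiB and fractions are rounded down (which matches the 12GiB and 51GiB figures). The function name `distribute` is hypothetical.

```python
def distribute(requests: dict[str, int], available: int) -> dict[str, int]:
    """Share `available` GiB across nodes in proportion to their requests."""
    total = sum(requests.values())
    if total <= 0:
        raise ValueError("requests must add up to a positive size")
    if available < total:
        raise ValueError("block is smaller than the total requested")
    # Integer arithmetic: multiply first, then floor-divide by the total.
    return {name: (size * available) // total for name, size in requests.items()}


offers = distribute({"head-node": 4, "compute-node": 16}, 64)
# offers == {"head-node": 12, "compute-node": 51}
```

Because of the rounding, the offers add up to 63GiB rather than the full 64GiB.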
There is a use case where the head-node is just running a simple UI and the user wants as much as possible of the memory to be allocated to the compute-node. To meet this use case we can add the min and max values discussed earlier, using the max value to limit the amount of memory to allocate to the head-node and leaving the compute-node unlimited.
The service would allocate the extra memory up to the limit on the head-node, and then allocate the rest to the compute-node.

    resources:
      compute:
        - name: head-node
          memory:
            min: 4GiB
            max: 4GiB # limit set by the client
        - name: compute-node
          memory:
            min: 16GiB
            max: 51GiB # extra allocated by the service

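One possible reading of "allocate up to the limit, then give the rest to the other nodes" is sketched below. It is an assumption about how the redistribution could work, not a description of the prototype: nodes whose proportional share would reach their max are fixed at that max, and the remaining memory is shared proportionally among the nodes that are left. The function name `distribute_with_limits` is hypothetical, and it assumes every max is at least the corresponding min.

```python
from typing import Optional


def distribute_with_limits(
    requests: dict[str, int],
    limits: dict[str, Optional[int]],
    available: int,
) -> dict[str, int]:
    """Proportional split in GiB that respects optional per-node maximums."""
    if available < sum(requests.values()):
        raise ValueError("block is smaller than the total requested")
    offers: dict[str, int] = {}
    remaining = available
    open_nodes = dict(requests)  # nodes that have not been capped yet
    while open_nodes:
        total = sum(open_nodes.values())
        # Nodes whose proportional share would reach their client-set max.
        capped = [
            name
            for name, size in open_nodes.items()
            if limits.get(name) is not None
            and (size * remaining) // total >= limits[name]
        ]
        if not capped:
            for name, size in open_nodes.items():
                offers[name] = (size * remaining) // total
            break
        for name in capped:
            offers[name] = limits[name]
            remaining -= limits[name]
            del open_nodes[name]
    return offers


offers = distribute_with_limits(
    {"head-node": 4, "compute-node": 16},
    {"head-node": 4},  # only the head-node has a client-set max
    64,
)
```

Note that under this reading a 64GiB block leaves 60GiB for the compute-node once the head-node is capped at 4GiB, which is more than the 51GiB proportional share shown in the example above. Which of the two behaviours is intended is not settled here.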
The min and max values make sense in the request, but do they make sense in the response?
We could be very specific in the text description about what they mean in the context of a request and a response, but that approach is open to misinterpretation.
Also, because this structure doesn't distinguish between the max value on the head-node set by the client and the max value on the compute-node set by the service, we can't use a response from one service as a request to another service without a complicated edit.
If we send this response as-is to another service that has 100GiB of memory available, the max: 51GiB on the compute-node will prevent the new service from allocating more memory to it.
Separating out the offered memory size as a separate value in the response solves this.

    resources:
      compute:
        - name: head-node
          memory:
            min: 4GiB
            max: 4GiB # limit set by the client
            size: 4GiB
        - name: compute-node
          memory:
            min: 16GiB
            size: 51GiB

That makes a clearer separation between what the client is requesting and what the service is offering, and we can filter the response from one service to remove the size value and send it as a request to another service, which can then respond with an offer of 100GiB for the compute-node.
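The filtering step can be sketched as below, assuming the response has been loaded into nested dicts and lists with the same keys as the YAML above. The function name `response_to_request` is hypothetical.

```python
import copy


def response_to_request(response: dict) -> dict:
    """Copy a response and drop the service-supplied `size` values."""
    request = copy.deepcopy(response)  # leave the original response untouched
    for node in request.get("resources", {}).get("compute", []):
        node.get("memory", {}).pop("size", None)
    return request


response = {
    "resources": {
        "compute": [
            {"name": "head-node", "memory": {"min": "4GiB", "max": "4GiB", "size": "4GiB"}},
            {"name": "compute-node", "memory": {"min": "16GiB", "size": "51GiB"}},
        ]
    }
}
request = response_to_request(response)
```

The client-set min and max survive the filter, so the second service still sees the 4GiB limit on the head-node but no upper limit on the compute-node.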
However, this structure is very specific to this case and loses the option to add more fields to the request or offer sections.
If we keep the explicit requested and offered sections, then we leave space to add more fields later on.
For example, if some computing algorithms were particularly sensitive to memory speed, then in the future we could add a ddr field to the request and response sections.
In that case, a response could make it clear that the client is specifying DDR4-4000 memory in the comp-node and the server is responding with an offer of DDR4-4400 for both the head-node and comp-node.
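A structure matching that description might look like the following, written as a Python dict. The exact layout (requested and offered sections nested under memory, and the size values shown) is an assumption for illustration; only the ddr values come from the text. The helper `strip_offers` is hypothetical and shows that turning such a response back into a request does not need to know which fields each section contains.

```python
import copy

# Hypothetical response with separate requested and offered sections.
response = {
    "resources": {
        "compute": [
            {
                "name": "head-node",
                "memory": {
                    "requested": {"size": "4GiB"},
                    "offered": {"size": "4GiB", "ddr": "DDR4-4400"},
                },
            },
            {
                "name": "comp-node",
                "memory": {
                    "requested": {"size": "16GiB", "ddr": "DDR4-4000"},
                    "offered": {"size": "51GiB", "ddr": "DDR4-4400"},
                },
            },
        ]
    }
}


def strip_offers(response: dict) -> dict:
    """Copy a response and remove every `offered` section."""
    request = copy.deepcopy(response)
    for node in request["resources"]["compute"]:
        node["memory"].pop("offered", None)
    return request


request = strip_offers(response)
```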
This case is way out of scope for the current iterations, but it shows how keeping the request and offer sections separate provides a level of future-proofing to the structure.
A more appropriate example would be read/write bandwidth and latency for disk-based storage. The same requested and offered pattern is likely to be very useful for describing the client's requirements and the service's offers for storage resources.
We aren't quite there yet, but we can evolve towards that gradually. In particular, the requirement for the service to remember and provide the requested fields in its response is only a SHOULD requirement at this stage.
Remembering what the client asked for requires extra code that our current prototype doesn't have. Including the extra code to support this is on the 'nice to have' list.