Today Bigslice does not implement explicit memory management of task outputs and state. Instead it serializes task output to disk and relies on the operating system's page cache to keep the output in memory. Memory associated with intermediate state is managed by a set of heuristics (e.g., canary sizes, spill thresholds, etc.).
We propose to introduce explicit memory management into Bigslice. This will accomplish two objectives: (1) it will help us keep un-serialized task output in memory, so that follow-on tasks can be run without an extra decode; (2) operations like cogroup and reduce can be made more efficient as they can use more memory, relying on actual runtime memory footprints instead of heuristics.
How can we reclaim memory?
Since Bigslice calls into user code, it cannot control memory allocation in a fine-grained way. (Contrast with, e.g., a query processing engine with limited data types and operations.)
Instead, we can treat the Bigslice program as a black box that allocates and frees memory. We can monitor memory usage and apply a simple control regime to reclaim memory and avoid OOMing.
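As a concrete sketch of the monitoring side, the worker could periodically sample the Go runtime's heap statistics; the function name here is illustrative, and the real implementation might sample memory differently (e.g., from the OS rather than the Go runtime):

```go
package main

import (
	"fmt"
	"runtime"
)

// memoryInUse reports the bytes of heap memory currently in use, as
// seen by the Go runtime. A controller would sample this periodically
// and compare it against its watermarks. This name is illustrative,
// not part of any actual Bigslice API.
func memoryInUse() uint64 {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.HeapInuse
}

func main() {
	fmt.Println(memoryInUse() > 0)
}
```

Note that `runtime.ReadMemStats` stops the world briefly, so sampling frequency is itself a tuning knob.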
We have at least three mechanisms for reducing memory footprint in such a black box: spilling outputs to disk, pausing task processing, and aborting task processing.
Abstractly, a controller could define a set of watermarks and apply reclamation actions with increasing aggressiveness until memory utilization is driven below a low watermark. One such scheme defines two watermarks: a low watermark and a high watermark. If the process exceeds the high watermark, it pauses task processing and begins spilling outputs to disk. Outputs are spilled in order of their recency. Once memory utilization decreases to below the low watermark, spilling stops. If the low watermark is not reached after all available outputs have been spilled, the worker can begin to abort tasks until memory utilization dips below the low watermark. If the low watermark still has not been reached, then either (1) there is a space leak in Bigslice itself, or (2) user code has allocated memory associated with global references that Bigslice cannot clear. Both conditions can be treated as OOM conditions.
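The two-watermark scheme above can be sketched as a small decision function. All names here are hypothetical; in particular, the `reclaiming` flag models the hysteresis between the high watermark (which triggers reclamation) and the low watermark (which stops it):

```go
package main

import "fmt"

// Action is a reclamation step the controller may take.
type Action int

const (
	None  Action = iota
	Spill        // pause task processing and spill an output to disk
	Abort        // abort a running task
	OOM          // nothing left to reclaim: treat as an OOM condition
)

// nextAction implements the two-watermark scheme: reclamation begins
// when usage crosses the high watermark and continues, with increasing
// aggressiveness, until usage drops below the low watermark. It
// returns the next action and whether reclamation is still active.
func nextAction(used, low, high uint64, reclaiming, canSpill, canAbort bool) (Action, bool) {
	if !reclaiming && used < high {
		return None, false // below the high watermark: nothing to do
	}
	if used < low {
		return None, false // target reached: stop reclaiming
	}
	switch {
	case canSpill:
		return Spill, true
	case canAbort:
		return Abort, true
	default:
		// Outputs spilled and tasks aborted, yet still above the low
		// watermark: a space leak or unclearable user allocations.
		return OOM, true
	}
}

func main() {
	a, r := nextAction(120, 40, 100, false, true, true)
	fmt.Println(a == Spill, r)
}
```

The controller would call `nextAction` in a loop, carrying the returned `reclaiming` flag forward between samples.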
A second controller could adjust the job's load factor based on the rate of aborted tasks, so that future tasks are less likely to be aborted.
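One simple policy for such a controller is AIMD-style feedback: back off multiplicatively while tasks are being aborted, and recover additively while they are not. The function and its constants are illustrative only:

```go
package main

import "fmt"

// adjustLoadFactor returns a new load factor given the current one and
// the observed rate of aborted tasks. The 0.8 and 0.05 constants are
// placeholders; real values would be tuned empirically.
func adjustLoadFactor(current, abortRate float64) float64 {
	if abortRate > 0 {
		return current * 0.8 // tasks are being aborted: back off
	}
	f := current + 0.05 // no aborts: recover slowly
	if f > 1 {
		f = 1
	}
	return f
}

func main() {
	fmt.Println(adjustLoadFactor(1.0, 0.1))
}
```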
APIs for reclamation
Bigslice could support reclamation through an API that lets various components register reclaimable objects in a central registry.
type Reclaimable interface {
// Reclaim reclaims the resource represented by
// this object.
Reclaim() error
// This type could include methods to indicate priority,
// cost, etc.
}
// Register registers a reclaimable object.
func (*Reclaimer) Register(r Reclaimable)
// Reclaim reclaims the next object in the reclamation
// queue. Returns true when a reclamation was performed,
// or an error if the reclamation failed.
func (*Reclaimer) Reclaim() (ok bool, err error)
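To illustrate how the API might be used, here is a minimal sketch of a `Reclaimer` with a FIFO queue, together with a hypothetical `spillableOutput` that drops its in-memory buffer when reclaimed. The real registry would presumably order objects by recency, priority, or cost rather than FIFO, and spilling would write the buffer to disk before freeing it:

```go
package main

import "fmt"

// Reclaimable is the interface from the proposal above.
type Reclaimable interface {
	Reclaim() error
}

// Reclaimer is a minimal FIFO sketch of the proposed central registry.
type Reclaimer struct {
	queue []Reclaimable
}

// Register registers a reclaimable object.
func (rc *Reclaimer) Register(r Reclaimable) {
	rc.queue = append(rc.queue, r)
}

// Reclaim reclaims the next object in the queue, reporting whether a
// reclamation was performed and any error from the reclamation itself.
func (rc *Reclaimer) Reclaim() (ok bool, err error) {
	if len(rc.queue) == 0 {
		return false, nil
	}
	r := rc.queue[0]
	rc.queue = rc.queue[1:]
	if err := r.Reclaim(); err != nil {
		return false, err
	}
	return true, nil
}

// spillableOutput is a hypothetical task output whose in-memory
// buffer can be released on reclamation.
type spillableOutput struct {
	buf []byte
}

func (s *spillableOutput) Reclaim() error {
	// In Bigslice this would first spill the buffer to disk.
	s.buf = nil
	return nil
}

func main() {
	var rc Reclaimer
	rc.Register(&spillableOutput{buf: make([]byte, 1<<20)})
	ok, err := rc.Reclaim()
	fmt.Println(ok, err)
}
```

The memory controller would then call `Reclaimer.Reclaim` in a loop until utilization falls below the low watermark or `Reclaim` reports that nothing is left, at which point it escalates to aborting tasks.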
Prior work
An earlier change implemented memory reclamation for combiners.