Design: Fast private variables

At the moment, private (statically scoped) variables are broken and not faster than dynamic variables (see https://github.com/gerdstolpmann/omake/issues/43 for a description of the problems). We sketch here a new implementation fixing the problems.

Currently, the private variables are a map of bindings of values to symbols, and this map is passed on with the execution context. Functions remember the map of the definition time, and reset the current map of private variables every time the function is entered. This is semantically wrong, and because we have to update the (functional) map for every assignment it is also slow.

Mutable stacked frames

The suggestion is to switch to a stack of mutable frames, where every frame is an array of fixed size. The array maps the array position to the current value. Essentially, this is the "usual" compiler strategy.

We define

type frame =
  { id : scope_id;
    dparent : frame;
    sparent : frame;
    content : value array
  }

A frame is allocated when a function is entered. The dparent is the dynamic parent, i.e. the frame of the caller of the function. The sparent is the static parent, i.e. the current frame when the function was defined.

All frames for the same scope share the same scope_id. In particular, when a new function is defined, the scope for that function gets a fresh scope_id.

Note that we avoid option types for dparent and sparent. The root frame is

let rec root_frame =
  { id = ...;
    dparent = root_frame;
    sparent = root_frame;
    content = [| |];
  }

Environment: also includes a dictionary

The static environment consists of

the current frame
a dictionary of all visible private variables

type env =
  { frame : frame;
    dict : (string -> (scope_id * int)) mapping;
  }

Every name is mapped to a pair (id,k) with the ID of the scope, and the position k of the variable in the frame for the scope.

How private variables are referenced

A reference to a private variable is a pair (p,k) where p says how many static levels to ascend (= follow sparent) and k is the index in content.

We change var_info as follows:

type var_info =
   VarPrivate        of Lm_location.t * var * scope_id * (* p: *) int * (* k: *) int
 | VarThis           of Lm_location.t * var
 | VarVirtual        of Lm_location.t * var
 | VarGlobal         of Lm_location.t * var

In Omake_ast_ir there is already a static analysis of the scopes. The VarPrivate generated there calculates the right p and k parameters.

Caveat: Note that var_info is then only valid in a certain scope. When a private variable is copied to another scope the parameters need to be updated. This is the main reason why we save the scope_id with the private variables. This ID allows us to check at runtime whether the frame reachable by ascending the static parent p times has the expected ID.

How private variables are updated

Generally, an update of a variable that was defined in the same scope is just accomplished by mutating the content array of the frame. We only allow updates of private variables in the same function, but not updates of variables in parent scopes. The only exception of that are anonymous functions created with the => notation which simply count as subscope (see below) but not as a new functional scope.

Omake does not only create new scopes for functions, but also for sections within functions. These subscopes need to be managed.

All subscopes of a function share the same frame, but the variables with the same names use different array indices. Example:

f() =
    private.x = foo               # index 0
    section
        x = bar                   # index 1

The export directive (which is still necessary) copies the variables within the same frame:

f() =
    private.x = foo               # index 0
    section
        x = bar                   # index 1
        export x                  # copy index 1 to index 0

Discussion

The following works correctly with the new concept:

private.x = foo
f() =
    println($(x))
f()                   # prints: foo
x = bar
f()                   # prints: bar

In the old implementation, the second call of f resets the static environment to the state when the function f was defined. Hence, both invocations of f print foo. In the new implementation, the frame is directly mutated, and f sees the update.

Loops of the following kind were initially the reason to investigate the problems with private variables (see https://github.com/gerdstolpmann/omake/issues/40):

f(vals) =
    private.l[] = 
    foreach(n => ..., $(vals))
        l += $(n)
        export l
    value $(l)

The body of foreach is now considered as a body taking parameters, but no longer as an anonymous function. In the new implementation of private variables this body is considered as a subscope of the function f. The instance of l in the body gets a separate index in the frame. The export copies this frame value to the index of l in the main function.

Note that foreach is a primitive, but nevertheless a function. The question is what the target of the export is. For dynamic variables, it is always the direct caller, i.e. l as dynamic variable would be exported to foreach which would have to reexport l to its own caller, f. For private variables, the target of the export is always the enclosing (sub) scope, in the static sense. The difference becomes clearer when we define a wrapper for foreach:

wrap(body, list) =
    value $(foreach $(body), $(list))

f(vals) =
    private.l[] = 
    wrap(n => ..., $(vals))
        l += $(n)
        export l
    value $(l)

With private variables, this continues to work. With dynamic variables, we need to add further code to pass the updates on:

wrap(body, list) =
    export                       # export all dynamic updates we receive from `foreach`
    value $(foreach $(body), $(list))

f(vals) =
    public.l[] =                 # l has now dynamic scope
    wrap(n => ..., $(vals))
        l += $(n)
        export l
    value $(l)

The => notation no longer creates a function, but just a parameterized body. The main difference is that it shares the frame with the enclosing function, and does not create a subframe when it is called. This is visible to the programmer, as this example demonstrates:

tricky(body, k) =
    body($(body), $(k))
f() =
    tricky(self,k => ..., 0)
        declare private.j
        if $(eq $k, 0)
            j = foo
            self($(self), 1)
        else
            value $j

The tricky helper calls the body with itself as first argument. This allows it to do a recursive call (when k=0). We also set j=foo to have something to remember from the first invocation of the body. In the self call, the value j is still the same (foo) because the frame is the same, and the result is foo. If the body was executed as a real function, a new frame would be allocated, and j would be again undefined.

Note that there is still the fun keyword to define real anonymous functions, but for these the same restrictions apply as for named functions, in particular we don't allow exports of private variables at the end of the function to the enclosing scope.

Is there anything bad with mutable frames?

Omake tries to avoid mutable state as much as possible for two reasons:

A subscope can be created by just sharing the current definition of variables with the parent.
Windows only: Sub processes are emulated by threads, and immutable content makes it easy to keep the state of the threads separated.

Are mutable frames incompatible with these techniques?

Subscopes: The dynamic scope is unaffected by our change. For the static scope we define explicit ways how to deal with the problem of creating subscopes.
Threads-emulate-processes: When a new thread is created, we need to copy the whole stack. This is quite expensive, but this operation is rare, and it is unlikely to reduce the measurable performance.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly