mercury/compiler/notes/ALLOCATION

This document describes the allocation system we are moving towards;
it is not implemented yet.

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------

FORWARD LIVENESS

Each goal has four sets of variables associated with it to give information
about changes in liveness on forward execution. (Backward execution is a
different matter; see a later part of this document.) These four sets are

- the pre-birth set
- the pre-death set
- the post-birth set
- the post-death set

The goal that contains the first occurrence of a variable on a particular
computation path will have that variable in its pre-birth set;
the goal that contains the last occurrence of a variable on a particular
computation path will have that variable in its post-death set.
(The different arms of a disjunction or a switch are different computation
paths. The condition and then parts of an if-then-else on the one hand
and the else part of that if-then-else on the other hand are also different
computation paths.)

Mode correctness requires that all branches of a branched control structure
define the same set of variables; the exceptions are branches that cannot
succeed, as indicated by the instmap at the end of the branch being
unreachable. Such branches are considered by mode analysis to "produce"
any variable they are required to produce by parallel branches.
To make it easier to write code that tracks the liveness of variables,
we implement this fiction by filling the post-birth sets of goals representing
such non-succeed branches with the set of variables that must "magically"
become live at the unreachable point at the end of the branch in order to
match the set of live variables at the ends of the other branches.
(Variables that have become live in the ordinary way before the unreachable
point will not be included.) The post-birth sets of all other goals will be
empty.

This guarantees that the set of variables born in each branch of a branched
control structure will be the same, modulo variables local to each branch.
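
The filling of post-birth sets can be sketched as follows. This is a Python
illustration, not compiler code: the name fill_post_births and the
(born_vars, reachable) representation of a branch are hypothetical, and
born_vars is assumed to hold only the nonlocal variables that the branch
makes live in the ordinary way.

```python
def fill_post_births(branches):
    """Return the post-birth set for each branch of a branched structure.

    branches is a list of (born_vars, reachable) pairs, one per branch
    (a hypothetical representation chosen for this sketch).
    """
    reachable_born = [born for born, reachable in branches if reachable]
    # Mode correctness: all branches that can succeed define the same set.
    required = reachable_born[0] if reachable_born else set()
    assert all(born == required for born in reachable_born)
    # Only unreachable branches get a nonempty post-birth set, and it holds
    # just the variables that did not become live in the ordinary way.
    return [set() if reachable else required - born
            for born, reachable in branches]
```

For example, if two reachable branches bind X and Y while an unreachable one
binds only X, the unreachable branch gets the post-birth set {Y}.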

We can optimize the treatment of variables that are live inside a branched
control structure but not after, because it is possible for the variable
to occur in one branch without occurring in the other branches.
Each variable that is live before the branched structure but not after
must die in the branched structure. Branches in which the variable occurs
will include the variable in the post-death set of one of their subgoals.
As far as branches in which the variable does not occur are concerned, the
variable becomes dead to forward execution as soon as control enters the
branch. In such circumstances, we therefore include the variable in the
pre-death set of the goal representing the branch. (See below for the method
we use for making sure that the values of such "dead" variables are still
available to later branches into which we may backtrack and which may need
them.)

This guarantees that the set of variables that die in each branch of a branched
control structure will be the same, modulo variables local to each branch.
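
A minimal sketch of the pre-death computation for branches, written in
Python for illustration only. The name branch_pre_deaths and the idea of
passing in the set of variables occurring in each branch are assumptions of
the sketch, not a description of the compiler's data structures.

```python
def branch_pre_deaths(live_before, live_after, branch_occurrences):
    """Return the pre-death set for the goal representing each branch.

    branch_occurrences holds, for each branch, the set of variables that
    occur in that branch (hypothetical input for this sketch).
    """
    # Variables live before the structure but not after must die inside it.
    must_die = live_before - live_after
    # A branch in which such a variable does not occur kills it on entry.
    return [must_die - occurs for occurs in branch_occurrences]
```

Branches in which the variable does occur are not handled here; they record
the death in the post-death set of one of their subgoals.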

It is an invariant that in each goal_info, a variable will be included
in zero, one or two of these four sets; and that if it is included in
two sets, then these must be the pre-birth and post-death sets. (This
latter will occur for singleton variables.)
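
To make the role of the four sets concrete, here is a small Python sketch
(not compiler code). The class GoalInfo and the functions check_invariant,
live_during_goal and live_after_goal are hypothetical names. The invariant
above makes the two pre sets disjoint and the two post sets disjoint, so the
order in which each pair is applied does not affect the result.

```python
from dataclasses import dataclass, field


@dataclass
class GoalInfo:
    # The four forward liveness sets; the field names are hypothetical.
    pre_births: set = field(default_factory=set)
    pre_deaths: set = field(default_factory=set)
    post_births: set = field(default_factory=set)
    post_deaths: set = field(default_factory=set)


def check_invariant(info):
    """A variable may be in two sets only if they are pre-birth/post-death."""
    sets = {
        "pre_births": info.pre_births,
        "pre_deaths": info.pre_deaths,
        "post_births": info.post_births,
        "post_deaths": info.post_deaths,
    }
    for var in set().union(*sets.values()):
        members = {name for name, s in sets.items() if var in s}
        if len(members) > 1 and members != {"pre_births", "post_deaths"}:
            return False
    return True


def live_during_goal(live_before, info):
    """Variables live while the goal itself executes."""
    return (live_before | info.pre_births) - info.pre_deaths


def live_after_goal(live_before, info):
    """Variables live on forward execution just after the goal."""
    live = live_during_goal(live_before, info)
    return (live - info.post_deaths) | info.post_births
```

A singleton variable appears in both pre_births and post_deaths of the same
goal, so it is live during the goal but not after it.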

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------

STORE MAPS

There are four kinds of situations in which the code generator must
associate specific locations with every live variable, either to put
those variables in those locations or to update its data structures
to say that those variables are "magically" in those locations.

1. At the ends of branched control structures, i.e. if-then-elses, switches
   and disjunctions. All branches of a branched structure must agree exactly
   on these locations.

2. At the start and end of the procedure.

3. At points at which execution may resume after a failure, i.e. at the
   start of the else parts of if-then-elses, at the start of the second and
   later disjuncts in disjunctions, and after negated goals.

4. Just before and just after calls and higher-order calls (but not
   pragma_c_codes).

-----------------------------------------------------------------------------

Ends of branched control structures

We handle these by adding a store_map field to if_then_else, switch and disj
goal expressions (this field used to be called the follow_vars field). This
field is ignored through most of the compilation process; it is meaningful
only from the follow_vars pass onwards.

The follow_vars pass fills these fields with advisory information, saying
where things that will be used in code following the branched structure
should be. This advisory information may include duplicates (two variables
mapped to the same location), it may miss some variables that are live at
the end of the branched structure, and it may include variables that are
not live at that point.

The store_map pass uses the advisory information left by the follow_vars pass
to fill in these fields with definitive information. The definitive store maps
cover exactly the set of variables live at the end of the branched structure,
and they guarantee that no two variables are allocated the same location.
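
The step from advisory to definitive information might look like the Python
sketch below. This document does not say how the store_map pass resolves
conflicts, so the policy used here (honour the first advisory claim on a
location, then hand out unused registers) is purely an assumption, as are
the name definitive_store_map and the "r(N)" spelling of locations.

```python
def definitive_store_map(advisory, live_vars):
    """Build a duplicate-free store map covering exactly live_vars.

    advisory maps variables to preferred locations and may contain
    duplicates, omissions, and variables that are not live.
    """
    store_map = {}
    used = set()
    # Honour advisory entries for live variables, first come first served.
    for var in sorted(live_vars):
        loc = advisory.get(var)
        if loc is not None and loc not in used:
            store_map[var] = loc
            used.add(loc)
    # Give every remaining live variable a register nobody else has.
    next_reg = 1
    for var in sorted(live_vars):
        if var in store_map:
            continue
        while f"r({next_reg})" in used:
            next_reg += 1
        store_map[var] = f"r({next_reg})"
        used.add(f"r({next_reg})")
    return store_map
```

Whatever policy is used, the two properties to preserve are the ones stated
above: the keys are exactly the live variables, and no location is shared.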

The passes of the backend following store_map must not do anything to
invalidate this invariant, which means that they must not rearrange the code
or touch the field. The code generator will use these fields to know what
variables to put where when flushing the expression cache at the ends of
each branch in a branched structure.

-----------------------------------------------------------------------------

Starts and ends of procedures

We handle these using the mechanisms we use for the ends of branched
structures, except that the maps of where things are at the start and where
they should be at the end are computed by the code generator from the
arg_info list.

-----------------------------------------------------------------------------

Resumption points

We handle these through the cont_lives field in goal infos. During the
store_map pass, we fill in this field for every goal that establishes
a point at which execution may resume after backtracking. This means
the conditions of if-then-elses (the resumption point is the start of
the else part), every disjunct in a disjunction except the last (the
resumption point is the start of the next disjunct), and goals inside
negations (the resumption point is the start of the code following the
negated goal). The value of this field will be the set of variables
whose values may be needed when execution resumes at that point.
Note that for the purposes of handling resumption points, it does not
matter whether any part of an if-then-else, disjunction or negation
can succeed more than once.

The cont_lives field does not assign a location to these variables.
The reason is that as an optimization, each conceptual resumption point
is associated with either one or two labels, and if there are two labels,
these will differ in where they expect these variables to be. The
failure continuation stack entry created by the code generator
that describes the resumption point will associate a store map with
each label, with each store map assigning a location to each variable
included in the cont_lives set.

The usual case has two labels. The store map of the first label maps each
variable to its stack slot, while the store map of the second label maps
each variable to the location it was occupying on entry to the goal.
The code emitted at the resumption point will have, in order, the first
label, code that moves each variable from its location according to the
first store map to its location according to the second store map
(this will be a null operation if the two maps agree on the location
of a variable), and then the second label. The idea is that any failure
that occurs while all these variables are guaranteed to still be in their
original locations can be implemented as a jump directly to the second
label, while failures at other points (including those from code to the
right of the disjunct itself) will jump (directly or indirectly via a
redo() or fail()) to the first label. The section on backward liveness
below discusses how we make sure
that at these points all the variables in the cont-lives set are actually
in their stack slots.
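
The layout of a two-label resumption point can be illustrated with the
following Python sketch. It produces pseudo-instructions as strings; the
function name resume_point_code, the label names and the instruction syntax
are all hypothetical, and the sketch ignores any ordering problems among the
moves.

```python
def resume_point_code(cont_lives, stack_slots, orig_locs, label1, label2):
    """Emit pseudo-code for a resumption point with two labels.

    stack_slots is the store map of the first label, orig_locs the store
    map of the second label; both must cover every variable in cont_lives.
    """
    code = [f"{label1}:"]
    for var in sorted(cont_lives):
        src, dst = stack_slots[var], orig_locs[var]
        if src != dst:
            # Null operation when the two store maps agree on the location.
            code.append(f"  {dst} := {src}")
    code.append(f"{label2}:")
    return code
```

A failure that finds every variable still in its original location can jump
straight to label2 and skip the moves; all other failures arrive at label1.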

We can omit the first label and the code following it up to but not including
the second label if we can guarantee that the first label will never be
jumped to, directly or indirectly. We can give this guarantee for negated
goals, conditions in if-then-elses and disjuncts in disjunctions that cannot
succeed more than once if the goal concerned contains only inline builtins.
We cannot give this guarantee for disjuncts in disjunctions that can succeed
more than once even if the goal concerned contains only inline builtins,
since in that case we may backtrack to the next disjunct after leaving
the current disjunct.

It is an invariant that in a disjunction, the cont-lives field of one
disjunct must contain all the variables included in the cont-lives fields
of later disjuncts.

When one control structure that establishes a resumption point occurs inside
another one, all the variables included in the relevant cont-lives of the
outer construct must appear in *all* the cont-lives fields associated
with the inner construct. This is necessary to make sure that in establishing
the inner resumption point, we do not destroy the values of the variables
needed to restart forward execution at the resumption point established
by the outer construct. (See the section on backward liveness below.)
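
The two invariants just stated are simple subset conditions. A Python sketch
of checks for them follows; the function names are hypothetical, and
cont-lives sets are modelled as plain Python sets.

```python
def disj_invariant_holds(disjunct_cont_lives):
    """Each disjunct's cont-lives must contain those of later disjuncts.

    disjunct_cont_lives lists the cont-lives sets in disjunct order.
    Checking adjacent pairs suffices because the subset relation is
    transitive.
    """
    pairs = zip(disjunct_cont_lives, disjunct_cont_lives[1:])
    return all(later <= earlier for earlier, later in pairs)


def nesting_invariant_holds(outer_cont_lives, inner_cont_lives_list):
    """Every inner cont-lives set must include the relevant outer one."""
    return all(outer_cont_lives <= inner for inner in inner_cont_lives_list)
```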

When one control structure which establishes a resumption point occurs after
but not inside another one, there is no such requirement; see the section
on backward liveness below.

-----------------------------------------------------------------------------

Calls and higher order calls

We handle these by flushing all variables that are live after the call
except those produced by the call. The set to be flushed is thus the set of
variables that are live immediately after the call, minus the pre-birth
and post-birth sets of the call, which in turn is equivalent to the set
of variables live before the call minus the pre-death and post-death
sets of the call.
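
The two formulations of the flush set can be written down directly. The
Python sketch below uses hypothetical function names and plain sets; it is
meant only to show that the two computations agree.

```python
def flush_set_from_after(live_after, pre_births, post_births):
    """Variables live after the call, minus those the call makes live."""
    return live_after - pre_births - post_births


def flush_set_from_before(live_before, pre_deaths, post_deaths):
    """Variables live before the call, minus those that die at the call."""
    return live_before - pre_deaths - post_deaths
```

For a call p(A, B, X) in which A has its last occurrence and X its first,
with A, B and C live before the call, both functions yield {B, C}.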

The stack allocation code and the code generator figure out the set of
variables that need to be flushed at each call independently, but based
on the same algorithm. Not attaching the set of variables to be saved
to each call reduces the space requirement of the compiler.

The same applies to higher order calls.

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------

BACKWARD LIVENESS

There are three kinds of goals that can introduce nondeterminism: nondet
disjunctions, nondet calls and nondet higher order calls. All code that
executes after one of these constructs must take care not to destroy the
variables that are needed to resume in those constructs. (We are *not*
talking here about preserving variables needed for later disjuncts.)

The variables needed to resume after nondet calls and higher order calls
are the variables saved across the call in the normal fashion. The variables
needed to resume after nondet disjunctions are the variables included in
any of the cont-lives sets associated with the disjuncts of the disjunction.

We achieve this objective in two parts. First, the code generator
makes sure that each of these variables is flushed to its stack slot before
control leaves the construct that introduces nondeterminism. For calls and
higher order calls this is done as part of the call mechanism. For nondet
disjunctions, by the end of every disjunct, every variable in the cont-lives
for that disjunct must be in its stack slot. If a variable is forward live
at the end of the disjunct, the code generator will emit code to move it
to its stack slot (if it isn't there already). If the variable is not forward
live at the end of the disjunct, then it must have died sometime in the
disjunct. We arrange for the variable to be flushed to its stack slot
at the point(s) of its death (the disjunct may have inside it a branched
control structure; if the variable dies inside that structure, it will die
inside each branch). The way we do this is that when entering a disjunct
of a nondet disjunction (except the last), we push the cont-lives onto
a stack maintained in code_info. When we make a variable dead, we consult
the top entry on this stack. If the variable being made dead is in that set,
we emit code to flush it to its stack slot. For this scheme to work, it must
be an invariant that each entry on this stack includes all the other entries
below it as subsets. Apart from this check, the code generator hangs onto
the values of a variable only if the variable is forward live.
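
The stack of cont-lives sets and the check made when a variable dies can be
sketched in Python as below. The class CodeInfo here is a toy stand-in for
the code_info structure, and its method names and the emitted
pseudo-instruction are hypothetical.

```python
class CodeInfo:
    """Toy model of the part of code_info that tracks cont-lives sets."""

    def __init__(self):
        self.cont_lives_stack = []
        self.code = []

    def enter_disjunct(self, cont_lives):
        # Invariant: each entry includes every entry below it as a subset.
        if self.cont_lives_stack:
            assert self.cont_lives_stack[-1] <= cont_lives
        self.cont_lives_stack.append(frozenset(cont_lives))

    def leave_disjunct(self):
        self.cont_lives_stack.pop()

    def make_var_dead(self, var):
        # Because of the invariant, consulting the top entry is enough.
        if self.cont_lives_stack and var in self.cont_lives_stack[-1]:
            self.code.append(f"flush {var} to its stack slot")
```

A real code generator would skip the flush when the variable is already in
its stack slot; the sketch leaves that check out.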

Second, the stack slot allocation pass makes sure that each of the variables
needed to resume in a construct that introduces nondeterminism is allocated
a stack slot that is not reused in any following code from which one can
backtrack to that construct. Normally, this is all following code, but if
the construct that introduced the nondeterminism is inside a cut (a some
that changes determinism), then it means only the following code inside
the cut.

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------

FOLLOW VARS

When the code generator emits code to materialize the value of a variable,
it ought to put it directly into the location where it is required to be next.

The code generator maintains a field in the code_info structure that records
advisory information about this. The information comes from the follow_vars
pass, which fills in the follow_vars field in the goal info structure of some
goals. Whenever the code generator starts processing a goal, it sets the field
in the code_info structure from the field of the goal info structure of that
goal, if that field is filled in.

The follow_vars pass will fill in this field for the following goals:

- the goal representing the entire procedure definition
- each arm of a switch
- each disjunct of a disjunction
- the condition, then-part and else-part of an if-then-else
- the first goal following any non-builtin goal in a conjunction
  (the builtin goals are non-complicated unifications and calls to
  inline builtin predicates and functions)

The semantics of a filled in follow_vars field:

- If it maps a variable to a real location, that variable should be put
  in that location.

- If it maps a variable to register r(-1), that variable should be put
  in a currently free register.

- If it does not map a variable to anything, that variable should be put
  in its stack slot, if that stack slot is free; otherwise it should be put
  in a currently free register.
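
The three cases above amount to a small decision procedure. The Python
sketch below is illustrative only: the function name preferred_location, its
parameters and the string spelling of locations are hypothetical, and it
assumes that at least one register is free.

```python
ANY_REG = "r(-1)"  # the "any currently free register" marker


def preferred_location(var, follow_vars, stack_slot, slot_is_free, free_regs):
    """Decide where a newly materialized variable should be put.

    follow_vars maps variables to locations; stack_slot is the slot
    allocated to var; free_regs lists the currently free registers.
    """
    wish = follow_vars.get(var)
    if wish is None:
        # Not mentioned: use the stack slot if free, else a free register.
        return stack_slot if slot_is_free else free_regs[0]
    if wish == ANY_REG:
        return free_regs[0]
    # A real location: put the variable exactly there.
    return wish
```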

The follow_vars field should map a variable to a real location if the
following code will require that variable to be in exactly that location.
For example, if the variable is an input argument of a call, it will
need to be in the register holding that argument; if the variable is not
an input argument but will need to be saved across the call, it will need
to be in its stack slot.

The follow_vars field should map a variable to register r(-1) if the
variable is an input to a builtin that does not require its inputs to
be anywhere in particular. In that case, we would prefer that the
variable be in a register, since this should make the code generated
for the builtin somewhat faster.

When the code generator materializes a variable in a way that requires
several accesses to the materialized location (e.g. filling in the fields
of a structure), it should put the variable into a register even if
the follow_vars field says otherwise.

Since there may be many variables that should be in their stack slots,
and we don't want to represent all of these explicitly, the follow_vars
field may omit any mention of these variables. This also makes it easier
to merge follow_vars fields at the starts of branched control structures.
If some branches want a variable in a register, their wishes should take
precedence over the wishes of the branches that wish the variable to be
in its stack slot or in which the variable does not occur at all.
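
Merging the follow_vars fields of several branches under this rule can be
sketched in Python as follows. The names is_reg and merge_follow_vars are
hypothetical, locations are modelled as strings such as "r(2)" and
"stackvar(1)", and letting the first branch win when two branches want
different registers is an assumption of the sketch.

```python
def is_reg(loc):
    """True for register locations, including the r(-1) marker."""
    return loc.startswith("r(")


def merge_follow_vars(branch_maps):
    """Merge per-branch follow_vars maps; register wishes take precedence."""
    merged = {}
    for fv in branch_maps:
        for var, loc in fv.items():
            old = merged.get(var)
            # A register wish overrides a stack slot wish or no wish at all.
            if old is None or (is_reg(loc) and not is_reg(old)):
                merged[var] = loc
    return merged
```

A branch that omits a variable, or in which it does not occur, contributes
no entry and so never overrides another branch's wish.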

When the code generator picks a random free register, it should try to avoid
registers that are needed for variables in the follow_vars map.

When a variable that is currently in its stack slot is supposed to be put
in any currently free register for speed of future access, the code generator
should refuse to use any virtual machine registers that are not real machine
registers. Instead, it should keep the variable in its stack slot.
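
The last two rules can be combined in one register-picking routine, sketched
here in Python. Everything about the interface is an assumption of the
sketch: registers are modelled as positive integers, registers 1 to
num_real_regs are taken to be the real machine registers, and a result of
None means "leave the variable in its stack slot".

```python
def pick_free_register(free_regs, follow_vars, num_real_regs, in_stack_slot):
    """Pick a free register, or return None if none is suitable.

    follow_vars maps variables to locations; integer values greater than
    zero are registers wanted by other variables, and -1 means "any".
    """
    reserved = {loc for loc in follow_vars.values()
                if isinstance(loc, int) and loc > 0}
    candidates = sorted(free_regs)
    if in_stack_slot:
        # Moving out of the stack pays off only for real machine registers.
        candidates = [r for r in candidates if r <= num_real_regs]
    if not candidates:
        return None
    # Prefer registers no follow_vars entry needs; otherwise take any.
    unreserved = [r for r in candidates if r not in reserved]
    return (unreserved or candidates)[0]
```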

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------