mirror of
https://github.com/Mercury-Language/mercury.git
synced 2025-12-14 05:12:33 +00:00
810 lines
25 KiB
HTML
<html>
|
|
<head>
|
|
<title>
|
|
Implementing tail recursion in the MLDS code generator
|
|
</title>
|
|
</head>
|
|
|
|
<body
|
|
bgcolor="#ffffff"
|
|
text="#000000"
|
|
>
|
|
|
|
<h1>Implementing tail recursion in the MLDS code generator</h1>
|
|
|
|
<h2>Tail recursion optimization versus last call optimization</h2>
|
|
|
|
<p>
|
|
Most implementations of declarative languages implement
|
|
<em>last call optimization</em> (LCO)
|
|
to allow recursive algorithms to handle arbitrary amounts of data
|
|
using constant stack space.
|
|
With LCO, when the last thing that a procedure does is
|
|
call a callee whose vector of return values is the same as
|
|
the vector of return values of the caller,
|
|
it deallocates the stack frame of the caller before the call,
|
|
allowing its space to be used to store the stack frame of the callee.
|
|
|
|
<p>
|
|
In its general form, LCO does not require any knowledge of the callee;
|
|
it works even when the identity of the callee is unknown at compile time,
|
|
as when the last call is a higher order call or a method call,
|
|
or its code is unavailable,
|
|
as when it is defined in a different compilation unit.
|
|
However, implementing this general form of LCO
|
|
requires the implementation to have
|
|
direct control over the use of the stack,
|
|
and the ability to generate jumps (not calls) to arbitrary locations.
|
|
In the Mercury compiler, the LLDS code generator can do both these things,
|
|
but the MLDS code generator can do neither.
|
|
This is why it can implement only
|
|
<em>tail recursion optimization</em> (TRO).
|
|
This differs from LCO in two ways.
|
|
<ul>
|
|
<li>
|
|
We don't deallocate the stack frame of the caller before the tail call;
|
|
instead, we reuse that stack frame to become the stack frame of the callee.
|
|
<li>
|
|
We don't use a global (between functions) jump
|
|
to the start of the code of the callee wherever it happens to be in memory;
|
|
instead, we include the code of the callee next to the code of the caller,
|
|
and use local (within a function) branches to transfer control.
|
|
</ul>
|
|
This means that TRO is a less general form of last call optimization,
|
|
because it is applicable only
|
|
when the call is first order, so the callee is statically known,
|
|
and the caller and callee are in the same compilation unit,
|
|
so they can be compiled together into a single target language function.
|
|
|
|
<p>
|
|
Tail recursion optimization is applicable
|
|
to both self-recursion and mutual recursion.
|
|
TRO for self recursion is significantly simpler,
|
|
so we describe that first.
|
|
This is a general principle we use everywhere below:
|
|
we introduce the simplest case first,
|
|
and add the complications (and their solutions) later.
|
|
|
|
<p>
|
|
<h2>Self tail recursion</h2>
|
|
|
|
<p>
|
|
To explain how the Mercury compiler applies TRO to self-recursive calls,
|
|
we will use this example predicate:
|
|
|
|
<p>
|
|
<pre>
|
|
:- pred len(list(int)::in, int::in, int::out) is det.

len(L, Len0, Len) :-
    (
        L = [],
        Len = Len0
    ;
        L = [_ | T],
        Len1 = Len0 + 1,
        len(T, Len1, Len)
    ).
|
|
</pre>
|
|
|
|
<p>
|
|
Here is the C code generated by the Mercury compiler for this predicate
|
|
without TRO:
|
|
|
|
<p>
|
|
<pre>
|
|
void MR_CALL
x__len_3_p_0(
    MR_Word L_4,
    MR_Integer Len0_5,
    MR_Integer * Len_6)
{
    if ((L_4 == ((MR_Word) MR_mkword(MR_mktag(0), MR_mkbody((MR_Integer) 0)))))
        *Len_6 = Len0_5;
    else
    {
        MR_Word T_8;
        MR_Integer Len1_9;
        MR_Integer Var_10;
        MR_Integer Var_7;

        Var_7 = ((MR_Integer) (MR_hl_field(MR_mktag(1), L_4, (MR_Integer) 0)));
        T_8 = ((MR_Word) (MR_hl_field(MR_mktag(1), L_4, (MR_Integer) 1)));
        Var_10 = (MR_Integer) 1;
        Len1_9 = (Len0_5 + Var_10);
        x__len_3_p_0(T_8, Len1_9, Len_6);
    }
}
|
|
</pre>
|
|
|
|
<p>
|
|
The last call is a tail recursive call.
|
|
When this predicate is compiled with TRO,
|
|
we get this C code:
|
|
|
|
<p>
|
|
<pre>
|
|
void MR_CALL
x__len_3_p_0(
    MR_Word L_4,
    MR_Integer Len0_5,
    MR_Integer * Len_6)
{
    while (MR_TRUE)
    {
        if ((L_4 == ((MR_Word) MR_mkword(MR_mktag(0), MR_mkbody((MR_Integer) 0)))))
            *Len_6 = Len0_5;
        else
        {
            MR_Word T_8 = ((MR_Word) (MR_hl_field(MR_mktag(1), L_4, (MR_Integer) 1)));
            MR_Integer Len1_9;
            MR_Integer Var_10 = (MR_Integer) 1;
            MR_Integer Var_7 = ((MR_Integer) (MR_hl_field(MR_mktag(1), L_4, (MR_Integer) 0)));
            MR_Word next_value_of_L_4;
            MR_Integer next_value_of_Len0_5;

            Len1_9 = (Len0_5 + Var_10);
            // direct tailcall eliminated
            next_value_of_L_4 = T_8;
            next_value_of_Len0_5 = Len1_9;
            L_4 = next_value_of_L_4;
            Len0_5 = next_value_of_Len0_5;
            continue;
        }
        break;
    }
}
|
|
</pre>
|
|
|
|
<p>
|
|
This differs from the unoptimized code in two major aspects.
|
|
|
|
<p>
|
|
The first aspect affected by TRO
|
|
is the translation of the self tail call
|
|
(or self tail calls, plural, in the general case).
|
|
TRO replaces the call with
|
|
|
|
<ul>
|
|
<li>
|
|
code that assigns the input arguments of the self-recursive call
|
|
(in this case, T and Len1),
|
|
to the corresponding input arguments in the head (L and Len0), and
|
|
<li>
|
|
code that transfers control back to the start of the procedure,
|
|
i.e. the entry point of the callee.
|
|
</ul>
|
|
|
|
<p>
|
|
There is no code for handling the output arguments,
|
|
since (by the definition of tail calls)
|
|
these must be the same in the caller and the callee.
|
|
On every non-recursive path,
|
|
we return the values of the output arguments
|
|
using the exact same code as we would use without TRO.
|
|
|
|
<p>
|
|
Note that the code that passes the input arguments does so in two stages:
|
|
assignments of the actual parameter values
|
|
to the next_value_of_ forms of the input arguments,
|
|
followed by assignments of these next_value_of_ forms
|
|
to the input arguments themselves.
|
|
This is to handle the case where some variable is both
|
|
an input argument and an actual parameter of the call.
|
|
If we just assigned each actual parameter to the corresponding input directly
|
|
in (say) ascending order of argument number,
|
|
then the translation of a call such as foo(In2, In1, <i>outputs</i>)
|
|
in a predicate whose head looks like foo(In1, In2, <i>outputs</i>)
|
|
would consist of the assignments
|
|
|
|
<p>
|
|
<pre>
|
|
In1 = In2;
|
|
In2 = In1;
|
|
</pre>
|
|
|
|
and the first assignment would clobber the value to be assigned by the second.
|
|
This is the standard problem of swapping two values,
|
|
and its solution requires at least one temporary variable
|
|
(if we don't want to resort to unnecessarily complicated code using xors).
|
|
Our solution works because
|
|
the next_value_of_ forms of the input arguments are never live
|
|
outside the small blocks of code resulting from a single tail recursive call
|
|
(we simply don't generate references to them in any other context),
|
|
and inside each block, each such variable
|
|
is written exactly once and read exactly once (in that order).
|
|
The fact that we use more temporaries
|
|
than may be strictly necessary does not matter,
|
|
because the final decision on
|
|
how the assigned values end up in their target locations
|
|
is not up to the Mercury compiler;
|
|
it is up to the compiler that translates the generated C, C# or Java
|
|
to machine code.
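<p>
The clobbering problem and the two-stage fix can be seen in a small hand-written sketch.
The functions below are hypothetical illustrations in plain C, not Mercury compiler output:
gcd_recursive makes the tail call gcd_recursive(b, a % b),
so the head variable a appears among the actual parameters of the call.
The names gcd_clobbering and gcd_two_stage are made up for this example.

```c
/* Hypothetical example, written by hand; not generated by Mercury. */

/* Recursive form: the tail call passes (b, a % b) for the head (a, b). */
unsigned gcd_recursive(unsigned a, unsigned b)
{
    if (b == 0) {
        return a;
    }
    return gcd_recursive(b, a % b);
}

/*
 * Incorrect loop form: each actual parameter is assigned directly to
 * the corresponding head variable, in ascending argument order.
 */
unsigned gcd_clobbering(unsigned a, unsigned b)
{
    while (1) {
        if (b == 0) {
            return a;
        }
        a = b;      /* the old value of a is lost here */
        b = a % b;  /* reads the new a, which equals b, so this yields 0 */
        continue;
    }
}

/*
 * Two-stage loop form: every actual parameter is first copied to a
 * next_value_of_ temporary, and only then to the head variables.
 */
unsigned gcd_two_stage(unsigned a, unsigned b)
{
    while (1) {
        if (b == 0) {
            return a;
        } else {
            unsigned next_value_of_a;
            unsigned next_value_of_b;

            next_value_of_a = b;
            next_value_of_b = a % b;
            a = next_value_of_a;
            b = next_value_of_b;
            continue;
        }
    }
}
```

<p>
With the inputs 48 and 18, the two-stage version agrees with the recursive one,
while the clobbering version simply returns its second argument.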
|
|
|
|
<p>
|
|
The second aspect affected by TRO is that
|
|
the entire body of the target language (in this case C) code
|
|
we generate for the procedure is wrapped up in a loop.
|
|
|
|
<p>
|
|
The usual way we wrap the procedure body is with a while loop:
|
|
|
|
<p>
|
|
<pre>
|
|
ret_type func_name(args)
{
    while (MR_TRUE)
    {
        // procedure body
        // in which tail calls transfer control using "continue"
    }
}
|
|
</pre>
|
|
|
|
<p>
|
|
However, we can also use gotos:
|
|
|
|
<p>
|
|
<pre>
|
|
ret_type func_name(args)
{
top_of_proc:
    {
        // procedure body
        // in which tail calls transfer control using "goto top_of_proc"
    }
}
|
|
</pre>
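<p>
As a self-contained illustration of the two wrapping styles,
here is a hand-written sketch of the len predicate in plain C.
The struct cell type and the function names len_while and len_goto
are assumptions made for this example;
real Mercury lists are represented as tagged MR_Word values, as shown earlier.

```c
#include <stddef.h>

/* Hypothetical cons cell standing in for Mercury's tagged list words. */
struct cell {
    int head;
    const struct cell *tail;
};

/* Loop style: the tail call becomes "continue"; other paths reach "break". */
void len_while(const struct cell *l, long len0, long *len)
{
    while (1) {
        if (l == NULL) {
            *len = len0;
        } else {
            const struct cell *t = l->tail;
            long len1 = len0 + 1;

            /* t and len1 are not head variables, so direct assignment is safe. */
            l = t;
            len0 = len1;
            continue;
        }
        break;
    }
}

/* Label style: the tail call becomes a goto back to the top of the body. */
void len_goto(const struct cell *l, long len0, long *len)
{
top_of_proc:
    {
        if (l == NULL) {
            *len = len0;
        } else {
            const struct cell *t = l->tail;
            long len1 = len0 + 1;

            l = t;
            len0 = len1;
            goto top_of_proc;
        }
    }
}
```

<p>
Both functions use a single stack frame no matter how long the list is,
and both write the result through the output pointer only on the non-recursive path.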
|
|
<p>
|
|
|
|
<h2>Mutual tail recursion</h2>
|
|
|
|
<p>
|
|
The MLDS code generator partitions the procedures of a module
|
|
into a sequence of SCCs,
|
|
where each SCC (strongly connected component)
|
|
consists of a set of procedures
|
|
that are all reachable from each other via calls, whether tail or non-tail.
|
|
Since TRO applies only to tail calls,
|
|
it also partitions each SCC further into one or more TSCCs (tail SCCs),
|
|
which are strongly connected components of a graph
|
|
whose nodes represent procedures
|
|
and in which there are edges only for <em>tail</em> calls.
|
|
This means that by definition, every procedure in a TSCC
|
|
is reachable from every procedure in that TSCC using only tail calls.
|
|
It then implements tail recursion optimization
|
|
in each TSCC that contains tail calls.
|
|
|
|
<p>
|
|
Most TSCCs contain only one procedure,
|
|
which means that we can implement TRO for them
|
|
using only the techniques above,
|
|
without using any of the techniques below.
|
|
The techniques below are needed only for TSCCs that contain
|
|
two or more procedures.
|
|
|
|
<p>
|
|
Note that two (or more) mutually recursive procedures
|
|
can end up in <em>different</em> TSCCs
|
|
even if there is a tail call between them,
|
|
if the tail calls go only one way,
|
|
e.g. if procedure p calls procedure q using tail calls,
|
|
but q calls p using only ordinary nontail calls.
|
|
The LLDS backend can optimize the tail calls to q in p,
|
|
but the MLDS backend cannot do so,
|
|
because it cannot generate nonlocal gotos.
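<p>
The definition of a TSCC can be made concrete with a small sketch.
The code below is a hypothetical illustration, not the algorithm used by the Mercury compiler:
it computes reachability over tail call edges with a simple transitive closure
and gives two procedures the same id exactly when each can reach the other via tail calls.
The names assign_tscc_ids and MAX_PROCS are invented for this example,
and n is assumed to be at most MAX_PROCS.

```c
#include <stdbool.h>

#define MAX_PROCS 8

/*
 * Hypothetical helper: tail_call[i][j] is true if procedure i contains
 * a tail call to procedure j. On return, tscc_id[i] == tscc_id[j] if and
 * only if i and j can reach each other using tail calls alone.
 */
void assign_tscc_ids(int n, bool tail_call[MAX_PROCS][MAX_PROCS],
    int tscc_id[MAX_PROCS])
{
    bool reach[MAX_PROCS][MAX_PROCS];
    int i, j, k;
    int next_id = 0;

    /* Every procedure reaches itself, and whatever it tail calls. */
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            reach[i][j] = (i == j) || tail_call[i][j];
        }
    }

    /* Transitive closure (Warshall's algorithm). */
    for (k = 0; k < n; k++) {
        for (i = 0; i < n; i++) {
            for (j = 0; j < n; j++) {
                if (reach[i][k] && reach[k][j]) {
                    reach[i][j] = true;
                }
            }
        }
    }

    /* Group procedures that are mutually reachable. */
    for (i = 0; i < n; i++) {
        tscc_id[i] = -1;
    }
    for (i = 0; i < n; i++) {
        if (tscc_id[i] != -1) {
            continue;
        }
        tscc_id[i] = next_id;
        for (j = i + 1; j < n; j++) {
            if (reach[i][j] && reach[j][i]) {
                tscc_id[j] = next_id;
            }
        }
        next_id++;
    }
}
```

<p>
If p tail calls q but q calls p only through non-tail calls,
the matrix has a single edge from p to q, and the two procedures get different ids.
Adding a tail call edge from q back to p puts them in the same TSCC.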
|
|
|
|
<p>
|
|
To implement mutual tail recursion between the procedures of a nontrivial TSCC,
|
|
we need to generalize
|
|
<ul>
|
|
<li>
|
|
the mechanism for passing input arguments at tail calls,
|
|
<li>
|
|
the mechanism for returning output arguments on nonrecursive paths, and
|
|
<li>
|
|
the mechanism for transferring control.
|
|
</ul>
|
|
|
|
<h3>Transfers of control</h3>
|
|
|
|
<p>
|
|
The easiest to generalize is the last one: the transfer of control.
|
|
To see how it is done,
|
|
consider a small TSCC containing two procedures, tscc_a and tscc_b.
|
|
Since we need to translate tail calls into <em>local</em> transfers of control,
|
|
we translate each TSCC together,
|
|
either using labels and gotos like this:
|
|
|
|
<p>
|
|
<pre>
|
|
ret_type_a tscc_a(args_a)
{
    goto top_of_proc_1;
top_of_proc_1:
    {
        // body of procedure tscc_a
        // in which tail calls transfer control using "goto top_of_proc_N"
        goto tscc_end;
    }
top_of_proc_2:
    {
        // body of procedure tscc_b
        // in which tail calls transfer control using "goto top_of_proc_N"
        goto tscc_end;
    }
tscc_end:
    return ...
}

ret_type_b tscc_b(args_b)
{
    goto top_of_proc_2;
top_of_proc_1:
    {
        // body of procedure tscc_a
        // in which tail calls transfer control using "goto top_of_proc_N"
        goto tscc_end;
    }
top_of_proc_2:
    {
        // body of procedure tscc_b
        // in which tail calls transfer control using "goto top_of_proc_N"
        goto tscc_end;
    }
tscc_end:
    return ...
}
|
|
</pre>
|
|
|
|
<p>
|
|
or using while loops and switches like this:
|
|
|
|
<p>
|
|
<pre>
|
|
ret_type_a tscc_a(args_a)
{
    int tscc_selector = 1;
    while (MR_TRUE)
    {
        switch (tscc_selector)
        {
            case 1:
                {
                    // body of procedure tscc_a
                    // in which tail calls transfer control using
                    // "tscc_selector = N; continue"
                }
                break;
            case 2:
                {
                    // body of procedure tscc_b
                    // in which tail calls transfer control using
                    // "tscc_selector = N; continue"
                }
                break;
        }
        // reached only when a wrapped body finishes without a tail call
        break;
    }

    return ...
}

ret_type_b tscc_b(args_b)
{
    int tscc_selector = 2;
    while (MR_TRUE)
    {
        switch (tscc_selector)
        {
            case 1:
                {
                    // body of procedure tscc_a
                    // in which tail calls transfer control using
                    // "tscc_selector = N; continue"
                }
                break;
            case 2:
                {
                    // body of procedure tscc_b
                    // in which tail calls transfer control using
                    // "tscc_selector = N; continue"
                }
                break;
        }
        // reached only when a wrapped body finishes without a tail call
        break;
    }

    return ...
}
|
|
</pre>
|
|
|
|
In both cases,
|
|
each procedure in the TSCC has its own number in the TSCC
|
|
(in this case, tscc_a is procedure 1 in the TSCC
|
|
and tscc_b is procedure 2 in the TSCC).
|
|
We call this number the procedure's in-TSCC id number.
|
|
|
|
<p>
|
|
We translate each procedure in the TSCC into MLDS code just once,
|
|
yielding the code represented by "body of procedure ..." above.
|
|
We call these <em>inner</em> or <em>wrapped</em> procedures.
|
|
If the TSCC contains N procedures,
|
|
then each C function we generate will contain N wrapped procedures.
|
|
We call the entirety of each C function
|
|
an <em>outer</em> or <em>container</em> procedure,
|
|
since each contains two or more wrapped procedures.
|
|
We must generate a container procedure
|
|
for every member of the TSCC
|
|
that may be called by a non-tail call from anywhere:
|
|
from other modules,
|
|
from other (higher) SCCs in the current module,
|
|
from procedures in the current SCC that are not in the TSCC,
|
|
and via non-tail calls from any procedure in the TSCC itself.
|
|
This means that
|
|
the code of every procedure in a TSCC that contains N procedures
|
|
will be present up to N times in the executable.
|
|
Since mutually-tail-recursive procedures are relatively rare,
|
|
and most TSCCs contain only two or three procedures,
|
|
this increase in the total code memory requirement
|
|
is usually a more than acceptable price to pay
|
|
for the ability to handle arbitrarily deep recursion in constant stack space.
|
|
(In fact, the increased memory requirement is probably not as important
|
|
as the reduction of the effectiveness of the instruction cache:
|
|
the cache misses that bring in the code of a wrapped procedure from main memory
|
|
have to be incurred for <em>each</em> one of its executed copies.)
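<p>
To make the container structure concrete,
here is a hand-written sketch of the loop-and-switch scheme
for a hypothetical two-procedure TSCC, is_even and is_odd.
It is an illustration only, not compiler output.
As a simplification, both wrapped procedures share the single variable n
for their one input argument, and the result is held in one shared variable.

```c
#include <stdbool.h>

/* Container for is_even: execution starts in wrapped procedure 1. */
bool is_even(unsigned n)
{
    int tscc_selector = 1;
    bool result = false;

    while (true) {
        switch (tscc_selector) {
        case 1:     /* wrapped body of is_even */
            if (n == 0) {
                result = true;
            } else {
                n = n - 1;
                tscc_selector = 2;  /* tail call to is_odd */
                continue;
            }
            break;
        case 2:     /* wrapped body of is_odd */
            if (n == 0) {
                result = false;
            } else {
                n = n - 1;
                tscc_selector = 1;  /* tail call to is_even */
                continue;
            }
            break;
        }
        break;  /* a wrapped body finished without making a tail call */
    }
    return result;
}

/* Container for is_odd: the same wrapped bodies, entered at procedure 2. */
bool is_odd(unsigned n)
{
    int tscc_selector = 2;
    bool result = false;

    while (true) {
        switch (tscc_selector) {
        case 1:     /* wrapped body of is_even */
            if (n == 0) {
                result = true;
            } else {
                n = n - 1;
                tscc_selector = 2;
                continue;
            }
            break;
        case 2:     /* wrapped body of is_odd */
            if (n == 0) {
                result = false;
            } else {
                n = n - 1;
                tscc_selector = 1;
                continue;
            }
            break;
        }
        break;
    }
    return result;
}
```

<p>
The two containers differ only in the initial value of tscc_selector,
which is the code duplication discussed above:
each wrapped body appears once per container.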
|
|
|
|
<h3>Parameter passing</h3>
|
|
|
|
<p>
|
|
Parameter passing between the procedures of a TSCC at tail calls
|
|
is not as simple as parameter passing at self-tail-recursive calls,
|
|
because (except in the case of self-tail-calls)
|
|
the actual parameters in the caller and the formal parameters of the callee
|
|
will come from two different procedures, and thus from two different varsets.
|
|
Since every procedure's varset contains variables
|
|
whose numbers are allocated consecutively from zero,
|
|
the sets of variable numbers in two different procedures
|
|
will of course greatly overlap,
|
|
and it is possible for a variable with a given number
|
|
to have the same name in both varsets as well.
|
|
We don't want any such accidental name collisions
|
|
to result in the generated code using the same C variable
|
|
to represent <em>both</em> of the colliding variables,
|
|
since that would be semantically wrong.
|
|
(For starters, the two variables could even have different types,
|
|
but the sharing of their storage would be a bug
|
|
even if they had the same type.)
|
|
We therefore need a mechanism to avoid this problem.
|
|
|
|
<p>
|
|
One possible solution would be to rename (or renumber) apart
|
|
either the varsets of the procedures in the TSCC before code generation,
|
|
or their sets of MLDS variables either during or after code generation.
|
|
Both are problematic.
|
|
HLDS procedures contain <em>lots</em> of fields that contain variables,
|
|
so the code for renaming or renumbering variables in all of them
|
|
would be big (which would pose a program maintenance burden)
|
|
and would turn over a lot of memory (a performance problem).
|
|
And some compiler-generated MLDS variables
have fixed names and no changeable number.
|
|
|
|
<p>
|
|
Our chosen solution sidesteps such problems altogether
|
|
by inventing a new set of compiler-generated MLDS variables
|
|
specifically for parameter passing in TSCCs.
|
|
|
|
<ul>
|
|
<li>
|
|
For every input argument of a nondummy type
|
|
for every procedure in the TSCC,
|
|
we create the MLDS variable tscc_proc_N_input_M_VarName,
|
|
where N is the procedure's in-TSCC id number,
|
|
M is the input argument's position
in the procedure's list of input arguments
|
|
(i.e. this is the procedure's Mth input argument),
|
|
and VarName is the name of the argument.
|
|
The VarName part is not needed for correctness;
|
|
it is there only to make the generated MLDS easier to read.
|
|
<li>
|
|
For every output argument
|
|
in the vector of output arguments of the procedures of the TSCC
|
|
(which must be the same in all procedures of the TSCC),
|
|
we create the MLDS variable tscc_output_M_VarName.
|
|
M is the output argument's position in this vector,
|
|
while VarName is the output argument's name
|
|
in <em>one</em> of the procedures of the TSCC.
|
|
The fact that this name need not match the name of the output argument
|
|
in the other procedures does not matter, because again
|
|
the name is there only to make the generated MLDS easier to read.
|
|
<li>
|
|
All the procedures in a TSCC must have the same code model,
|
|
which may be either model_det or model_semi.
|
|
In the latter case, we also create the MLDS variable tscc_output_succeeded.
|
|
Model semi procedures return the value of the succeeded MLDS variable
|
|
as if it were a sort-of output argument;
|
|
the tscc_output_succeeded variable
|
|
has the same relationship to the succeeded variable
|
|
as the tscc_output_M_VarName variables have
|
|
to the MLDS output variables they correspond to.
|
|
</ul>
|
|
|
|
<p>
|
|
In every procedure,
|
|
every argument that participates in parameter passing
|
|
(i.e. every argument that is not of a dummy type and whose mode is not unused)
|
|
has either a corresponding tscc_proc_N_input_M_VarName variable
|
|
(if it is an input argument)
|
|
or a corresponding tscc_output_M_VarName variable
|
|
(if it is an output argument).
|
|
In each such pair of corresponding variables,
|
|
we call the MLDS variable representing the argument
|
|
the procedure's own variable,
|
|
and we call the other the tscc variable.
|
|
|
|
<p>
|
|
Suppose both tscc_a and tscc_b are det functions
|
|
whose argument vectors are tscc_a(AIn1, AIn2) = AOut1
|
|
and tscc_b(BIn1) = BOut1 respectively,
|
|
and the name of the MLDS type of each variable
|
|
is the name of the variable with a "Type" added to it.
|
|
However, since AOutType1 must be the same as BOutType1,
|
|
we will replace both with just "OutType1".
|
|
Then the parameter passing code we generate will look like this:
|
|
|
|
<p>
|
|
<pre>
|
|
OutType1
tscc_a(
    AInType1 tscc_proc_1_input_1_AIn1,
    AInType2 tscc_proc_1_input_2_AIn2)
{
    BInType1 tscc_proc_2_input_1_BIn1;
    OutType1 tscc_output_1_AOut1;

    goto top_of_proc_1;
top_of_proc_1:
    {
        AInType1 AIn1 = tscc_proc_1_input_1_AIn1;
        AInType2 AIn2 = tscc_proc_1_input_2_AIn2;
        OutType1 AOut1;

        // body of procedure tscc_a in which
        // tail calls to tscc_a look like this:
        //      tscc_proc_1_input_1_AIn1 = input arg 1 of tail call;
        //      tscc_proc_1_input_2_AIn2 = input arg 2 of tail call;
        //      goto top_of_proc_1;
        // tail calls to tscc_b look like this:
        //      tscc_proc_2_input_1_BIn1 = input arg 1 of tail call;
        //      goto top_of_proc_2;
        // and base cases assign to AOut1 as usual

        tscc_output_1_AOut1 = AOut1;
        goto tscc_end;
    }
top_of_proc_2:
    {
        BInType1 BIn1 = tscc_proc_2_input_1_BIn1;
        OutType1 BOut1;

        // body of procedure tscc_b in which
        // tail calls to both tscc_a and tscc_b look like they do above
        // and base cases assign to BOut1 as usual

        tscc_output_1_AOut1 = BOut1;
        goto tscc_end;
    }
tscc_end:
    return tscc_output_1_AOut1;
}

OutType1
tscc_b(
    BInType1 tscc_proc_2_input_1_BIn1)
{
    AInType1 tscc_proc_1_input_1_AIn1;
    AInType2 tscc_proc_1_input_2_AIn2;
    OutType1 tscc_output_1_AOut1;

    goto top_of_proc_2;
top_of_proc_1:
    {
        AInType1 AIn1 = tscc_proc_1_input_1_AIn1;
        AInType2 AIn2 = tscc_proc_1_input_2_AIn2;
        OutType1 AOut1;

        // body of procedure tscc_a in which
        // tail calls to tscc_a look like this:
        //      tscc_proc_1_input_1_AIn1 = input arg 1 of tail call;
        //      tscc_proc_1_input_2_AIn2 = input arg 2 of tail call;
        //      goto top_of_proc_1;
        // tail calls to tscc_b look like this:
        //      tscc_proc_2_input_1_BIn1 = input arg 1 of tail call;
        //      goto top_of_proc_2;
        // and base cases assign to AOut1 as usual

        tscc_output_1_AOut1 = AOut1;
        goto tscc_end;
    }
top_of_proc_2:
    {
        BInType1 BIn1 = tscc_proc_2_input_1_BIn1;
        OutType1 BOut1;

        // body of procedure tscc_b in which
        // tail calls to both tscc_a and tscc_b look like they do above
        // and base cases assign to BOut1 as usual

        tscc_output_1_AOut1 = BOut1;
        goto tscc_end;
    }
tscc_end:
    return tscc_output_1_AOut1;
}
|
|
</pre>
|
|
|
|
<p>
|
|
The general principles of our parameter passing scheme are as follows.
|
|
|
|
<ul>
|
|
<li>
|
|
The MLDS variables that
|
|
we would generate for a procedure in the absence of TRO,
|
|
which we call the procedure's <em>own</em> variables,
|
|
are visible <em>only</em> in the scope
|
|
containing the wrapped body of that procedure.
|
|
(These are the scopes after the top_of_proc_N labels above.)
|
|
Since these scopes never overlap,
|
|
there is never any place in the generated MLDS code
|
|
where the own variables of more than one TSCC procedure are visible at once.
|
|
<li>
|
|
Each scope containing a wrapped procedure occurs
|
|
either after the label corresponding to the procedure
|
|
(if we are using labels and gotos)
|
|
or in the switch case corresponding to the procedure
|
|
(if we are using while loops and switches),
|
|
and consists of
|
|
<ul>
|
|
<li>
|
|
the declarations of the procedure's own MLDS variables
|
|
for both the input and output arguments, followed by
|
|
<li>
|
|
assignments that set the value of each own MLDS input argument
|
|
from the value of the corresponding tscc_proc_N_input_M_VarName variable
|
|
(the above code examples show these
|
|
merged with the definitions of the variables being set), followed by
|
|
<li>
|
|
the wrapped body of the procedure, followed by
|
|
<li>
|
|
assignments that copy the value of each own MLDS output argument
|
|
to the corresponding tscc_output_M_VarName variable, followed by
|
|
<li>
|
|
a jump to the end of the function, via either goto or a break.
|
|
</ul>
|
|
Note that in this case,
|
|
the tail calls do not need to use any next_value_of_ variables.
|
|
The actual parameters can only be the procedure's own variables,
|
|
and the formal parameters being assigned to
|
|
are all tscc_proc_N_input_M_VarName variables,
|
|
so there cannot be any overlap between them.
|
|
<li>
|
|
For procedures whose outputs are all returned by value
|
|
(which includes both tscc_a and tscc_b above),
|
|
their container function consists of the following.
|
|
<ul>
|
|
<li>
The declaration of the function signature,
|
|
giving the type (or in general, types)
|
|
of the output argument(s) returned by value,
|
|
and the types and variable names of the input arguments.
|
|
The variables in the signature
|
|
are the tscc_proc_N_input_M_VarName variables
|
|
of the procedure that the container is for,
|
|
which this signature declares.
|
|
<li>
|
|
The body of the container function starts by declaring
|
|
all the tscc_proc_N_input_M_VarName variables
|
|
of all the <em>other</em> procedures in the TSCC,
|
|
and all the tscc_output_M_VarName variables
|
|
(each of which is shared by all the procedures in the TSCC).
|
|
<p>
|
|
The tscc_proc_N_input_M_VarName variables of the procedure
|
|
that the container is for start out initialized by the caller.
|
|
The tscc_proc_N_input_M_VarName variables
|
|
of the other procedures in the TSCC start out uninitialized,
|
|
but their contents will be read
|
|
only by code in the corresponding wrapped procedure body,
|
|
which is reachable only by tail call,
|
|
and every such tail call will first set those variables
|
|
to the appropriate values for the tail call.
|
|
<li>
|
|
Either a jump to the label at the start of the wrapped version
|
|
of the procedure that the container function is for,
|
|
or a setting of the tscc_selector variable that achieves
|
|
the same effect.
|
|
The target of the jump is the wrapped procedure
|
|
whose tscc_proc_N_input_M_VarName variables
|
|
are listed in the container function's signature,
|
|
and whose values are therefore defined for us by the caller.
|
|
<li>
|
|
The bodies of the procedures of the TSCC,
|
|
either with each being preceded by its own label,
|
|
or wrapped up in a loop and a switch,
|
|
as shown above.
|
|
<li>
|
|
A return statement for all the tscc_output_M_VarName variables
|
|
corresponding to output arguments returned by value.
|
|
(In the above example, all output arguments are returned by value.)
|
|
Execution reaches here only after a branch here
|
|
from the end of one of the wrapped procedures,
|
|
each of which, immediately before the branch,
|
|
assigns to all the tscc_output_M_VarName variables.
|
|
</ul>
|
|
</ul>
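<p>
The scheme described by these principles can be tried out on a small, runnable example.
The sketch below is written by hand in the label-and-goto style and is not compiler output.
It assumes a hypothetical TSCC of two functions that compute a digital root:
reduce(N, Sum), which is procedure 1 and adds the decimal digits of N to Sum,
and root(S), which is procedure 2.
The function names and the use of unsigned long for every type are choices made for this example.

```c
/* Hypothetical TSCC: reduce(N, Sum) = Root is procedure 1,
 * root(S) = Root is procedure 2. Hand-written, not generated. */

/* Container for procedure 1. */
unsigned long reduce(unsigned long tscc_proc_1_input_1_N,
    unsigned long tscc_proc_1_input_2_Sum)
{
    unsigned long tscc_proc_2_input_1_S;    /* input of the other procedure */
    unsigned long tscc_output_1_Root;       /* output shared by the TSCC */

    goto top_of_proc_1;
top_of_proc_1:
    {
        unsigned long N = tscc_proc_1_input_1_N;
        unsigned long Sum = tscc_proc_1_input_2_Sum;
        unsigned long Root;

        if (N != 0) {       /* self tail call: reduce(N / 10, Sum + N % 10) */
            tscc_proc_1_input_1_N = N / 10;
            tscc_proc_1_input_2_Sum = Sum + N % 10;
            goto top_of_proc_1;
        }
        if (Sum >= 10) {    /* tail call: root(Sum) */
            tscc_proc_2_input_1_S = Sum;
            goto top_of_proc_2;
        }
        Root = Sum;         /* base case assigns the own output variable */
        tscc_output_1_Root = Root;
        goto tscc_end;
    }
top_of_proc_2:
    {
        unsigned long S = tscc_proc_2_input_1_S;
        unsigned long Root;

        if (S >= 10) {      /* tail call: reduce(S, 0) */
            tscc_proc_1_input_1_N = S;
            tscc_proc_1_input_2_Sum = 0;
            goto top_of_proc_1;
        }
        Root = S;
        tscc_output_1_Root = Root;
        goto tscc_end;
    }
tscc_end:
    return tscc_output_1_Root;
}

/* Container for procedure 2: same wrapped bodies, different entry point. */
unsigned long root(unsigned long tscc_proc_2_input_1_S)
{
    unsigned long tscc_proc_1_input_1_N;    /* inputs of the other procedure */
    unsigned long tscc_proc_1_input_2_Sum;
    unsigned long tscc_output_1_Root;

    goto top_of_proc_2;
top_of_proc_1:
    {
        unsigned long N = tscc_proc_1_input_1_N;
        unsigned long Sum = tscc_proc_1_input_2_Sum;
        unsigned long Root;

        if (N != 0) {
            tscc_proc_1_input_1_N = N / 10;
            tscc_proc_1_input_2_Sum = Sum + N % 10;
            goto top_of_proc_1;
        }
        if (Sum >= 10) {
            tscc_proc_2_input_1_S = Sum;
            goto top_of_proc_2;
        }
        Root = Sum;
        tscc_output_1_Root = Root;
        goto tscc_end;
    }
top_of_proc_2:
    {
        unsigned long S = tscc_proc_2_input_1_S;
        unsigned long Root;

        if (S >= 10) {
            tscc_proc_1_input_1_N = S;
            tscc_proc_1_input_2_Sum = 0;
            goto top_of_proc_1;
        }
        Root = S;
        tscc_output_1_Root = Root;
        goto tscc_end;
    }
tscc_end:
    return tscc_output_1_Root;
}
```

<p>
Each tail call reads only own variables (N, Sum, S)
and writes only tscc_proc_N_input_M variables,
so no next_value_of_ temporaries are needed, even for the self tail call in reduce.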
|
|
|
|
<p>
|
|
For model semi predicates,
|
|
the succeeded (own) variable
|
|
and the tscc_output_succeeded variable corresponding to it
|
|
effectively function as an output argument returned by value.
|
|
By its position in the vector of output arguments,
|
|
it is effectively output argument 0.
|
|
|
|
<p>
|
|
The final detail is the treatment of output arguments
|
|
that are passed by reference.
|
|
Our chosen approach is designed to work even in cases where
|
|
the procedures of the TSCC,
|
|
although they return the same vector of outputs,
|
|
return different subsets of them by reference.
|
|
|
|
<p>
|
|
The basic idea is to generate the wrapped procedures
|
|
as if all the output arguments were returned by value,
|
|
exactly as shown in the example above,
|
|
and to handle the difference at the container function level.
|
|
|
|
<p>
|
|
The parts of a container function
|
|
corresponding to an output argument returned by value are the following.
|
|
|
|
<ul>
|
|
<li>
|
|
The type of the output argument is part of the vector of return types
|
|
in the function signature.
|
|
<li>
|
|
The tscc_output_M_VarName variable of the output is declared
|
|
at the start of the container function.
|
|
<li>
|
|
The epilogue of each wrapped procedure
|
|
assigns to the tscc_output_M_VarName variable of the output.
|
|
<li>
|
|
The epilogue of the container function
|
|
returns the value of this tscc_output_M_VarName variable
|
|
as part of the vector of return values.
|
|
</ul>
|
|
|
|
<p>
|
|
When an output argument is passed by reference,
|
|
we create a tscc_output_ptr_M_VarName variable for it
|
|
as well as a tscc_output_M_VarName variable.
|
|
The parts of a container function
|
|
corresponding to such an output argument are the following.
|
|
|
|
<ul>
|
|
<li>
|
|
The type of the output argument
|
|
is <em>not</em> part of the vector of return types in the function signature.
|
|
Instead, the argument vector contains tscc_output_ptr_M_VarName,
|
|
and its type is a pointer to the type of the output argument.
|
|
The position of the tscc_output_ptr_M_VarName is derived from
|
|
the position of the argument in the HLDS.
|
|
<li>
|
|
The tscc_output_M_VarName variable of the output is declared
|
|
at the start of the container function, as before.
|
|
<li>
|
|
The epilogue of each wrapped procedure
|
|
assigns to the tscc_output_M_VarName variable of the output,
|
|
as before.
|
|
<li>
|
|
The epilogue of the container function
|
|
returns the value of this tscc_output_M_VarName variable,
|
|
but <em>not</em> as part of the vector of return values.
|
|
Instead, just before the return statement,
|
|
we include the assignment *tscc_output_ptr_M_VarName = tscc_output_M_VarName.
|
|
</ul>
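<p>
The handling of an output argument passed by reference can be sketched as follows.
This is a hand-written, hypothetical example rather than compiler output:
a TSCC of two word-counting procedures, in_space (procedure 1) and in_word (procedure 2),
each taking a string position and a running count and producing the final count.
Only the container for in_space is shown, and it is assumed to pass its output by reference;
the procedure names and C types are choices made for this example.

```c
/* Container for in_space (procedure 1), whose output Words is passed
 * by reference. Hypothetical, hand-written example. */
void in_space(const char *tscc_proc_1_input_1_Str,
    int tscc_proc_1_input_2_Count,
    int *tscc_output_ptr_1_Words)
{
    const char *tscc_proc_2_input_1_Str;    /* inputs of in_word */
    int tscc_proc_2_input_2_Count;
    int tscc_output_1_Words;                /* declared as in the by-value case */

    goto top_of_proc_1;
top_of_proc_1:      /* wrapped body of in_space */
    {
        const char *Str = tscc_proc_1_input_1_Str;
        int Count = tscc_proc_1_input_2_Count;
        int Words;

        if (*Str == '\0') {
            Words = Count;
        } else if (*Str == ' ') {   /* tail call: in_space(Str + 1, Count) */
            tscc_proc_1_input_1_Str = Str + 1;
            tscc_proc_1_input_2_Count = Count;
            goto top_of_proc_1;
        } else {                    /* tail call: in_word(Str + 1, Count + 1) */
            tscc_proc_2_input_1_Str = Str + 1;
            tscc_proc_2_input_2_Count = Count + 1;
            goto top_of_proc_2;
        }
        tscc_output_1_Words = Words;
        goto tscc_end;
    }
top_of_proc_2:      /* wrapped body of in_word */
    {
        const char *Str = tscc_proc_2_input_1_Str;
        int Count = tscc_proc_2_input_2_Count;
        int Words;

        if (*Str == '\0') {
            Words = Count;
        } else if (*Str == ' ') {   /* tail call: in_space(Str + 1, Count) */
            tscc_proc_1_input_1_Str = Str + 1;
            tscc_proc_1_input_2_Count = Count;
            goto top_of_proc_1;
        } else {                    /* tail call: in_word(Str + 1, Count) */
            tscc_proc_2_input_1_Str = Str + 1;
            tscc_proc_2_input_2_Count = Count;
            goto top_of_proc_2;
        }
        tscc_output_1_Words = Words;
        goto tscc_end;
    }
tscc_end:
    /* By-reference epilogue: store through the pointer, return nothing. */
    *tscc_output_ptr_1_Words = tscc_output_1_Words;
    return;
}
```

<p>
The wrapped bodies are written exactly as they would be if Words were returned by value;
only the signature and the last two statements of the container differ.
A container for in_word that returned Words by value could reuse the same two bodies unchanged.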
|
|
|
|
</body>
|
|
</html>
|