mirror of
https://github.com/Mercury-Language/mercury.git
synced 2025-12-16 06:14:59 +00:00
Generalize the mechanism we use to implement mutual tail recursion optimization
in the MLDS backend to handle TSCCs that contain both predicates and functions.
This generalization also simplifies the split of responsibilities between
the MLDS functions that implement each TSCC procedure for external callers
(which we now call the container function) on the one hand, and their main
components, the bodies of the procedures themselves (which we now call the
wrapped procedures, since each container function wraps up the bodies
of *all* the procedures in the TSCC).
In the new scheme, wrapped functions always give output arguments
to container functions by value. It is the job of the container functions
to return these output arguments to the caller according to the requirements
imposed by the container function's calling convention. This allows
different container functions to return output arguments differently
(some may return an output by value, while some may do so by reference)
while still allowing the wrapped procedure bodies to be generated just once
and then duplicated for each container function.
compiler/notes/mlds_tail_recursion.html:
A new file explaining both the scheme we use to generate code for
TSCCs, and the reasons why we use that scheme.
compiler/notes/Mmakefile:
Include the new file in the list of compiler notes files.
compiler/ml_args_util.m:
Update the code that generated code fragments handling arguments
for TSCCs to follow the updated scheme. Use the terminology in the
new notes file to clarify variable names where relevant. Group
related arguments together.
compiler/ml_proc_gen.m:
Update the code that created wrapped procedures and container functions
to follow the updated scheme. Use the terminology in the new notes file
to clarify both function and variable names where relevant. Delete the
documentation which is now in notes/mlds_tail_recursion.html (in greatly
enhanced form).
Split the predicate for adding local variable definitions to MLDS
functions, since when generating code for TSCCs using the new scheme,
we only need one of its two halves.
compiler/mlds.m:
Add the new forms of compiler generated variables needed by the new
translation scheme.
compiler/ml_gen_info.m:
Change the type of the field containing the byref output vars
from a list to a set. All its users want to treat it as a set,
so it is simpler and faster to convert it just once, when it is set,
instead of on every use.
compiler/ml_code_util.m:
compiler/ml_commit_gen.m:
Conform to the change in ml_gen_info.m.
810 lines
25 KiB
HTML
<html>
<head>
<title>
Implementing tail recursion in the MLDS code generator
</title>
</head>

<body bgcolor="#ffffff" text="#000000">

<h1>Implementing tail recursion in the MLDS code generator</h1>
<h2>Tail recursion optimization versus last call optimization</h2>

<p>
Most implementations of declarative languages implement
<em>last call optimization</em> (LCO)
to allow recursive algorithms to handle arbitrary amounts of data
using constant stack space.
With LCO, when the last thing that a procedure does is
call a callee whose vector of return values is the same as
the vector of return values of the caller,
the implementation deallocates the stack frame of the caller before the call,
allowing its space to be used to store the stack frame of the callee.

<p>
In its general form, LCO does not require any knowledge of the callee;
it works even when the identity of the callee is unknown at compile time,
as when the last call is a higher order call or a method call,
or its code is unavailable,
as when it is defined in a different compilation unit.
However, implementing this general form of LCO
requires the implementation to have
direct control over the use of the stack,
and the ability to generate jumps (not calls) to arbitrary locations.
In the Mercury compiler, the LLDS code generator can do both of these things,
but the MLDS code generator can do neither.
This is why it can implement only
<em>tail recursion optimization</em> (TRO).
This differs from LCO in two ways.
<ul>
<li>
We don't deallocate the stack frame of the caller before the tail call;
instead, we reuse that stack frame to become the stack frame of the callee.
<li>
We don't use a global (between functions) jump
to the start of the code of the callee wherever it happens to be in memory;
instead, we include the code of the callee next to the code of the caller,
and use local (within a function) branches to transfer control.
</ul>
This means that TRO is a less general form of last call optimization,
because it is applicable only
when the call is first order, so the callee is statically known,
and the caller and callee are in the same compilation unit,
so they can be compiled together into a single target language function.

<p>
Tail recursion optimization is applicable
to both self-recursion and mutual recursion.
TRO for self-recursion is significantly simpler,
so we describe that first.
This is a general principle we use everywhere below:
we introduce the simplest case first,
and add the complications (and their solutions) later.

<h2>Self tail recursion</h2>

<p>
To explain how the Mercury compiler applies TRO to self-recursive calls,
we will use this example predicate:

<p>
<pre>
:- pred len(list(int)::in, int::in, int::out) is det.

len(L, Len0, Len) :-
    (
        L = [],
        Len = Len0
    ;
        L = [_ | T],
        Len1 = Len0 + 1,
        len(T, Len1, Len)
    ).
</pre>

<p>
Here is the C code generated by the Mercury compiler for this predicate
without TRO:

<p>
<pre>
void MR_CALL
x__len_3_p_0(
    MR_Word L_4,
    MR_Integer Len0_5,
    MR_Integer * Len_6)
{
    if ((L_4 == ((MR_Word) MR_mkword(MR_mktag(0), MR_mkbody((MR_Integer) 0)))))
        *Len_6 = Len0_5;
    else
    {
        MR_Word T_8;
        MR_Integer Len1_9;
        MR_Integer Var_10;
        MR_Integer Var_7;

        Var_7 = ((MR_Integer) (MR_hl_field(MR_mktag(1), L_4, (MR_Integer) 0)));
        T_8 = ((MR_Word) (MR_hl_field(MR_mktag(1), L_4, (MR_Integer) 1)));
        Var_10 = (MR_Integer) 1;
        Len1_9 = (Len0_5 + Var_10);
        x__len_3_p_0(T_8, Len1_9, Len_6);
    }
}
</pre>

<p>
The last call is a tail recursive call.
When this predicate is compiled with TRO,
we get this C code:

<p>
<pre>
void MR_CALL
x__len_3_p_0(
    MR_Word L_4,
    MR_Integer Len0_5,
    MR_Integer * Len_6)
{
    while (MR_TRUE)
    {
        if ((L_4 == ((MR_Word) MR_mkword(MR_mktag(0), MR_mkbody((MR_Integer) 0)))))
            *Len_6 = Len0_5;
        else
        {
            MR_Word T_8 = ((MR_Word) (MR_hl_field(MR_mktag(1), L_4, (MR_Integer) 1)));
            MR_Integer Len1_9;
            MR_Integer Var_10 = (MR_Integer) 1;
            MR_Integer Var_7 = ((MR_Integer) (MR_hl_field(MR_mktag(1), L_4, (MR_Integer) 0)));
            MR_Word next_value_of_L_4;
            MR_Integer next_value_of_Len0_5;

            Len1_9 = (Len0_5 + Var_10);
            // direct tailcall eliminated
            next_value_of_L_4 = T_8;
            next_value_of_Len0_5 = Len1_9;
            L_4 = next_value_of_L_4;
            Len0_5 = next_value_of_Len0_5;
            continue;
        }
        break;
    }
}
</pre>

<p>
This differs from the unoptimized code in two major aspects.

<p>
The first aspect affected by TRO
is the translation of the self tail call
(or self tail calls, plural, in the general case).
TRO replaces the call with

<ul>
<li>
code that assigns the input arguments of the self-recursive call
(in this case, T and Len1),
to the corresponding input arguments in the head (L and Len0), and
<li>
code that transfers control back to the start of the procedure,
i.e. the entry point of the callee.
</ul>

<p>
There is no code for handling the output arguments,
since (by the definition of tail calls)
these must be the same in the caller and the callee.
On every non-recursive path,
we return the values of the output arguments
using the exact same code as we would use without TRO.

<p>
Note that the code that passes the input arguments does so in two stages:
assignments of the actual parameter values
to the next_value_of_ forms of the input arguments,
followed by assignments of these next_value_of_ forms
to the input arguments themselves.
This is to handle the case where some variable is both
an input argument and an actual parameter of the call.
If we just assigned each actual parameter to the corresponding input directly
in (say) ascending order of argument number,
then the translation of a call such as foo(In2, In1, <i>outputs</i>)
in a predicate whose head looks like foo(In1, In2, <i>outputs</i>)
would consist of the assignments

<p>
<pre>
In1 = In2;
In2 = In1;
</pre>

and the first assignment would clobber the value to be assigned by the second.
This is the standard problem of swapping two values,
and its solution requires at least one temporary variable
(if we don't want to resort to unnecessarily complicated code using xors).
Our solution works because
the next_value_of_ forms of the input arguments are never live
outside the small blocks of code resulting from a single tail recursive call
(we simply don't generate references to them in any other context),
and inside each block, each such variable
is written exactly once and read exactly once (in that order).
The fact that we use more temporaries
than may be strictly necessary does not matter,
because the final decision on
how the assigned values end up in their target locations
is not up to the Mercury compiler;
it is up to the compiler that translates the generated C, C# or Java
to machine code.
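
<p>
As a minimal sketch of the two-stage assignment,
consider Euclid's algorithm written as gcd(A, B)
with the tail call gcd(B, A mod B).
This is hand-written C, not compiler output,
and the function names gcd_tro and gcd_naive are hypothetical.
The first actual parameter of the tail call is the second formal parameter,
so the direct assignment order goes wrong in the way described above,
while the next_value_of_ form does not.

```c
/* Two-stage version, in the style described above (hand-written sketch). */
long gcd_tro(long A, long B)
{
    long Result;

    while (1) {
        if (B == 0) {
            Result = A;
        } else {
            long next_value_of_A;
            long next_value_of_B;

            /* Stage 1: evaluate every actual parameter into a temporary. */
            next_value_of_A = B;
            next_value_of_B = A % B;
            /* Stage 2: copy the temporaries into the formal parameters. */
            A = next_value_of_A;
            B = next_value_of_B;
            continue;
        }
        break;
    }
    return Result;
}

/* Single-stage version, kept only to show the clobbering problem. */
long gcd_naive(long A, long B)
{
    while (1) {
        if (B == 0) {
            return A;
        }
        A = B;
        B = A % B;  /* bug: A already holds the old B, so this is B % B */
    }
}
```

<p>
With these definitions, gcd_tro(48, 18) yields 6,
while gcd_naive(48, 18) yields 18,
because its second assignment computes 18 % 18.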

<p>
The second aspect affected by TRO is that
the entire body of the target language (in this case C) code
we generate for the procedure is wrapped up in a loop.

<p>
The usual way we wrap the procedure body is with a while loop:

<p>
<pre>
ret_type func_name(args)
{
    while (MR_TRUE)
    {
        // procedure body
        // in which tail calls transfer control using "continue"
    }
}
</pre>

<p>
However, we can also use gotos:

<p>
<pre>
ret_type func_name(args)
{
    top_of_proc:
    {
        // procedure body
        // in which tail calls transfer control using "goto top_of_proc"
    }
}
</pre>
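
<p>
To make the goto form concrete,
here is a hand-written sketch of the len example over a C linked list.
The struct cell type and the name len_goto are assumptions for illustration;
the real generated code works on MR_Word values as shown earlier.

```c
#include <stddef.h>

/* Hypothetical list cell, standing in for Mercury's list(int). */
struct cell {
    int head;
    const struct cell *tail;
};

/* len(L, Len0, Len) with the self tail call turned into a local goto. */
long len_goto(const struct cell *L, long Len0)
{
    long Len;

top_of_proc:
    {
        if (L == NULL) {
            /* Base case: the output is set exactly as it would be without TRO. */
            Len = Len0;
        } else {
            const struct cell *T = L->tail;
            long Len1 = Len0 + 1;
            const struct cell *next_value_of_L;
            long next_value_of_Len0;

            /* The tail call len(T, Len1, Len) becomes assignments plus a jump. */
            next_value_of_L = T;
            next_value_of_Len0 = Len1;
            L = next_value_of_L;
            Len0 = next_value_of_Len0;
            goto top_of_proc;
        }
    }
    return Len;
}
```

<p>
The stack depth of len_goto does not depend on the length of the list,
since every recursive step is a jump within the one function.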

<h2>Mutual tail recursion</h2>

<p>
The MLDS code generator partitions the procedures of a module
into a sequence of SCCs,
where each SCC (strongly connected component)
consists of a set of procedures
that are all reachable from each other via calls, whether tail or non-tail.
Since TRO applies only to tail calls,
it also partitions each SCC further into one or more TSCCs (tail SCCs),
which are strongly connected components of a graph
whose nodes represent procedures
and in which there are edges only for <em>tail</em> calls.
This means that by definition, every procedure in a TSCC
is reachable from every procedure in that TSCC using only tail calls.
It then implements tail recursion optimization
in each TSCC that contains tail calls.

<p>
Most TSCCs contain only one procedure,
which means that we can implement TRO for them
using only the techniques above,
without using any of the techniques below.
The techniques below are needed only for TSCCs that contain
two or more procedures.

<p>
Note that two (or more) mutually recursive procedures
can end up in <em>different</em> TSCCs
even if there is a tail call between them,
if the tail calls go only one way,
e.g. if procedure p calls procedure q using tail calls,
but q calls p using only ordinary nontail calls.
The LLDS backend can optimize the tail calls to q in p,
but the MLDS backend cannot do so,
because it cannot generate nonlocal gotos.

<p>
To implement mutual tail recursion between the procedures of a nontrivial TSCC,
we need to generalize
<ul>
<li>
the mechanism for passing input arguments at tail calls,
<li>
the mechanism for returning output arguments on nonrecursive paths, and
<li>
the mechanism for transferring control.
</ul>

<h3>Transfers of control</h3>

<p>
The easiest to generalize is the last one: the transfer of control.
To see how it is done,
consider a small TSCC containing two procedures, tscc_a and tscc_b.
Since we need to translate tail calls into <em>local</em> transfers of control,
we translate each TSCC together,
either using labels and gotos like this:

<p>
<pre>
ret_type_a tscc_a(args_a)
{
    goto top_of_proc_1;
    top_of_proc_1:
    {
        // body of procedure tscc_a
        // in which tail calls transfer control using "goto top_of_proc_N"
        goto tscc_end;
    }
    top_of_proc_2:
    {
        // body of procedure tscc_b
        // in which tail calls transfer control using "goto top_of_proc_N"
        goto tscc_end;
    }
    tscc_end:
    return ...
}

ret_type_b tscc_b(args_b)
{
    goto top_of_proc_2;
    top_of_proc_1:
    {
        // body of procedure tscc_a
        // in which tail calls transfer control using "goto top_of_proc_N"
        goto tscc_end;
    }
    top_of_proc_2:
    {
        // body of procedure tscc_b
        // in which tail calls transfer control using "goto top_of_proc_N"
        goto tscc_end;
    }
    tscc_end:
    return ...
}
</pre>

<p>
or using while loops and switches like this:

<p>
<pre>
ret_type_a tscc_a(args_a)
{
    int tscc_selector = 1;
    while (MR_TRUE)
    {
        switch (tscc_selector)
        {
            case 1:
                {
                    // body of procedure tscc_a
                    // in which tail calls transfer control using
                    // "tscc_selector = N; continue"
                }
                break;
            case 2:
                {
                    // body of procedure tscc_b
                    // in which tail calls transfer control using
                    // "tscc_selector = N; continue"
                }
                break;
        }
        // reached only when a body finishes without a tail call
        break;
    }

    return ...
}

ret_type_b tscc_b(args_b)
{
    int tscc_selector = 2;
    while (MR_TRUE)
    {
        switch (tscc_selector)
        {
            case 1:
                {
                    // body of procedure tscc_a
                    // in which tail calls transfer control using
                    // "tscc_selector = N; continue"
                }
                break;
            case 2:
                {
                    // body of procedure tscc_b
                    // in which tail calls transfer control using
                    // "tscc_selector = N; continue"
                }
                break;
        }
        // reached only when a body finishes without a tail call
        break;
    }

    return ...
}
</pre>

<p>
In both cases,
each procedure in the TSCC has its own number in the TSCC
(in this case, tscc_a is procedure 1 in the TSCC
and tscc_b is procedure 2 in the TSCC).
We call this number the procedure's in-TSCC id number.

<p>
We translate each procedure in the TSCC into MLDS code just once,
yielding the code represented by "body of procedure ..." above.
We call these <em>inner</em> or <em>wrapped</em> procedures.
If the TSCC contains N procedures,
then each C function we generate will contain N wrapped procedures.
We call the entirety of each C function
an <em>outer</em> or <em>container</em> procedure,
since each contains two or more wrapped procedures.
We must generate a container procedure
for every member of the TSCC
that may be called by a non-tail call from anywhere:
from other modules,
from other (higher) SCCs in the current module,
from procedures in the current SCC that are not in the TSCC,
and via non-tail calls from any procedure in the TSCC itself.
This means that
the code of every procedure in a TSCC that contains N procedures
will be present up to N times in the executable.
Since mutually-tail-recursive procedures are relatively rare,
and most TSCCs contain only two or three procedures,
this increase in the total code memory requirement
is usually a more than acceptable price to pay
for the ability to handle arbitrarily deep recursion in constant stack space.
(In fact, the increased memory requirement is probably not as important
as the reduction of the effectiveness of the instruction cache:
the cache misses that bring in the code of a wrapped procedure from main memory
have to be incurred for <em>each</em> one of its executed copies.)
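
<p>
The loop-and-switch form can be tried out on a small hand-written example.
The sketch below assumes a hypothetical two-procedure TSCC,
is_even(N) and is_odd(N), each of which tail calls the other with N - 1,
and shows only the container for is_even;
the container for is_odd would differ only in its signature
and in starting with tscc_selector = 2.
The variable names proc_1_N, proc_2_N and result are placeholders,
not the names the compiler uses.

```c
/* Container for is_even; wrapped procedure 1 is is_even, 2 is is_odd. */
int is_even(unsigned long proc_1_N)
{
    unsigned long proc_2_N = 0;
    int result = 0;
    int tscc_selector = 1;  /* start in the procedure this container is for */

    while (1) {
        switch (tscc_selector) {
        case 1: {
            /* wrapped body of is_even */
            unsigned long N = proc_1_N;
            if (N == 0) {
                result = 1;
            } else {
                /* tail call is_odd(N - 1) */
                proc_2_N = N - 1;
                tscc_selector = 2;
                continue;
            }
            break;
        }
        case 2: {
            /* wrapped body of is_odd */
            unsigned long N = proc_2_N;
            if (N == 0) {
                result = 0;
            } else {
                /* tail call is_even(N - 1) */
                proc_1_N = N - 1;
                tscc_selector = 1;
                continue;
            }
            break;
        }
        }
        /* A wrapped body finished without a tail call: leave the loop. */
        break;
    }
    return result;
}
```

<p>
Each "continue" restarts the while loop, which re-enters the switch
at the case chosen by tscc_selector,
so a call such as is_even(1000000) runs in constant stack space.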

<h3>Parameter passing</h3>

<p>
Parameter passing between the procedures of a TSCC at tail calls
is not as simple as parameter passing at self-tail-recursive calls,
because (except in the case of self-tail-calls)
the actual parameters in the caller and the formal parameters of the callee
will come from two different procedures, and thus from two different varsets.
Since every procedure's varset contains variables
whose numbers are allocated consecutively from zero,
the sets of variable numbers in two different procedures
will of course greatly overlap,
and it is possible for a variable with a given number
to have the same name in both varsets as well.
We don't want any such accidental name collisions
to result in the generated code using the same C variable
to represent <em>both</em> of the colliding variables,
since that would be semantically wrong.
(For starters, the two variables could even have different types,
but the sharing of their storage would be a bug
even if they had the same type.)
We therefore need a mechanism to avoid this problem.

<p>
One possible solution would be to rename (or renumber) apart
either the varsets of the procedures in the TSCC before code generation,
or their sets of MLDS variables either during or after code generation.
Both are problematic.
HLDS procedures contain <em>lots</em> of fields that contain variables,
so the code for renaming or renumbering variables in all of them
would be big (which would pose a program maintenance burden)
and would turn over a lot of memory (a performance problem).
And some compiler-generated MLDS variables
have fixed names and no changeable number.

<p>
Our chosen solution sidesteps such problems altogether
by inventing a new set of compiler-generated MLDS variables
specifically for parameter passing in TSCCs.

<ul>
<li>
For every input argument of a nondummy type
for every procedure in the TSCC,
we create the MLDS variable tscc_proc_N_input_M_VarName,
where N is the procedure's in-TSCC id number,
M is the input argument's position
in the procedure's list of input arguments
(i.e. this is the procedure's Mth input argument),
and VarName is the name of the argument.
The VarName part is not needed for correctness;
it is there only to make the generated MLDS easier to read.
<li>
For every output argument
in the vector of output arguments of the procedures of the TSCC
(which must be the same in all procedures of the TSCC),
we create the MLDS variable tscc_output_M_VarName.
M is the output argument's position in this vector,
while VarName is the output argument's name
in <em>one</em> of the procedures of the TSCC.
The fact that this name need not match the name of the output argument
in the other procedures does not matter, because again
the name is there only to make the generated MLDS easier to read.
<li>
All the procedures in a TSCC must have the same code model,
which may be either model_det or model_semi.
In the latter case, we also create the MLDS variable tscc_output_succeeded.
Model semi procedures return the value of the succeeded MLDS variable
as if it were a sort-of output argument;
the tscc_output_succeeded variable
has the same relationship to the succeeded variable
as the tscc_output_M_VarName variables have
to the MLDS output variables they correspond to.
</ul>

<p>
In every procedure,
every argument that participates in parameter passing
(i.e. every argument that is not of a dummy type and whose mode is not unused)
has either a corresponding tscc_proc_N_input_M_VarName variable
(if it is an input argument)
or a corresponding tscc_output_M_VarName variable
(if it is an output argument).
In each such pair of corresponding variables,
we call the MLDS variable representing the argument
the procedure's own variable,
and we call the other the tscc variable.

<p>
Suppose both tscc_a and tscc_b are det functions
whose argument vectors are tscc_a(AIn1, AIn2) = AOut1
and tscc_b(BIn1) = BOut1 respectively,
and the name of the MLDS type of each variable
is the name of the variable with "Type" inserted before its number
(so the type of AIn1 is AInType1).
Since AOutType1 must be the same as BOutType1,
we will write both as just "OutType1".
Then the parameter passing code we generate will look like this:

<p>
<pre>
OutType1
tscc_a(
    AInType1 tscc_proc_1_input_1_AIn1,
    AInType2 tscc_proc_1_input_2_AIn2)
{
    BInType1 tscc_proc_2_input_1_BIn1;
    OutType1 tscc_output_1_AOut1;

    goto top_of_proc_1;
    top_of_proc_1:
    {
        AInType1 AIn1 = tscc_proc_1_input_1_AIn1;
        AInType2 AIn2 = tscc_proc_1_input_2_AIn2;
        OutType1 AOut1;

        // body of procedure tscc_a in which
        //  tail calls to tscc_a look like this:
        //      tscc_proc_1_input_1_AIn1 = input arg 1 of tail call;
        //      tscc_proc_1_input_2_AIn2 = input arg 2 of tail call;
        //      goto top_of_proc_1;
        //  tail calls to tscc_b look like this:
        //      tscc_proc_2_input_1_BIn1 = input arg 1 of tail call;
        //      goto top_of_proc_2;
        //  and base cases assign to AOut1 as usual

        tscc_output_1_AOut1 = AOut1;
        goto tscc_end;
    }
    top_of_proc_2:
    {
        BInType1 BIn1 = tscc_proc_2_input_1_BIn1;
        OutType1 BOut1;

        // body of procedure tscc_b in which
        //  tail calls to both tscc_a and tscc_b look like they do above
        //  and base cases assign to BOut1 as usual

        tscc_output_1_AOut1 = BOut1;
        goto tscc_end;
    }
    tscc_end:
    return tscc_output_1_AOut1;
}

OutType1
tscc_b(
    BInType1 tscc_proc_2_input_1_BIn1)
{
    AInType1 tscc_proc_1_input_1_AIn1;
    AInType2 tscc_proc_1_input_2_AIn2;
    OutType1 tscc_output_1_AOut1;

    goto top_of_proc_2;
    top_of_proc_1:
    {
        AInType1 AIn1 = tscc_proc_1_input_1_AIn1;
        AInType2 AIn2 = tscc_proc_1_input_2_AIn2;
        OutType1 AOut1;

        // body of procedure tscc_a in which
        //  tail calls to tscc_a look like this:
        //      tscc_proc_1_input_1_AIn1 = input arg 1 of tail call;
        //      tscc_proc_1_input_2_AIn2 = input arg 2 of tail call;
        //      goto top_of_proc_1;
        //  tail calls to tscc_b look like this:
        //      tscc_proc_2_input_1_BIn1 = input arg 1 of tail call;
        //      goto top_of_proc_2;
        //  and base cases assign to AOut1 as usual

        tscc_output_1_AOut1 = AOut1;
        goto tscc_end;
    }
    top_of_proc_2:
    {
        BInType1 BIn1 = tscc_proc_2_input_1_BIn1;
        OutType1 BOut1;

        // body of procedure tscc_b in which
        //  tail calls to both tscc_a and tscc_b look like they do above
        //  and base cases assign to BOut1 as usual

        tscc_output_1_AOut1 = BOut1;
        goto tscc_end;
    }
    tscc_end:
    return tscc_output_1_AOut1;
}
</pre>

<p>
The general principles of our parameter passing scheme are as follows.

<ul>
<li>
The MLDS variables that
we would generate for a procedure in the absence of TRO,
which we call the procedure's <em>own</em> variables,
are visible <em>only</em> in the scope
containing the wrapped body of that procedure.
(These are the scopes after the top_of_proc_N labels above.)
Since these scopes never overlap,
there is never any place in the generated MLDS code
where the own variables of more than one TSCC procedure are visible at once.
<li>
Each scope containing a wrapped procedure occurs
either after the label corresponding to the procedure
(if we are using labels and gotos)
or in the switch case corresponding to the procedure
(if we are using while loops and switches),
and consists of
<ul>
<li>
the declarations of the procedure's own MLDS variables
for both the input and output arguments, followed by
<li>
assignments that set the value of each own MLDS input argument
from the value of the corresponding tscc_proc_N_input_M_VarName variable
(the above code examples show these
merged with the definitions of the variables being set), followed by
<li>
the wrapped body of the procedure, followed by
<li>
assignments that copy the value of each own MLDS output argument
to the corresponding tscc_output_M_VarName variable, followed by
<li>
a jump to the end of the function, via either a goto or a break.
</ul>
Note that in this case,
the tail calls do not need to use any next_value_of_ variables.
The actual parameters can only be the procedure's own variables,
and the formal parameters being assigned to
are all tscc_proc_N_input_M_VarName variables,
so there cannot be any overlap between them.
<li>
For procedures whose outputs are all returned by value
(which includes both tscc_a and tscc_b above),
the container function consists of the following.
<ul>
<li>
The declaration of the function signature,
giving the type (or in general, types)
of the output argument(s) returned by value,
and the types and variable names of the input arguments.
The variables in the signature
are the tscc_proc_N_input_M_VarName variables
of the procedure that the container is for,
which this signature declares.
<li>
The body of the container function starts by declaring
all the tscc_proc_N_input_M_VarName variables
of all the <em>other</em> procedures in the TSCC,
and all the tscc_output_M_VarName variables
(each of which is shared by all the procedures in the TSCC).
<p>
The tscc_proc_N_input_M_VarName variables of the procedure
that the container is for start out initialized by the caller.
The tscc_proc_N_input_M_VarName variables
of the other procedures in the TSCC start out uninitialized,
but their contents will be read
only by code in the corresponding wrapped procedure body,
which is reachable only by tail call,
and every such tail call will first set those variables
to the appropriate values for the tail call.
<li>
Either a jump to the label at the start of the wrapped version
of the procedure that the container function is for,
or a setting of the tscc_selector variable that achieves
the same effect.
The target of the jump is the wrapped procedure
whose tscc_proc_N_input_M_VarName variables
are listed in the container function's signature,
and whose values are therefore defined for us by the caller.
<li>
The bodies of the procedures of the TSCC,
either with each being preceded by its own label,
or wrapped up in a loop and a switch,
as shown above.
<li>
A return statement for all the tscc_output_M_VarName variables
corresponding to output arguments returned by value.
(In the above example, all output arguments are returned by value.)
Execution reaches here only after a branch here
from the end of one of the wrapped procedures,
each of which, immediately before the branch,
assigns to all the tscc_output_M_VarName variables.
</ul>
</ul>
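
<p>
The scheme can be exercised end to end on a small hand-written TSCC.
The sketch below assumes two hypothetical Mercury functions
with the same shapes as tscc_a and tscc_b above:
tscc_a(AIn1, AIn2) returns AIn2 if AIn1 is zero
and otherwise tail calls tscc_b(AIn1 + AIn2),
while tscc_b(BIn1) returns BIn1 if it is below 10
and otherwise tail calls tscc_a(BIn1 / 10, BIn1 mod 10).
All types are taken to be long, and inputs are assumed nonnegative;
neither assumption comes from the compiler.

```c
/* Container for tscc_a: its own inputs are the formal parameters. */
long tscc_a(long tscc_proc_1_input_1_AIn1, long tscc_proc_1_input_2_AIn2)
{
    long tscc_proc_2_input_1_BIn1;  /* set by every tail call to tscc_b */
    long tscc_output_1_AOut1;       /* shared by both wrapped procedures */

    goto top_of_proc_1;
top_of_proc_1:
    {
        long AIn1 = tscc_proc_1_input_1_AIn1;
        long AIn2 = tscc_proc_1_input_2_AIn2;
        long AOut1;

        if (AIn1 == 0) {
            AOut1 = AIn2;
        } else {
            /* tail call tscc_b(AIn1 + AIn2) */
            tscc_proc_2_input_1_BIn1 = AIn1 + AIn2;
            goto top_of_proc_2;
        }
        tscc_output_1_AOut1 = AOut1;
        goto tscc_end;
    }
top_of_proc_2:
    {
        long BIn1 = tscc_proc_2_input_1_BIn1;
        long BOut1;

        if (BIn1 < 10) {
            BOut1 = BIn1;
        } else {
            /* tail call tscc_a(BIn1 / 10, BIn1 % 10) */
            tscc_proc_1_input_1_AIn1 = BIn1 / 10;
            tscc_proc_1_input_2_AIn2 = BIn1 % 10;
            goto top_of_proc_1;
        }
        tscc_output_1_AOut1 = BOut1;
        goto tscc_end;
    }
tscc_end:
    return tscc_output_1_AOut1;
}

/* Container for tscc_b: same wrapped bodies, different signature and entry. */
long tscc_b(long tscc_proc_2_input_1_BIn1)
{
    long tscc_proc_1_input_1_AIn1;  /* set by every tail call to tscc_a */
    long tscc_proc_1_input_2_AIn2;
    long tscc_output_1_AOut1;

    goto top_of_proc_2;
top_of_proc_1:
    {
        long AIn1 = tscc_proc_1_input_1_AIn1;
        long AIn2 = tscc_proc_1_input_2_AIn2;
        long AOut1;

        if (AIn1 == 0) {
            AOut1 = AIn2;
        } else {
            tscc_proc_2_input_1_BIn1 = AIn1 + AIn2;
            goto top_of_proc_2;
        }
        tscc_output_1_AOut1 = AOut1;
        goto tscc_end;
    }
top_of_proc_2:
    {
        long BIn1 = tscc_proc_2_input_1_BIn1;
        long BOut1;

        if (BIn1 < 10) {
            BOut1 = BIn1;
        } else {
            tscc_proc_1_input_1_AIn1 = BIn1 / 10;
            tscc_proc_1_input_2_AIn2 = BIn1 % 10;
            goto top_of_proc_1;
        }
        tscc_output_1_AOut1 = BOut1;
        goto tscc_end;
    }
tscc_end:
    return tscc_output_1_AOut1;
}
```

<p>
For example, tscc_b(1234) passes through the values 127, 19, 10 and 1
and returns 1.
Note that no tail call needs a next_value_of_ temporary:
each one reads only own variables and writes only tscc variables.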

<p>
For model semi predicates,
the succeeded (own) variable
and the tscc_output_succeeded variable corresponding to it
effectively function as an output argument returned by value.
By its position in the vector of output arguments,
it is effectively output argument 0.
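
<p>
As an illustration of the model semi case,
the following hand-written sketch assumes a hypothetical TSCC
of two semidet predicates over C strings:
expect_a succeeds on the empty string,
or consumes an 'a' and tail calls expect_b,
while expect_b consumes a 'b' and tail calls expect_a.
Only the container for expect_a is shown,
and the use of bool for the success indicator is an assumption of the sketch.

```c
#include <stdbool.h>

/* Container for expect_a; succeeds exactly on strings of the form (ab)*. */
bool expect_a(const char *tscc_proc_1_input_1_Str)
{
    const char *tscc_proc_2_input_1_Str;
    bool tscc_output_succeeded;     /* plays the role of output argument 0 */

    goto top_of_proc_1;
top_of_proc_1:
    {
        const char *Str = tscc_proc_1_input_1_Str;
        bool succeeded;

        if (*Str == '\0') {
            succeeded = true;
        } else if (*Str == 'a') {
            /* tail call expect_b(rest of Str) */
            tscc_proc_2_input_1_Str = Str + 1;
            goto top_of_proc_2;
        } else {
            succeeded = false;
        }
        tscc_output_succeeded = succeeded;
        goto tscc_end;
    }
top_of_proc_2:
    {
        const char *Str = tscc_proc_2_input_1_Str;
        bool succeeded;

        if (*Str == 'b') {
            /* tail call expect_a(rest of Str) */
            tscc_proc_1_input_1_Str = Str + 1;
            goto top_of_proc_1;
        } else {
            succeeded = false;
        }
        tscc_output_succeeded = succeeded;
        goto tscc_end;
    }
tscc_end:
    return tscc_output_succeeded;
}
```

<p>
Each wrapped body copies its own succeeded variable
to tscc_output_succeeded in its epilogue,
just as it would copy an ordinary output argument.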

<p>
The final detail is the treatment of output arguments
that are passed by reference.
Our chosen approach is designed to work even in cases where
the procedures of the TSCC,
although they return the same vector of outputs,
return different subsets of them by reference.

<p>
The basic idea is to generate the wrapped procedures
as if all the output arguments were returned by value,
exactly as shown in the example above,
and to handle the difference at the container function level.

<p>
The parts of a container function
corresponding to an output argument returned by value are the following.

<ul>
<li>
The type of the output argument is part of the vector of return types
in the function signature.
<li>
The tscc_output_M_VarName variable of the output is declared
at the start of the container function.
<li>
The epilogue of each wrapped procedure
assigns to the tscc_output_M_VarName variable of the output.
<li>
The epilogue of the container function
returns the value of this tscc_output_M_VarName variable
as part of the vector of return values.
</ul>

<p>
When an output argument is passed by reference,
we create a tscc_output_ptr_M_VarName variable for it
as well as a tscc_output_M_VarName variable.
The parts of a container function
corresponding to such an output argument are the following.

<ul>
<li>
The type of the output argument
is <em>not</em> part of the vector of return types in the function signature.
Instead, the argument vector contains tscc_output_ptr_M_VarName,
and its type is a pointer to the type of the output argument.
The position of the tscc_output_ptr_M_VarName is derived from
the position of the argument in the HLDS.
<li>
The tscc_output_M_VarName variable of the output is declared
at the start of the container function, as before.
<li>
The epilogue of each wrapped procedure
assigns to the tscc_output_M_VarName variable of the output,
as before.
<li>
The epilogue of the container function
returns the value of this tscc_output_M_VarName variable,
but <em>not</em> as part of the vector of return values.
Instead, just before the return statement,
we include the assignment *tscc_output_ptr_M_VarName = tscc_output_M_VarName.
</ul>
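
<p>
The following hand-written sketch shows two containers
that share the same wrapped bodies
but return the one output argument differently.
It assumes a hypothetical TSCC that counts Collatz steps:
procedure 1 (collatz) halves even numbers itself
and tail calls procedure 2 (collatz_odd) for odd numbers above 1,
and procedure 2 tail calls procedure 1 with 3N + 1.
The C macro WRAPPED_BODIES is only a device of this sketch
for pasting the same bodies into both containers;
the compiler duplicates MLDS code, not macro text.

```c
/* The wrapped bodies: both write tscc_output_1_Steps by value. */
#define WRAPPED_BODIES \
top_of_proc_1: \
    { \
        long N = tscc_proc_1_input_1_N; \
        long S = tscc_proc_1_input_2_S; \
        long Steps; \
        if (N <= 1) { \
            Steps = S; \
        } else if (N % 2 == 0) { \
            tscc_proc_1_input_1_N = N / 2; \
            tscc_proc_1_input_2_S = S + 1; \
            goto top_of_proc_1; \
        } else { \
            tscc_proc_2_input_1_N = N; \
            tscc_proc_2_input_2_S = S + 1; \
            goto top_of_proc_2; \
        } \
        tscc_output_1_Steps = Steps; \
        goto tscc_end; \
    } \
top_of_proc_2: \
    { \
        long N = tscc_proc_2_input_1_N; \
        long S = tscc_proc_2_input_2_S; \
        tscc_proc_1_input_1_N = 3 * N + 1; \
        tscc_proc_1_input_2_S = S; \
        goto top_of_proc_1; \
    }

/* Container for procedure 1: the output is returned by value. */
long collatz(long tscc_proc_1_input_1_N, long tscc_proc_1_input_2_S)
{
    long tscc_proc_2_input_1_N;
    long tscc_proc_2_input_2_S;
    long tscc_output_1_Steps;

    goto top_of_proc_1;
    WRAPPED_BODIES
tscc_end:
    return tscc_output_1_Steps;
}

/* Container for procedure 2: the same output is returned by reference. */
void collatz_odd(long tscc_proc_2_input_1_N, long tscc_proc_2_input_2_S,
    long *tscc_output_ptr_1_Steps)
{
    long tscc_proc_1_input_1_N;
    long tscc_proc_1_input_2_S;
    long tscc_output_1_Steps;

    goto top_of_proc_2;
    WRAPPED_BODIES
tscc_end:
    /* The only by-reference handling is this store in the container epilogue. */
    *tscc_output_ptr_1_Steps = tscc_output_1_Steps;
    return;
}
```

<p>
The wrapped bodies never mention tscc_output_ptr_1_Steps,
which is what allows them to be identical in both containers.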

</body>
</html>