Estimated hours taken: _____
Take the code generator a big step closer to notes/ALLOCATION.
The new code generator emits code that is smaller and faster than
the code we used to emit.
Nondet liveness is no longer used; nondet live sets are always empty.
In code that was being modified anyway, remove its handling. Other
uses will be removed later (this keeps this change from being far too big;
as it is, it is merely too big). Similarly for cont-lives.
In several places, clarify the code that gathers several code pieces together.
call_gen:
Unset the failure continuation and flush the resume vars to
their stack slots before nondet calls.
Move the code that decides whether a nondet call can be a tailcall
to code_info.
code_aux:
Remove the code to handle resume points, since these are now
handled in the specific constructs that need them. Replace it
with a sanity check.
code_exprn:
Add a predicate to place multiple vars.
code_gen:
Remove the predicate code_gen__generate_forced_goal, since it
packaged together some operations that should be executed at different
times.
Don't unset the failure continuation after every nondet goal;
this is now done in the constructs that need it.
Modify the handling of negation to use resume point info
according to notes/ALLOCATION.
Remove the predicate code_gen__ensure_vars_are_saved which was
used to save all live variables to the stack before nondet
disjunctions and if-then-elses; we don't do that anymore.
code_info:
Significantly simplify and document the handling of failure
continuations, and make the types involved abstract types.
Factor out common code in the handling of det and semi commits.
Keep track of "zombies", variables that are dead wrt forward
execution but whose values we need because they may be needed
at a resume point we can reach.
Remove several now unneeded predicates, and introduce new
predicates to help other modules.
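As an illustration of the "zombies" idea above, here is a minimal set-level sketch. It is not the representation code_info uses; the function name and the example variable names are hypothetical.

```python
def zombie_vars(forward_live, resume_vars):
    """Variables that are dead on forward execution but still needed
    at a resume point we can reach (hypothetical helper)."""
    return set(resume_vars) - set(forward_live)

# Y is live both ways; Z is needed only if we backtrack to the resume point.
forward_live = {"X", "Y"}
resume_vars = {"Y", "Z"}
zombies = zombie_vars(forward_live, resume_vars)
```

In this sketch Z is a zombie: forward code no longer needs it, but its value must be kept because the resume point may still read it.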
code_util:
Add a couple of predicates to check whether a goal cannot fail before
flushing all variables to the stack, and whether a goal cannot flush
any variables to the stack. These are used in liveness to decide
which entry labels will be needed at resume points.
disj_gen:
Unify the handling of det and semi disjunctions. Model the code
handling of nondet disjunctions on the code handling pruned
disjunctions. It is possible that the handling of nondet and pruned
disjunctions can also be unified; the new code should make this
significantly easier.
Make the code conform to notes/ALLOCATION. This means saving
only the variables mentioned in the resume_point field, not
flushing all live variables to the stack at the start of a
nondet disjunction, handling zombies, and using the new method
of flushing variables at the ends of branched structures.
ite_gen:
Unify the handling of det and semi if-then-elses. Model the code
handling of nondet if-then-elses on the code handling det/semi
if-then-elses. It is possible that the handling of nondet and pruned
if-then-elses can also be unified; the new code should make this
significantly easier.
Make the code conform to notes/ALLOCATION. This means saving
only the variables mentioned in the resume_point field, not
flushing all live variables to the stack at the start of a
nondet if-then-else, handling zombies, and using the new method
of flushing variables at the ends of branched structures.
Apply the new rules about liveness in if-then-elses, which say that
the else part is parallel not to the then part but to the conjunction
of the condition and the then part.
dense_switch, lookup_switch, string_switch, switch_gen, tag_switch, middle_rec:
Use the new method of flushing variables at the ends of branched
structures. Don't call remake_with_store_map; switch_gen will do so.
Fix an old bug in lookup_switch.
The code in switch_gen which looked for the special case of a two-way
switch used to use a heuristic to decide which one was recursive and
which one was a base case. We now check the codes of the cases.
hlds_goal:
Adjust the structure of the resume_point field to make it easier
to use. Add a more convenient access predicate.
hlds_out:
Don't print the nondet liveness and cont live fields, since they are
not used anymore. Comment out the printing of the context field,
which is rarely useful. Modify the printing of the resume_point field
to conform to its new definition.
live_vars:
Use the resume_point field, not the nondetlives field, to decide
which variables may be needed on backward execution. Remove some
code copied from liveness.m.
liveness:
Put the several pieces of information we thread through the traversal
predicates into a single tuple.
Don't put variables which are local to one branch of a branched
structure into the post-birth sets of other branches.
Apply the new rules about liveness in if-then-elses, which say that
the else part is parallel not to the then part but to the conjunction
of the condition and the then part. Variables that are needed in the
else part but not in the condition or the then part now die at the
start of the condition (they will be protected by the resume point on
the condition).
We now treat pruned and non-pruned disjunctions the same way
wrt deadness; the old way was too conservative (it had to be).
We still mishandle branches which produce some variables but
can't succeed.
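The if-then-else rule described above (the else part is parallel to the conjunction of the condition and the then part) can be sketched roughly as follows. This is only a set-level illustration under a stated assumption, not the code in liveness; the helper name is hypothetical.

```python
def dies_at_cond_start(needed_in_cond, needed_in_then, needed_in_else):
    """Variables needed only by the else part (hypothetical helper).

    Assumes all three sets contain only variables that are live before
    the if-then-else and are not needed after it.
    """
    return set(needed_in_else) - (set(needed_in_cond) | set(needed_in_then))

# C is used only in the else part, so on the cond/then path it is dead
# from the start of the condition onwards.
deaths = dies_at_cond_start({"A"}, {"A", "B"}, {"B", "C"})
```

Such variables remain available to the else part only because the resume point on the condition protects them.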
mercury_compile:
Liveness now prints its own progress message with -V; support this.
store_alloc:
When figuring out what variables need to be saved across calls,
make sure that we put in interference arcs between those variables
and those that are required by enclosing resume points.
Don't compute cont-lives, since they are not used anymore.
livemap:
Fix the starting comment.
Estimated hours taken: 2.5
Switch from using a stack of store_maps in the code_info to govern what
goes where at the end of each branched structure to using the store map
fields of the goal expressions of those structures.
Fix variable names where they resembled the wrong kind of map(var, lval).
code_info:
Remove the operations on stacks of store maps.
Modify the generate_forced_saves and remake_with_store_map operations
to take a store_map parameter.
When making variables magically live, pick random unused variables
to hold them, since we can no longer use the guidance of the top
store map stack entry. This may lead to the generation of some
excess move instructions at non-reachable points in the code;
this will be fixed later.
code_gen:
Remove the store map push and pop invocations.
Modify the generate_forced_goal operation to take a store_map parameter.
code_exprn:
Export a predicate for use by code_info.
middle_rec, disj_gen, ite_gen, switch_gen,
dense_switch, lookup_switch, string_switch, tag_switch:
Pass the store map around to get it to invocations of the primitives
in code_gen and code_info that now need it.
goal_util:
Name apart the new follow_vars field in hlds__goal_infos.
(This should have been in the change that introduced that field.)
common, constraint, cse_detection, det_analysis, dnf, excess, follow_code,
intermod, lambda, lco, liveness, make_hlds, mode_util, modes, polymorphism,
quantification, simplify, switch_detection, typecheck, unique_modes,
unused_args:
Fix variable names.
follow_vars, store_alloc:
Add comments.
Estimated hours taken: 15
hlds_data:
Rename address_const to code_addr_const, and add base_type_info_const
as a new alternative in cons_id, and make corresponding changes
to cons_tag.
Make hlds_type__defn an abstract type.
llds:
Rename address_const to code_addr_const, and add data_addr_const
as a new alternative in rval_const.
Change type "label" to have four alternatives, not three:
local/2 (for internal labels), c_local (local to a C module),
local/1 (local to a Mercury module but not necessarily to a C module),
and exported.
llds_out:
Keep track of the things declared previously, and don't declare them
again unnecessarily. Associate indentation with the following item
rather than the previous item (the influence of 244); this results
in braces being put in different places than previously, but should be
easier to maintain. Handle the new forms of addresses and labels.
Refer to c_local labels as STATIC when not using --split-c-files.
code_info:
Use a presently junk field to store a cell counter, which is used
to allocate distinguishing numbers to create'd cells. Previously
we used the label counter, which meant that label numbers changed
when we optimized away some creates. Handle the new forms of
addresses and labels.
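To see why a separate cell counter keeps label numbers stable, consider the following small model. It is a sketch only: the event names and the function are hypothetical, and the real counters live in code_info.

```python
def number_labels(events, separate_cell_counter):
    """Return the numbers handed out to labels.

    events is a sequence of "label" and "create" strings (a hypothetical
    stand-in for the order in which code generation requests numbers).
    """
    label_counter = 0
    cell_counter = 0
    label_numbers = []
    for event in events:
        if event == "label":
            label_counter += 1
            label_numbers.append(label_counter)
        elif separate_cell_counter:
            cell_counter += 1      # new scheme: creates have their own counter
        else:
            label_counter += 1     # old scheme: creates consume label numbers
    return label_numbers

with_create = ["label", "create", "label"]
without_create = ["label", "label"]   # the create was optimized away
```

With the shared counter the second label is numbered 3 or 2 depending on whether the create survives; with a separate cell counter it is numbered 2 in both cases.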
exprn_aux:
Handle the new forms of addresses and labels. We are now more
precise in figuring out what label address forms will be considered
constants by the C compilers.
others:
Changes to handle the new forms of addresses and labels, and/or to
access hlds_type__defn as an abstract type.
Estimated hours taken: 0.1
Spotted a bizarre piece of C code - a COMPUTED_GOTO with a single
label. Bug was a `>=' instead of a `<' when deciding whether or not
to use a computed goto in tag_switch.
compiler/tag_switch.m:
Fix a tiny bug which meant we were not using computed gotos
in some cases where we should.
Estimated hours taken: 3
options:
Add a new option, --branch-delay-slot, intended for use by mc on
the basis of the configuration script. It says whether the machine
architecture has delay slots on branches.
The setting of this option should affect whether we set
--optimize-delay-slots at -O2, but this doesn't work yet.
hlds_goal:
Add an extra field to hold follow_vars information to disjunctions,
switches and if-then-elses. I intend to use this information to
generate better code.
*.m:
Changes to accommodate the extra field.
Estimated hours taken: 16
options:
Replace the word_size option with the two options bits_per_word and
bytes_per_word. The former is needed by lookup_switch, the latter by
value numbering.
lookup_switch:
Use the new option instead of word_size.
vn_type, vn_cost, vn_block, value_number:
Add a new type, vn_params, containing information such as the number
of bytes per word (from the option) and cost parameters. Use these
cost parameters to make more realistic decisions.
vn_filter:
New module to filter out unnecessary uses of temporary variables,
which gcc does unnecessarily badly on.
value_number, vn_verify:
Move verification completely to vn_verify. Tighten the verification
rules relating to tags where it concerns code sequences in which
the tag of an rval is taken in a statement before an if_val, but
loosen them to avoid spurious rejections of code sequences containing
arithmetic comparisons. Fix some missing cases from semidet switches
that may have led to overly conservative decisions.
value_number, vn_order:
Vn_order was making an overly conservative assumption about where
to split an extended basic block if it couldn't be optimized together.
Move the decision to value_number and try to make it better. The new
heuristic is not enabled yet.
vn_debug:
Change the conditions under which one type of message is printed.
vn_flush:
Wrap some too long lines.
llds:
Fix a bug that would prevent profiling from working correctly on
value numbered code: we weren't scanning instructions inside blocks
when looking for return addresses.
peephole:
Enable an optimization previously left disabled by accident.
switch_detection, tag_switch:
Eliminate an unused argument.
Estimated hours taken: 10
hlds, hlds_module, hlds_pred, hlds_goal, hlds_data:
Divided the old hlds.m into four files:
hlds_module.m defines the data structures that deal with issues
that are wider than a single predicate. These data structures are
the module_info structure, dependency_info, the predicate table
and the shape table.
hlds_pred.m defines pred_info and proc_info, pred_id and proc_id.
hlds_goal.m defines hlds__goal, hlds__goal_{expr,info}, and the
other parts of goal structures.
hlds_data.m defines the HLDS types that deal with issues related
to data and its representation: function symbols, types, insts, modes.
It also defines the types related to determinism.
hlds.m is now an empty module. I have not removed it from CVS
because we may need the name hlds.m again, and CVS does not like
the reuse of a name once removed.
other modules:
Import the necessary part of hlds.
det_analysis:
Define a type that was up to now improperly defined in hlds.m.
prog_io:
Move the definition of type determinism to hlds_data. This decision
may need to be revisited when prog_io is broken up.
dnf, lambda:
Simplify the task of defining predicates.
llds:
Fix some comments.
mercury_compile:
If the option -d all is given, dump all HLDS stages.
shape, unused_args:
Fix formatting.
Estimated hours taken: 4
Changed the way configuration parameters are handled so that we
can avoid bootstrapping problems. Instead of getting configuration
parameters from `conf.m.in', they are now passed via the `mc' script.
Also renamed the `num_real_regs' option to `num_real_r_regs',
to avoid confusion with the NUM_REAL_REGS macro set in runtime/machdeps/*.h
(which has a different meaning).
compiler/conf.m.in:
Removed this module, which used to define the old
conf__low_tag_bits/1 predicate.
compiler/Mmake:
Removed references to conf.m*.
compiler/options.m:
Added conf_low_tag_bits option, to replace the old
conf__low_tag_bits/1 predicate.
Rename num_real_regs option as num_real_r_regs.
compiler/tag_switch.m:
Rename num_real_regs option as num_real_r_regs.
compiler/mercury_compile.pp:
Use the conf_low_tag_bits option rather than calling
the old conf__low_tag_bits/1 predicate.
excess:
A new pass to remove unnecessary assignment unifications.
mercury_compile:
Call the new excess assignment module.
options:
Add a new option, excess_assign, to control the new optimization.
Add another, num-real-regs, to specify how many of r1, r2 etc are
actually real registers. The default is now set to 5 for kryten;
later it should be supplied by the mc script, with a value determined
at configuration time.
tag_switch:
Use num-real-regs to figure out whether it is likely to be worthwhile
to eliminate the common subexpression of taking the primary tag of
a variable. Also fix an old performance bug: the test for when a
jump table is worthwhile was reversed.
value_number, vn_block:
Do value numbering on extended basic blocks, not basic blocks.
vn_debug:
Modify an information message.
labelopt:
Clean up and export an internal predicate for value numbering. Replace
bintree_set with set.
middle_rec:
Prepare for the generalization of middle recursion optimization
to include predicates with an if-then-else structure.
cse_detection:
Fix a bug: when hoisting a common deconstruction X = f(Yi), create
new variables for the Yi. This avoids problems with any of the Yis
appearing in other branches of the code.
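A rough sketch of the renaming described above follows. It assumes that each branch is reconnected to the hoisted deconstruction by assigning its old argument variables from the new ones; that detail, the function name, and the `V_n` naming scheme are assumptions made for illustration.

```python
import itertools

def hoist_deconstruction(var, functor, branch_args, fresh):
    """Hoist var = functor(...) out of several branches.

    branch_args holds one list of argument variable names per branch.
    Returns the hoisted unification (with fresh argument variables) and,
    for each branch, the (old, new) pairs that tie it back in.
    """
    arity = len(branch_args[0])
    new_args = [next(fresh) for _ in range(arity)]
    hoisted = (var, functor, new_args)
    branch_assignments = [list(zip(old, new_args)) for old in branch_args]
    return hoisted, branch_assignments

fresh = (f"V_{n}" for n in itertools.count(1))
hoisted, assigns = hoist_deconstruction(
    "X", "f", [["Y1", "Y2"], ["Z1", "Z2"]], fresh)
```

Because the hoisted unification binds only fresh variables, it cannot clash with a Yi that also happens to appear in some other branch.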
goal_util:
Add a new predicate for use by cse_detection.
common:
Fix a bug: recompute instmap deltas, since they may be affected by the
optimization of common structures.
code_info:
Make an error message more explicit.
det_analysis:
Restrict import list to the needed modules.
*.m:
Import assoc_list.
code_info:
Expose the predicate for producing a variable not into an arbitrary
location, but into a register.
tag_switch:
Produce the switched-on variable into a register, since we will
need it several times (to extract the primary tag, probably to
extract a secondary tag, and then -usually- to get some of its
fields).
tag_switch:
Fixed two bugs. First, if a primary tag value did not have cases for
all its secondary tag values, we now emit a goto to the failure label
if the secondary tag does not match any case; we used to just fall
through. Second, the failure code itself used to be generated in
the context of the end of one of the cases; this should now be fixed,
although I want to go over it with Tom to make sure.
The computation of the secondary tag is now done once, instead of
being repeated at every secondary tag test.
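The shape of the fixed secondary tag code can be sketched as below. The instruction strings are purely illustrative pseudo-instructions, not LLDS, and the function and label names are hypothetical.

```python
def gen_secondary_tests(cases, fail_label):
    """Build an illustrative instruction list for one primary tag value.

    cases maps secondary tag values to case labels. The secondary tag is
    computed once, and an explicit jump to the failure label follows the
    tests instead of falling through.
    """
    code = ["stag = secondary_tag(var)"]
    for tag, label in sorted(cases.items()):
        code.append(f"if stag == {tag} goto {label}")
    code.append(f"goto {fail_label}")
    return code

code = gen_secondary_tests({0: "case_a", 2: "case_c"}, "fail")
```

Here secondary tag value 1 has no case, so control reaches the final jump to the failure label rather than running into whatever code happens to follow.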
options:
Set tag_switch_size to 4 by default, reduced from 8. It was this change
that exposed the two bugs above. After the fix, the compiler is smaller
by about 2 Kb.
switch_gen:
Add some comments.
code_util:
Fixed nonstandard indentation.
llds:
Optionally generate while (1) loops instead of short backward branches.
This is faster in the absence of fast jumps.
options:
Add a new option, --no-emit-c-loops.
middle_rec:
We now check if the LLDS code after the recursive call is empty.
If yes, we don't generate the downward loop.
code_aux:
Minor cleanup associated with previous change.
frameopt:
Instead of blindly assuming that any code before an if_val will be
able to fill the delay slot, we check whether it computes a value
that is used in the condition. We now also allow a slightly wider
range of user instructions to fill delay slots.
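The dependency check described above amounts to something like the following. This is a sketch under the assumption that we know which locations an instruction defines and which the condition reads; the function name and register names are hypothetical.

```python
def can_fill_delay_slot(prev_instr_defs, cond_uses):
    """The instruction before an if_val may move into the delay slot
    only if it defines nothing that the branch condition reads."""
    return not (set(prev_instr_defs) & set(cond_uses))

blocked = can_fill_delay_slot({"r1"}, {"r1", "r2"})   # r1 feeds the condition
allowed = can_fill_delay_slot({"r3"}, {"r1", "r2"})   # independent of it
```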
opt_util:
Some new preds to support the new functionality in frameopt.
tag_switch:
Compute the tag of the switched-on value into a register at the
start, instead of computing it in each if_val.
instructions, and the last argument from local labels. All these were
placeholders for info put in there by prof.m and used when emitting C
code.
The set of labels that serve as return points are now calculated in llds.m
just before each procedure has its C code generated. This set is passed to
output_instruction along with the label at the start of the procedure.
*.m:
Changed the way the extra field in the label type is defined. Now
all labels are initially assumed to be 'unknown' and a separate
profiling pass (to be implemented) will determine whether the label can
be accessed externally.
tag_switch.m:
Fix a Zoltan bug! ;-)
This was the bug causing the "Software error: no failure continuation"
errors. The problem was that when Zoltan changed the field
in switches from a determinism to can_fail/cannot_fail, he accidentally
inverted the sense of a test, causing the code to generate the
"fail" case only for tag switches which can never fail.
make_hlds:
Fix one-character bug that rejected all non-det modes.
tag_switch:
Minor cleanup.
det_analysis:
Prepare for error diagnosis.
hlds_out:
Print out determinisms, not code_models.
prog_io, hlds: Added the functor "multidet" to the type determinism.
Added types and predicates to relate determinism to its
two components, can_fail and soln_count.
Removed the functor "unspecified" from the type determinism,
substituting maybe(determinism) for determinism in proc_info.
Replaced the type category with the type code_model,
and added predicates to compute it from determinism.
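The relationship between determinism, its two components, and code_model can be sketched as a small table. The constructor names for can_fail, soln_count and code_model used here are assumptions for illustration; only the four determinisms named in this log are covered.

```python
# determinism -> (can_fail, soln_count); component names are assumed.
COMPONENTS = {
    "det":      ("cannot_fail", "at_most_one"),
    "semidet":  ("can_fail",    "at_most_one"),
    "multidet": ("cannot_fail", "at_most_many"),
    "nondet":   ("can_fail",    "at_most_many"),
}

def code_model(determinism):
    """Compute a code model from a determinism (hypothetical names)."""
    can_fail, soln_count = COMPONENTS[determinism]
    if soln_count == "at_most_many":
        return "model_non"
    return "model_semi" if can_fail == "can_fail" else "model_det"

models = {d: code_model(d) for d in COMPONENTS}
```

In this sketch multidet and nondet share a code model, since what matters for code generation is whether more than one solution is possible.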
det_analysis: Redid the analyses to work with determinism, not category
(or code_model). This should enable programmers to write
their own erroneous (and failure) predicates.
other files: Use the new and renamed types and access predicates.
compiler/*:
Add copyright messages.
Change all occurrences of *.nl in comments to *.m.
compiler/mercury_compile.pp:
Change the output to the .dep files to use *.m rather than *.nl.
(NOTE: this means that `mmake' will not work any more if you
call your files *.nl!!!)
io.nl:
Introduced a new predicate which ignores any whitespace in the input.
Needs to have all the whitespace characters added to it.
*.nl and *.pp:
Changed the implementation of time profiling. Now during a compile,
the compiler identifies all the internal labels which can be accessed
externally, and marks them. At the moment, these are the continuation
labels of calls and the next disjunct in nondet disjunctions. Then
at the .mod output, it places a macro 'update_prof_current_proc' to
restore the profiling counter.
llds.nl:
Introduced an extra argument to the LLDS goto. It is the label
address of the Caller and is used for the profiling of tailcalls.
*.nl and *.pp:
Propagated the extra argument to all the appropriate files.
llds.nl, *.nl:
Change field(int, rval, int) to field(int, rval, rval), so
that the field number can be calculated at runtime
(We need this for predicate closures).
Also, remove the incr_hp(int) instruction and replace it
with a heap_alloc(rval) rval.
(We need to determine the space allocated at runtime
for predicate closures, and also we want it to be an rval
not an instruction so we can get conservative garbage collection
to work.)
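The point of making the field number an rval can be illustrated with a toy evaluator. This is a simplified model, not the LLDS: the tuple encoding of rvals, the dictionary heap, and the omission of tags are all assumptions made for the sketch.

```python
def eval_rval(rval, env):
    """Evaluate a toy rval: ("const", n), ("var", name) or ("add", a, b)."""
    kind = rval[0]
    if kind == "const":
        return rval[1]
    if kind == "var":
        return env[rval[1]]
    if kind == "add":
        return eval_rval(rval[1], env) + eval_rval(rval[2], env)
    raise ValueError(f"unknown rval: {rval!r}")

def read_field(heap, base, field_num, env):
    """Read a field whose number is itself an rval, so it can depend on
    run-time values (tags are ignored in this simplified model)."""
    return heap[eval_rval(base, env)][eval_rval(field_num, env)]

heap = {100: ["a", "b", "c", "d"]}
env = {"Arity": 2}
value = read_field(heap, ("const", 100),
                   ("add", ("var", "Arity"), ("const", 1)), env)
```

With an integer-only field number, the offset `Arity + 1` in this example could not be expressed, which is the situation predicate closures run into.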
unify_gen.nl:
More work for higher-order predicate closures.
std_util.nl:
Recode `bool__and' and `bool__or' more elegantly.
value_number, vn_util, atsort:
big strides towards getting value_number working. Not there yet.
Vn_util will be added next checkin.
llds, code_info, middle_rec, tag_switch, unify_gen, opt_util:
Changed the handling of fields. They are now only lvals, with the
base being an rval.
det_analysis:
Factored out some common code in problem reporting.
jumpopt:
Cosmetic changes.
tag_switch.nl:
Fix bug: the original code assumed that code_info__produce_variable
would always produce an lvalue. That assumption was not correct -
if the variable's value is known, code_info__produce_variable
will return a constant, rather than a reference to some storage.
For example, it failed for the following code.
:- type t ---> f(int) ; g(int) ; h.
:- pred p(t::out) is det.
p(Y) :- X = h, (X = h ; X = f(_) ; X = g(_)), Y = X.
now in dense_switch, string_switch and tag_switch, with the original
if-then-else implementation and the code that decides on optimizations
still in switch_gen.
Added options to replace the magic numbers governing the choice of switch
method.
Added comments to frameopt, jumpopt, labelopt and peephole.