Estimated hours taken: 16
options:
Replace the word_size option with the two options bits_per_word and
bytes_per_word. The former is needed by lookup_switch, the latter by
value numbering.
lookup_switch:
Use the new option instead of word_size.
vn_type, vn_cost, vn_block, value_number:
Add a new type, vn_params, containing information such as the number
of bytes per word (from the option) and cost parameters. Use these
cost parameters to make more realistic decisions.
vn_filter:
New module to filter out unnecessary uses of temporary variables,
which gcc handles unnecessarily badly.
value_number, vn_verify:
Move verification completely to vn_verify. Tighten the verification
rules relating to tags where it concerns code sequences in which
the tag of an rval is taken in a statement before an if_val, but
loosen them to avoid spurious rejections of code sequences containing
arithmetic comparisons. Fix some missing cases from semidet switches
that may have led to overly conservative decisions.
value_number, vn_order:
Vn_order was making an overly conservative assumption about where
to split an extended basic block if it couldn't be optimized together.
Move the decision to value_number and try to make it better. The new
heuristic is not enabled yet.
vn_debug:
Change the conditions under which one type of message is printed.
vn_flush:
Wrap some too long lines.
llds:
Fix a bug that would prevent profiling from working correctly on
value numbered code: we weren't scanning instructions inside blocks
when looking for return addresses.
peephole:
Enable an optimization previously left disabled by accident.
switch_detection, tag_switch:
Eliminate an unused argument.
Estimated hours taken: 10
hlds, hlds_module, hlds_pred, hlds_goal, hlds_data:
Divided the old hlds.m into four files:
hlds_module.m defines the data structures that deal with issues
that are wider than a single predicate. These data structures are
the module_info structure, dependency_info, the predicate table
and the shape table.
hlds_pred.m defines pred_info and proc_info, pred_id and proc_id.
hlds_goal.m defines hlds__goal, hlds__goal_{expr,info}, and the
other parts of goal structures.
hlds_data.m defines the HLDS types that deal with issues related
to data and its representation: function symbols, types, insts, modes.
It also defines the types related to determinism.
hlds.m is now an empty module. I have not removed it from CVS
because we may need the name hlds.m again, and CVS does not like
the reuse of a name once removed.
other modules:
Import the necessary part of hlds.
det_analysis:
Define a type that was up to now improperly defined in hlds.m.
prog_io:
Move the definition of type determinism to hlds_data. This decision
may need to be revisited when prog_io is broken up.
dnf, lambda:
Simplify the task of defining predicates.
llds:
Fix some comments.
mercury_compile:
If the option -d all is given, dump all HLDS stages.
shape, unused_args:
Fix formatting.
Estimated hours taken: 4
Changed the way configuration parameters are handled so that we
can avoid bootstrapping problems. Instead of getting configuration
parameters from `conf.m.in', they are now passed via the `mc' script.
Also renamed the `num_real_regs' option to `num_real_r_regs',
to avoid confusion with the NUM_REAL_REGS macro set in runtime/machdeps/*.h
(which has a different meaning).
compiler/conf.m.in:
Removed this module, which used to define the old
conf__low_tag_bits/1 predicate.
compiler/Mmake:
Removed references to conf.m*.
compiler/options.m:
Added conf_low_tag_bits option, to replace the old
conf__low_tag_bits/1 predicate.
Rename num_real_regs option as num_real_r_regs.
compiler/tag_switch.m:
Rename num_real_regs option as num_real_r_regs.
compiler/mercury_compile.pp:
Use the conf_low_tag_bits option rather than calling
the old conf__low_tag_bits/1 predicate.
excess:
A new pass to remove unnecessary assignment unifications.
mercury_compile:
Call the new excess assignment module.
options:
Add a new option, excess_assign, to control the new optimization.
Add another, num-real-regs, to specify how many of r1, r2 etc are
actually real registers. The default is now set to 5 for kryten;
later it should be supplied by the mc script, with a value determined
at configuration time.
tag_switch:
Use num-real-regs to figure out whether it is likely to be worthwhile
to eliminate the common subexpression of taking the primary tag of
a variable. Also fix an old performance bug: the test for when a
jump table is worthwhile was reversed.
value_number, vn_block:
Do value numbering on extended basic blocks, not basic blocks.
vn_debug:
Modify an information message.
labelopt:
Clean up and export an internal predicate for value numbering. Replace
bintree_set with set.
middle_rec:
Prepare for the generalization of middle recursion optimization
to include predicates with an if-then-else structure.
cse_detection:
Fix a bug: when hoisting a common deconstruction X = f(Yi), create
new variables for the Yi. This avoids problems with any of the Yis
appearing in other branches of the code.
goal_util:
Add a new predicate for use by cse_detection.
common:
Fix a bug: recompute instmap deltas, since they may be affected by the
optimization of common structures.
code_info:
Make an error message more explicit.
det_analysis:
Restrict import list to the needed modules.
*.m:
Import assoc_list.
code_info:
Expose the predicate for producing a variable not into an arbitrary
location, but into a register.
tag_switch:
Produce the switched-on variable into a register, since we will
need it several times (to extract the primary tag, probably to
extract a secondary tag, and then -usually- to get some of its
fields).
tag_switch:
Fixed two bugs. First, if a primary tag value did not have cases for
all its secondary tag values, we now emit a goto to the failure label
if the secondary tag does not match any case; we used to just fall
through. Second, the failure code itself used to be generated in
the context of the end of one of the cases; this should now be fixed,
although I want to go over it with Tom to make sure.
The computation of the secondary tag is now done once, instead of
being repeated at every secondary tag test.
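The fall-through bug and the single secondary tag computation can be
pictured in terms of the C that a tag switch might boil down to. What
follows is a minimal sketch, not the compiler's actual output: the word
layout, the PRIMARY_TAG and SECONDARY_TAG helpers, the function name and
the case numbering are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical layout: the low 2 bits hold the primary tag, and for
   primary tag 3 the remaining bits hold a secondary tag. */
#define PRIMARY_TAG(w)   ((unsigned)((w) & 0x3u))
#define SECONDARY_TAG(w) ((unsigned)((w) >> 2))

enum { FAILED = -1 };

/* Returns the number of the matching case, or FAILED if none matches. */
static int tag_switch(uintptr_t word)
{
    unsigned ptag = PRIMARY_TAG(word);       /* computed once */

    if (ptag == 0) return 0;
    if (ptag == 1) return 1;
    if (ptag == 3) {
        unsigned stag = SECONDARY_TAG(word); /* computed once, not per test */
        if (stag == 0) return 2;
        if (stag == 2) return 3;
        /* Secondary tag 1 has no case: jump to the failure code
           explicitly instead of falling into whatever comes next. */
        goto fail;
    }
    /* Primary tag 2 has no case at all. */
fail:
    return FAILED;
}
```

In this sketch a word with primary tag 3 and secondary tag 1 reaches the
failure code through the explicit goto, which is the behaviour the first
fix describes.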
options:
Set tag_switch_size to 4 by default, reduced from 8. It was this change
that exposed the two bugs above. After the fix, the compiler is smaller
by about 2 Kb.
switch_gen:
Add some comments.
code_util:
Fixed nonstandard indentation.
llds:
Optionally generate while (1) loops instead of short backward branches.
This is faster in the absence of fast jumps.
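As a rough picture of the two shapes, here is a sketch in plain C. It is
not actual compiler output: the function names and the summing loop are
invented for illustration, and the backward branch is written as an
ordinary goto rather than whatever jump mechanism the generated code
really uses.

```c
/* Loop written as a short backward branch to a label. */
static int sum_with_goto(int n)
{
    int acc = 0;
loop:
    if (n <= 0) return acc;
    acc += n;
    n--;
    goto loop;
}

/* The same loop expressed as while (1): the backward branch becomes
   a `continue` inside a structured loop. */
static int sum_with_while(int n)
{
    int acc = 0;
    while (1) {
        if (n <= 0) return acc;
        acc += n;
        n--;
        continue;
    }
}
```

Both functions compute the same result; only the form of the backward
jump differs.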
options:
Add a new option, --no-emit-c-loops.
middle_rec:
We now check if the LLDS code after the recursive call is empty.
If yes, we don't generate the downward loop.
code_aux:
Minor cleanup associated with previous change.
frameopt:
Instead of blindly assuming that any code before an if_val will be
able to fill the delay slot, we check whether it computes a value
that is used in the condition. We now also allow a slightly wider
range of user instructions to fill delay slots.
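The new check can be sketched as a dependency test between the
instruction before the branch and the branch condition. This is only an
illustration under assumed data structures: the regset bitmask, the
instr and cond structs and the function name are hypothetical and do
not come from frameopt itself.

```c
#include <stdbool.h>

/* Hypothetical register sets as bitmasks: bit i stands for register ri. */
typedef unsigned regset;

struct instr {
    regset defs;   /* registers the instruction writes */
};

struct cond {
    regset uses;   /* registers the branch condition reads */
};

/* The instruction may fill the branch's delay slot only if the
   condition does not read anything that the instruction computes. */
static bool can_fill_delay_slot(struct instr before, struct cond test)
{
    return (before.defs & test.uses) == 0;
}
```

An instruction that writes r3 cannot fill the slot of a branch that
tests r3, but it can fill the slot of a branch that tests only r1.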
opt_util:
Some new preds to support the new functionality in frameopt.
tag_switch:
Compute the tag of the switched-on value into a register at the
start, instead of computing it in each if_val.
instructions, and the last argument from local labels. All these were
placeholders for info put in there by prof.m and used when emitting C
code.
The set of labels that serve as return points are now calculated in llds.m
just before each procedure has its C code generated. This set is passed to
output_instruction along with the label at the start of the procedure.
*.m:
Changed the way the extra field in the label type is defined. Now
all labels are initially assumed to be 'unknown' and a separate
profiling pass (to be implemented) will determine whether the label can
be accessed externally.
tag_switch.m:
Fix a Zoltan bug! ;-)
This was the bug causing the "Software error: no failure continuation"
errors. The problem was that when Zoltan changed the field
in switches from a determinism to can_fail/cannot_fail, he accidentally
inverted the sense of a test, causing the code to generate the
"fail" case only for tag switches which can never fail.
make_hlds:
Fix one-character bug that rejected all non-det modes.
tag_switch:
Minor cleanup.
det_analysis:
Prepare for error diagnosis.
hlds_out:
Print out determinisms, not code_models.
prog_io, hlds: Added the functor "multidet" to the type determinism.
Added types and predicates to relate determinism to its
two components, can_fail and soln_count.
Removed the functor "unspecified" from the type determinism,
substituting maybe(determinism) for determinism in proc_info.
Replaced the type category with the type code_model,
and added predicates to compute it from determinism.
det_analysis: Redid the analyses to work with determinism, not category
(or code_model). This should enable programmers to write
their own erroneous (and failure) predicates.
other files: Use the new and renamed types and access predicates.
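The relation between determinism, its two components and the code model
can be sketched as a pair of projections plus a derived function. The
sketch below is in C for illustration only. The type names determinism,
can_fail, soln_count and code_model come from the entry above; the
enumerator names, the function names and the exact mapping are
assumptions about how the pieces fit together, not a transcription of
the Mercury source.

```c
typedef enum { DET, SEMIDET, MULTIDET, NONDET, ERRONEOUS, FAILURE } determinism;
typedef enum { CANNOT_FAIL, CAN_FAIL } can_fail;
typedef enum { AT_MOST_ZERO, AT_MOST_ONE, AT_MOST_MANY } soln_count;
typedef enum { MODEL_DET, MODEL_SEMI, MODEL_NON } code_model;

/* First component: may the procedure fail? */
static can_fail det_can_fail(determinism d)
{
    switch (d) {
    case SEMIDET: case NONDET: case FAILURE:
        return CAN_FAIL;
    default:
        return CANNOT_FAIL;
    }
}

/* Second component: upper bound on the number of solutions. */
static soln_count det_soln_count(determinism d)
{
    switch (d) {
    case ERRONEOUS: case FAILURE:
        return AT_MOST_ZERO;
    case DET: case SEMIDET:
        return AT_MOST_ONE;
    default:
        return AT_MOST_MANY;
    }
}

/* The code model is derived from the two components (assumed mapping). */
static code_model det_code_model(determinism d)
{
    if (det_soln_count(d) == AT_MOST_MANY) return MODEL_NON;
    return det_can_fail(d) == CAN_FAIL ? MODEL_SEMI : MODEL_DET;
}
```

Under this assumed mapping, erroneous and failure need no code model of
their own: they share the models used for det and semidet code
respectively.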
compiler/*:
Add copyright messages.
Change all occurrences of *.nl in comments to *.m.
compiler/mercury_compile.pp:
Change the output to the .dep files to use *.m rather than *.nl.
(NOTE: this means that `mmake' will not work any more if you
call your files *.nl!!!)
io.nl:
Introduced a new predicate which ignores any whitespace in the input.
Needs to have all the whitespace characters added to it.
*.nl and *.pp:
Changed the implementation of time profiling. Now during a compile,
the compiler identifies all the internal labels which can be accessed
externally, and marks them. At the moment, these are the continuation
labels of calls and the next disjunct in nondet disjunctions. Then
at the .mod output, it places a macro 'update_prof_current_proc' to
restore the profiling counter.
llds.nl:
Introduced an extra argument to the LLDS goto. It is the label
address of the Caller and is used for the profiling of tail calls.
*.nl and *.pp:
Propagated the extra argument to all the appropriate files.
llds.nl, *.nl:
Change field(int, rval, int) to field(int, rval, rval), so
that the field number can be calculated at runtime
(We need this for predicate closures).
Also, remove the incr_hp(int) instruction and replace it
with a heap_alloc(rval) rval.
(We need to determine the space allocated at runtime
for predicate closures, and also we want it to be an rval
not an instruction so we can get conservative garbage collection
to work.)
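A rough C analogue may help show why both changes matter for closures.
In the sketch below the field index is an ordinary run-time value and
the allocation is an expression whose result can be used directly.
Everything here is hypothetical: the toy bump-allocated heap, the
2-bit tag layout, CLOSURE_TAG, mkword and make_closure are invented for
illustration and are not the runtime's real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word;

#define HEAP_WORDS  1024
#define CLOSURE_TAG ((word)1)   /* hypothetical primary tag for closures */

/* Toy bump-allocated heap, standing in for the real one. */
static word heap[HEAP_WORDS];
static size_t hp = 0;

/* Allocation as an expression: the size is a run-time value and the
   result can be used directly inside a larger expression. */
static word *heap_alloc(size_t nwords)
{
    assert(hp + nwords <= HEAP_WORDS);
    word *cells = &heap[hp];
    hp += nwords;
    return cells;
}

/* Attach a tag to the low bits of an aligned cell pointer. */
static word mkword(word tag, word *cells)
{
    return (word)cells | tag;
}

/* Field access where the index is an arbitrary run-time value. */
static word *field(word tag, word tagged, size_t index)
{
    word *cells = (word *)(tagged - tag);
    return &cells[index];
}

/* A closure whose size depends on the number of captured arguments:
   slot 0 holds the count, slots 1..nargs hold the arguments. */
static word make_closure(const word *args, size_t nargs)
{
    word closure = mkword(CLOSURE_TAG, heap_alloc(nargs + 1));
    *field(CLOSURE_TAG, closure, 0) = (word)nargs;
    for (size_t i = 0; i < nargs; i++)
        *field(CLOSURE_TAG, closure, i + 1) = args[i];
    return closure;
}
```

Neither the number of words allocated nor the indices used in the loop
are known at compile time, which is the situation the entry describes
for predicate closures.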
unify_gen.nl:
More work for higher-order predicate closures.
std_util.nl:
Recode `bool__and' and `bool__or' more elegantly.
value_number, vn_util, atsort:
Big strides towards getting value_number working. Not there yet.
Vn_util will be added next checkin.
llds, code_info, middle_rec, tag_switch, unify_gen, opt_util:
Changed the handling of fields. They are now only lvals, with the
base being an rval.
det_analysis:
Factored out some common code in problem reporting.
jumpopt:
Cosmetic changes.
tag_switch.nl:
Fix bug: the original code assumed that code_info__produce_variable
would always produce an lvalue. That assumption was not correct -
if the variable's value is known, code_info__produce_variable
will return a constant, rather than a reference to some storage.
For example, it failed for the following code.
:- type t ---> f(int) ; g(int) ; h.
:- pred p(t::out) is det.
p(Y) :- X = h, (X = h ; X = f(_) ; X = g(_)), Y = X.
now in dense_switch, string_switch and tag_switch, with the original
if-then-else implementation and the code that decides on optimizations
still in switch_gen.
Added options to replace the magic numbers governing the choice of switch
method.
Added comments to frameopt, jumpopt, labelopt and peephole.