excess:
A new pass to remove unnecessary assignment unifications.
mercury_compile:
Call the new excess assignment module.
options:
Add a new option, excess_assign, to control the new optimization.
Add another, num-real-regs, to specify how many of r1, r2 etc are
actually real registers. The default is now set to 5 for kryten;
later it should be supplied by the mc script, with a value determined
at configuration time.
tag_switch:
Use num-real-regs to figure out whether it is likely to be worthwhile
to eliminate the common subexpression of taking the primary tag of
a variable. Also fix an old performance bug: the test for when a
jump table is worthwhile was reversed.
value_number, vn_block:
Do value numbering on extended basic blocks, not basic blocks.
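As a rough illustration of what value numbering does, here is a minimal Python sketch over one straight-line block. It is not the compiler's code: the tuple representation of instructions and the name `value_number` are assumptions made for the example. In an extended basic block (one entry, several exits) the same tables could stay in effect across side exits, which this sketch does not model.

```python
def value_number(block):
    """block: list of (dest, op, arg1, arg2) assignments in straight-line code.

    Returns the block with recomputations of an already available value
    replaced by (dest, 'copy', holder, None).
    """
    var_vn = {}    # variable name -> value number it currently holds
    expr_vn = {}   # (op, vn1, vn2) -> value number of that expression
    vn_home = {}   # value number -> a variable that was assigned that value
    counter = [0]  # next unused value number

    def fresh():
        counter[0] += 1
        return counter[0] - 1

    def vn_of(name):
        # A variable seen for the first time gets its own value number.
        if name not in var_vn:
            var_vn[name] = fresh()
            vn_home[var_vn[name]] = name
        return var_vn[name]

    out = []
    for dest, op, a, b in block:
        key = (op, vn_of(a), vn_of(b))
        vn = expr_vn.get(key)
        home = vn_home.get(vn)
        if vn is not None and var_vn.get(home) == vn:
            # The value is still held by `home`, so copy instead of recomputing.
            out.append((dest, "copy", home, None))
        else:
            if vn is None:
                vn = fresh()
                expr_vn[key] = vn
            vn_home[vn] = dest
            out.append((dest, op, a, b))
        var_vn[dest] = vn
    return out
```

In this sketch a second computation of `a + b` becomes a copy, but once `a` is overwritten the expression is treated as new again.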
vn_debug:
Modify an information message.
labelopt:
Clean up and export an internal predicate for value numbering. Replace
bintree_set with set.
middle_rec:
Prepare for the generalization of middle recursion optimization
to include predicates with an if-then-else structure.
cse_detection:
Fix a bug: when hoisting a common deconstruction X = f(Yi), create
new variables for the Yi. This avoids problems with any of the Yis
appearing in other branches of the code.
goal_util:
Add a new predicate for use by cse_detection.
common:
Fix a bug: recompute instmap deltas, since they may be affected by the
optimization of common structures.
code_info:
Make an error message more explicit.
det_analysis:
Restrict import list to the needed modules.
*.m:
Import assoc_list.
The changes made allow declarations of the form:
:- pragma(c_code, predname(Varname1::mode1, Varname2::mode2, ...),
"Some C code to execute instead of a mercury clause;").
There are still a couple of minor problems to be fixed in the near future:
If there is a regular clause given as well as a pragma(c_code, ...) dec, it
is not handled well, and variables named '_' are not handled well.
prog_io.m:
parse the pragma(c_code, ...) dec.
hlds.m:
define a new hlds__goal_expr 'pragma_c_code'.
make_hlds.m:
insert the pragma(c_code, ...) dec. as a pragma_c_code into the hlds.
det_analysis.m:
infer that pragma_c_code goals are det.
modes.m:
convince the mode checker that the correct pragma variables are bound
etc.
quantification.m:
quantify the variables in the pragma(c_code, ...) dec.
code_gen.pp:
convert pragma_c_code into pragma_c (in the llds).
llds.m:
define a new instr, pragma_c. Output the pragma_c
hlds_out.m:
mercury_to_mercury.m:
mercury_to_goedel.m:
spit out pragma(c_code, ...) decs properly
*.m: handle the new pragma_c_code in the hlds or the new pragma_c in the llds
jumpopt:
Added last call optimization for nondet predicates.
llds:
Added a new lval type to represent the succip slot of nondet
stack frames.
other files:
Changes required by the change to llds (there is a minor unrelated
change in vn_cost as well).
Tyson: please check my changes to code_info__get_shape_num and
garbage_out__write_liveval.
instructions, and the last argument from local labels. All these were
placeholders for info put in there by prof.m and used when emitting C
code.
The set of labels that serve as return points are now calculated in llds.m
just before each procedure has its C code generated. This set is passed to
output_instruction along with the label at the start of the procedure.
This set of changes includes most of the work necessary for
mode and determinism checking of higher-order predicates.
prog_io.m:
Change the syntax for lambda expressions: they need
to have a determinism declaration. Lambda
expressions must now look like this:
lambda([X::in, Y::out] is det, ...goal...).
^^^^^^
Note that both the modes and the determinism are mandatory,
not optional.
hlds.m:
Insert a determinism field in the lambda_goal structure.
hlds_out.m, inlining.m, make_hlds.m, modes.m, polymorphism.m, quantification.m,
switch_detection.m, typecheck.m:
Modified to use lambda_goal/4 rather than lambda_goal/3.
prog_io.m:
Add a new field to the `ground' inst, of type `maybe(pred_inst_info)'.
We use this to store the modes and determinism of higher-order
predicate terms.
code_info.m, inst_match.m, mercury_to_mercury.m, mode_util.m, modes.m,
polymorphism.m, shapes.m, undef_modes.m:
Modified to handle higher-order pred modes:
use ground/2 rather than ground/1.
(Note that modes.m still requires a bit more work on this.)
llds.m:
Add a new field to the call_closure/3 instruction to hold the
caller address for use with profiling, since the C macros
require a caller address.
dup_elim.m, frame_opt.m, garbage_out.m, live_map.m, middle_rec.m, opt_debug.m,
opt_util.m, value_number.m, vn_*.m:
Modified to use call_closure/4 rather than call_closure/3.
mercury_to_mercury.m:
Export mercury_output_det for use by hlds_out.m.
frameopt:
Make the teardown map bidirectional, and export it.
peephole:
Add a new pattern to handle cases generated by fulljump optimization.
This pattern uses the teardownmap, but it is disabled for the moment.
optimize:
Pass the teardown map where it is needed, and make sure we do a
peephole pass immediately after frameopt to use the teardownmap
while it is still valid.
jumpopt:
Rename a variable.
labelopt:
A block being eliminated may have the last remaining reference
to the label starting another block. Therefore on the last invocation
of labelopt, we iterate to a fixpoint before returning.
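The need for a fixpoint can be seen in a small Python sketch. This is an illustration only, not the labelopt source: the representation of the code as a dict from label to jump targets, and the name `eliminate_dead_blocks`, are assumptions.

```python
def eliminate_dead_blocks(blocks, entry_labels):
    """Repeatedly drop blocks whose label is no longer referenced.

    blocks: dict mapping a label to the list of labels its block jumps to.
    entry_labels: labels that must always be kept (e.g. procedure entries).
    """
    live = dict(blocks)
    while True:
        referenced = set(entry_labels)
        for targets in live.values():
            referenced.update(targets)
        dead = [label for label in live if label not in referenced]
        if not dead:
            return live  # fixpoint: another pass would remove nothing
        for label in dead:
            del live[label]
```

With `{"entry": ["a"], "a": [], "b": ["c"], "c": []}` and entry label `entry`, the first pass removes only `b`; `c` becomes removable on the second pass, because `b` held the last reference to it.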
opt_debug:
Add a predicate to help debug bidirectional teardown maps.
opt_util:
Liberalized some optimizations, but the changes are disabled for the
moment.
value_number:
Pass an empty teardown map to peephole. Loosen the sanity check on
tags a bit.
vn_order:
If the new value of a location depends on its old value,
avoid creating a circularity in the preferred order relation.
Such circularity may be broken arbitrarily, even though we have
a clear preference.
vn_flush:
Modify the criteria for saving the old value stored in a location
about to be overwritten, in an effort to eliminate useless copies
of old values.
vn_util:
Tighten the requirement for classifying a use as a "real" use,
also in the effort to eliminate useless saves of old values.
Recognize some more expression patterns as yielding known
results. Move some functionality from vn_flush to vn_util,
since it is needed by the other modification.
mercury_compile:
Sort the list of interface files before printing them to a .d file.
opt_util, peephole:
Fix a bug tickled by value numbering. Some sequences of code were
recognized as having no access to nondet stack control slots even
in the presence of such accesses, which led to the incorrect
introduction of succeed_discards.
value_number:
Loosen the value correspondence sanity check, which was failing
needlessly, and tighten the tag sanity check, which was passing
incorrect code.
Do not try value numbering on blocks containing structures such as
"if (tag(x) == X && field(X, x, X) == X) goto X", since these will
definitely lead to tag sanity check violations.
vn_flush:
If a shared node has no uses left when flushed, leave it be.
When generating a mkframe, reflect its update of the top redoip slot
in the data structures.
vn_order:
Some hacks to get the relmaps partway to where I want them. This
code needs cleaning up.
vn_debug:
New debugging routines to support my changes to vn_order.
vn_type:
Deleted the vn_modframe vn_instr, since its role has been taken over
by assignments to redoip(maxfr).
opt_debug:
Reflect the change to vn_type, print address constants in vn_rvals,
and fix a typo.
vn_block, vn_util:
Reflect the change to vn_type.
code_info.m:
Bug fix: change generate_pre_commit and generate_commit so that
the values which need to be saved and restored are always pushed
onto the det stack, even in nondet predicates. The reason is
that if the committed goal fails, curfr is not valid, so we
can't restore the fields from the nondet stack.
(This way may well be more efficient anyway.)
disj_gen.m, ite_gen.m:
Handle the case when the current failure continuation is unknown
on entry to the disjunction or nondet if-then-else by creating
a new frame on the nondet stack. (Originally we just aborted
in this case; recently we "fixed" this, but it turned out that
the fix was not correct, for the same reason as the above-mentioned
bug in pre_commit/commit.)
llds.m:
Add succfr/1 and prevfr/1 to the rval type in llds.m,
since they were needed by the above bug fixes.
(This caused dozens of changes elsewhere to handle the
new types.)
Also fix a trivial bug that I recently introduced which
prevented --mod-comments from working.
live_vars.m:
Fix bug in allocation of stack slots for nondet code.
(This is the one that caused the bug that ksiew and I found
when writing a calculator program.)
peephole.m:
Disable the succeed_discard() optimization, since it
causes incorrect code to be generated. It was replacing
modframe(do_fail) ... succeed() with
modframe(do_fail) ... succeed_discard() even when there were
instructions such as mkframe() in between.
modes.m, hlds.m:
When modechecking switches, record the binding of the switch variable
as we enter each case, so that we get the determinism analysis
right.
mercury_compile.pp:
Make sure that we set the exit status to be non-zero if we
find any errors.
typecheck.m, modes.m, undef_types.m, undef_modes.m:
Don't invoke type-checking if there are undefined types.
Don't invoke mode-checking if there are undefined modes.
This avoids the problem of the compiler aborting with an
internal error if there are undefined types/modes.
arg_info.m call_gen.m hlds.m hlds_out.m llds.m opt_debug.m vn_type.m:
Implement solutions/2. (We still haven't implemented mode/determinism
checking for higher-order preds, though, so the compiler doesn't
diagnose errors correctly - if you use the wrong mode you will probably
just get a core dump.)
This required moving the definition of code_model from hlds.m to llds.m.
mercury_to_goedel.m:
Avoid determinism warning.
frameopt:
fix the problem with destroying stack frames and creating
them again later, accessing detstackvars that were earlier
nominally destroyed.
vn_livemap:
renamed it to livemap since frameopt now uses it also.
value_number, vn_*:
Fixed some bugs. Reorganized the handling of blocks: they are now
put in at the last minute before llds writes out the code.
Made a start towards exploiting info about cheaper copies of
values.
optimize, options:
Made value_numbering an iterated optimization. Added a new
option to control how many times it is iterated together
with the other passes: jumpopt, peephole and labelopt.
llds, call_gen, code_gen, code_info, middle_rec, opt_debug:
changed type of the argument of livevals to plain set.
Warning: in more than a week I haven't been able to fully test this change,
due to kryten's flakiness and bugs upstream of the optimizer.
compiler/*:
Add copyright messages.
Change all occurrences of *.nl in comments to *.m.
compiler/mercury_compile.pp:
Change the output to the .dep files to use *.m rather than *.nl.
(NOTE: this means that `mmake' will not work any more if you
call your files *.nl!!!)
code_util.nl, float.nl, llds.nl, mercury_builtin.nl, opt_debug.nl,
parser.nl, polymorphism.nl, sp_lib.nl, string.nl, string.nu.nl,
type_util.nl, typecheck.nl, unify_gen.nl:
Implement floating point.
Makefile.common:
Remove `-include test.dep' line. Use Mmake.
int.nl:
Update a few of the comments.
io.nu.nl:
For Sicstus Prolog, if main/2 is not defined then enter the
debugger.
io.nl:
Introduced a new predicate which ignores any whitespace in the input.
Needs to have all the whitespace characters added to it.
*.nl and *.pp:
Changed the implementation of time profiling. Now during a compile,
the compiler identifies all the internal labels which can be accessed
externally, and marks them. At the moment, these are the continuation
labels of calls and the next disjunct in nondet disjunctions. Then
at the .mod output, it places a macro 'update_prof_current_proc' to
restore the profiling counter.
opt_debug, jumpopt, vn_order:
Removed the predicates that need NU-Prolog, both the definitions
and the (already commented out) calls to them.
value_number:
Updated the comments.
vn_block:
When creating parallels, ordered the rval list corresponding to an
lval by the cost of the rvals.
unify_gen:
Whenever we do a test of a variable against a non-constant functor,
we now try to turn it into a negated test on a constant functor.
This is possible if these two functors are the only ones.
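The decision can be sketched in Python as below. This is a hypothetical illustration, not the unify_gen code: the dict of functor arities and the name `plan_functor_test` are assumptions.

```python
def plan_functor_test(functors, wanted):
    """Choose how to test a variable against the functor `wanted`.

    functors: dict mapping each functor name of the type to its arity.
    Returns ('eq', name) for a direct test, or ('ne', name) for a negated
    test against the type's only other functor when that one is a constant.
    """
    if functors[wanted] > 0 and len(functors) == 2:
        (other,) = [name for name in functors if name != wanted]
        if functors[other] == 0:
            return ("ne", other)
    return ("eq", wanted)
```

For a list-like type with functors `[]` (arity 0) and `[|]` (arity 2), a test for `[|]` becomes "not equal to `[]`"; with three or more functors the direct test is kept.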
code_aux:
Added an extra predicate to look up type definitions to make the
previous change easier.
llds, code_gen, opt_util, opt_debug, frameopt, jumpopt, peephole:
Added a boolean argument to do_succeed to say whether the nondet
frame should be discarded on success or not. The default is no,
but peephole has an optimization that tries to turn on this flag.
optimize, value_number, vn*:
Restructured the top level of value numbering as part of an effort
to identify blocks that could be optimized further given our knowledge
that the contents of e.g. stackvars is also in registers when we
jump to those blocks. Redid the interface between value_number and
frameopt to allow value_number to be iterated, which is necessary
to take advantage of the previously mentioned capability. Threaded
the I/O state through the relevant predicates; value numbering doesn't
use non-logical I/O any more.
llds.nl:
Introduced an extra argument to the LLDS goto. It is the label
address of the Caller and is used for the profiling of tailcalls.
*.nl and *.pp:
Propagated the extra argument to all the appropriate files.
llds, code_info, opt_*, vn*:
Replaced curredoip with redoip(rval) to make references to other
redoips more efficient. Also, by turning modframe(L) into
redoip(curfr) = const(address_const(L)), value_number can now
optimize hijacking code better.
vn*:
If a disagreement on the desired value of an lvalue prevents value
numbering, try again after skipping to the first control point, since
this may cure the problem.
peephole, opt_util:
Now looking for successive modframes to optimize out.
disj_gen:
Put deterministic alternatives before others, mainly to make
the back mode of append easier to explain in the paper. :-(
mode_util:
Fixed scope error.
garbage_out:
Fixed some spelling and formatting errors.
code_info.nl hlds.nl hlds_out.nl io.nl llds.int llds.nl opt_debug.nl
polymorphism.nl shapes.nl switch_gen.nl unify_gen.nl:
The fields in a `type_info' structure should be just
procedure addresses, not closures.
Makefile.common:
Add new targets `mercury_compile.sicstus' (the Mercury compiler
compiled with Sicstus) and `mercury_compile.sicstus.debug'
(debugging version of the above).
*.nl:
Use Sicstus-compatible char and string escapes.
Avoid the use of explicit existential quantification.
Various other hacks to get things to parse correctly under Sicstus.
prog_io.nl:
Don't allow (A -> B) in DCGs, since NU-Prolog and Mercury give
it different semantics to Sicstus.
sp_builtin.nl, sp_lib.nl:
Split sp_builtin.nl into sp_builtin.nl and sp_lib.nl.
sp_conv.sed:
Add sed script which converts some character escapes so that
they work with Sicstus.
term_io.nl:
Remove term_io__prefix_op etc. since they aren't used anymore.
vn_*:
Got value numbering working. Isolated diagnostic messages in separate
file.
llds:
During output, transform x + -const into x - const, since some
compilers may not recognize the pattern and may use several
instructions to build up a negative constant.
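A minimal Python sketch of this rewrite follows. The tuple encoding of expressions and the name `normalize_add_negative` are assumptions for illustration; the real transformation happens while llds writes out C code.

```python
def normalize_add_negative(expr):
    """Rewrite ('+', x, ('const', -n)) as ('-', x, ('const', n)), recursively.

    Expressions are either a register name (str), ('const', int),
    or (op, left, right).
    """
    if not isinstance(expr, tuple) or expr[0] == "const":
        return expr
    op, left, right = expr
    left = normalize_add_negative(left)
    right = normalize_add_negative(right)
    if (op == "+" and isinstance(right, tuple)
            and right[0] == "const" and right[1] < 0):
        return ("-", left, ("const", -right[1]))
    return (op, left, right)
```

For example, `("+", "r1", ("const", -4))` becomes `("-", "r1", ("const", 4))`, while additions of non-negative constants are left alone.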
atsort:
Added predicates for transitive closure.
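For readers unfamiliar with the operation, transitive closure over a successor relation can be sketched in Python as follows. This is a generic illustration, not the atsort predicates; the dict-of-sets representation and the function name are assumptions.

```python
def transitive_closure(succs):
    """succs: dict mapping each node to the set of its direct successors.

    Returns a dict mapping each node to every node reachable from it.
    """
    closure = {node: set(targets) for node, targets in succs.items()}
    changed = True
    while changed:
        changed = False
        for reach in closure.values():
            extra = set()
            for mid in reach:
                extra |= closure.get(mid, set())
            if not extra <= reach:
                reach |= extra
                changed = True
    return closure
```

Given `a -> b` and `b -> c`, the closure adds `a -> c`.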
frameopt:
Fixed a performance bug introduced by previous change.
value_number, vn_util, opt_util, opt_debug:
Fixed a bug which allowed incr_hp to overwrite its target without saving
it. Reorganized the handling of incr_sp and decr_sp to make sure they
never get reordered with respect to control flow instructions.
llds:
Fixed the output of temp declarations for blocks.
frameopt:
Generalized the set up patterns accepted as starting a det procedure.
labelopt:
Added a source-level option to remove eliminated instructions instead
of turning them into comments, and made it the default.
optimize:
After value numbering, perform jump optimization as well as peepholing
and label optimization.
mercury_compile:
Added back an old garbage collection point.
optimize:
Added some new garbage collection points, and made the file into .pp.
value_number, vn_util, opt_debug:
Changed the way access vns are counted. Pushed noop vnlvals towards
the front of the flush order; this can improve the speed of the
generated code.
middle_rec:
Clarified a bit of code.
value_number, vn_util:
Value numbering should now work with code that allocates memory.
frameopt:
Create a stack frame before an if_val if both continuations need one.
optimize, mercury_compile:
-V now prints a message for each optimization pass for each proc.
llds, call_gen, code_gen, middle_rec etc:
Change livevals/2 back into livevals/1.
jumpopt:
Rename instmap to instrmap at fjh's request.
frameopt:
Now handles det/semidet procedures that refer to failure.
opt_util:
Moved some frameopt-specific stuff to frameopt.
vn_util:
Set up for better rval source selection.
value_number:
Fix references to mark_hp.
opt_debug:
An extra predicate for debugging value_number.
llds:
Minor change in formatting the output; diff --side-by-side should now
truncate fewer lines.
value_number, vn_util, atsort:
Value numbering now works for several examples. It does not yet
attempt to handle code that manipulates the heap; that's for tomorrow.
opt_debug:
Much better facilities for debugging value numbering :-)
llds and others:
replaced the heap_alloc rval with two new instructions. The first
is incr_hp(lval, rval); it takes a size (the rval) and allocates
that much memory, returning its address in the lval. The idea is
that code sequences such as
r1 = hp;
incr_hp(4);
will now become
incr_hp(r1, 4)
The other instruction is restore_hp(rval). Incr_hp returns an rval
by putting it into the given lval; only rvals thus supplied should
be given as arguments to restore_hp. What used to be written as
hp = stackvar(5)
should now be
restore_hp(stackvar(5))
This scheme makes it possible to use gc_malloc instead of heap
discipline for memory allocation, preserves the referential
transparency of rvals, and still allows the efficient use of heap.
value_number, vn_util:
Closer to working than before :-)
opt_util:
Moved all value_number-related functionality from there to vn_util.
llds and other files:
Renamed live_lvalues to liveinfo, and added a list of these as a
third argument to call_closure as well. Fergus and Tom, please generate
this third argument in call_gen.
llds.nl, *.nl:
Change field(int, rval, int) to field(int, rval, rval), so
that the field number can be calculated at runtime
(We need this for predicate closures).
Also, remove the incr_hp(int) instruction and replace it
with a heap_alloc(rval) rval.
(We need to determine the space allocated at runtime
for predicate closures, and also we want it to be an rval
not an instruction so we can get conservative garbage collection
to work.)
unify_gen.nl:
More work for higher-order predicate closures.
std_util.nl:
Recode `bool__and' and `bool__or' more elegantly.
- optimize.nl is the main loop of the optimizer. It calls functions in
jumpopt, peephole, labelopt, frameopt and value_number.
- jumpopt.nl does short-circuiting of jumps to jumps and finds tailcalls.
- peephole.nl now just does the local pattern-match optimizations.
- labelopt.nl eliminates dead labels and dead code.
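The short-circuiting of jumps to jumps mentioned for jumpopt.nl can be sketched in Python. This is an illustration under assumptions, not the module's code: the input is taken to be a dict from each label whose block is just `goto X` to that `X`, and the name `short_circuit` is made up for the example.

```python
def short_circuit(jump_targets):
    """Map every jump-only label to the final label its chain leads to.

    jump_targets: dict mapping a label whose block is only `goto X` to X.
    Chains that loop back on themselves are cut off rather than followed
    forever.
    """
    resolved = {}
    for label in jump_targets:
        seen = {label}
        target = jump_targets[label]
        while target in jump_targets and target not in seen:
            seen.add(target)
            target = jump_targets[target]
        resolved[label] = target
    return resolved
```

With `L1: goto L2` and `L2: goto L3`, every jump to `L1` or `L2` can then be retargeted straight to `L3`.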
Renamed the options related to optimization.
*** IMPORTANT: the --optimize flag now enables optimize.nl, not C optimization.
C optimization is signalled via the --c-optimize flag. Both flags are on by
default.
switch_gen:
generate code to test primary tags from most shared to least shared.
frameopt:
separated out the pass that removes superfluous saves of succip.
llds, opt_debug:
added two new unary operators, unmktag and unmkbody, that reverse
the effects of mktag and mkbody.
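The inverse relationship can be illustrated with a small Python sketch. The bit layout here (two primary tag bits in the low-order end of the word) is an assumption for the example; the actual layout depends on how the runtime is configured.

```python
TAGBITS = 2  # assumption: two low-order primary tag bits

def mktag(tag):
    """Place a primary tag value in the tag position (the low bits here)."""
    return tag

def unmktag(word):
    """Recover the tag value from the tag position of a word."""
    return word & ((1 << TAGBITS) - 1)

def mkbody(body):
    """Shift a body value clear of the tag bits."""
    return body << TAGBITS

def unmkbody(word):
    """Recover the body value, discarding the tag bits."""
    return word >> TAGBITS
```

Under these assumptions `unmktag(mktag(t)) == t` for every valid tag `t`, and `unmkbody(mkbody(b)) == b` for every non-negative body value `b`.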
graph, value_number:
fixed occurrences of implied modes.
code_info:
fixed a spelling error in a comment.
*** Tom, please fix the mismatch in the names of the unification procedures.
code_gen, opt_util:
prepared them for middle_rec.
peephole, opt_debug:
old changes.
printlist.int:
shouldn't have been here in the first place.
source_stats.awk:
turned comments into awk syntax.
value_number.nl:
a lot less functionality is missing now :-)
atsort.nl:
approximate topological sort for value numbering
opt_*.nl:
changes to support value numbering
llds.nl, *_gen.nl:
changed livevals/1 into livevals/2 to support value numbering
list.nl:
added a predicate to delete a list (set) of items from a list
peephole.nl:
implement shortcircuiting inside computed goto label lists. Not tested
since the only place we use computed gotos has no opportunities for
short circuiting.
options.nl:
added option to control the maximum number of repetitions of
the peephole optimizations; default value is 2.
dir.nl:
fixed spelling error
Makefile.mercury:
Override the MERCURY_LIB_OBJS variable when invoking ml.
This avoids some bootstrapping problems.
Also, add mercury_compile.nu.
Makefile.common:
Bump NU-Prolog's -u option up to 2000 (8M), to avoid some memory
problems.
array.nl, bintree.nl, char.nl, dir.nl, globals.nl, list.nl, map.nl, modes.nl,
prog_util.nl, stack.nl, std_util.nl, string.nl, term.nl:
Avoid the use of implied modes.
code_info.nl, bimap.nl, make_hlds.nl, mercury_compile.nl,
mercury_to_mercury.nl, unify_proc.nl:
Fix determinism errors which had previously not been discovered
because of either implied modes or running out of memory.
(Note that I had to change the interface to bimap__lookup, since
it's not possible to make it bidirectional.)
code_util.nl, llds.nl, opt_debug.nl, value_number.nl:
Rename `operator' as `binary_op'.
hlds.nl, code_info.nl, unify_gen.nl, llds.nl, opt_debug.nl, switch_gen.nl:
*** Handle simple cases of higher-order pred terms. ***
(We don't yet handle taking the address of an overloaded
predicate or a predicate with multiple modes.
We don't handle closures. call/1 and call/N are not yet implemented.
This has not yet been tested.)
make_hlds.nl:
Modify the mode priority ordering so that semidet modes get
selected before det ones.
llds.nl:
Don't include the priority part of the mode number in the mangled
label name. *** Note: this will break some things! ***
mercury_compile.nl:
Move the NU-Prolog hacks into mercury_compile.nu.nl.
switch_gen.nl:
Fix a simple logic bug in handling the grab/slap of the code_info.
prog_io.nl, builtins.nl, int.nl:
Fix bugs and omissions with handling of the new arithmetic operators.
prog_io.nl:
As a quick hack, strip off calls to io__gc_call
(this avoids spurious error messages which are due to
the fact that we don't get mode analysis right in those cases).
call_gen.nl:
Make the handling of builtins a little more general.
code_info.nl, unify_gen.nl:
Use code_info__get_next_label_number rather than
the lower-level routines code_info__get_label_count
and set_label_count.
code_util.nl:
Rename atom_to_operator as code_util__atom_to_binop
and add code_util__atom_to_unop.
prog_io.nl, code_util.nl, llds.nl, int.nl, opt_debug.nl:
Add bitwise operators.
Add array_index binary operator.
Add hash_string unary operator.
Add int__log2 predicate.
Cast operands to (int) in llds.nl, so that we
get integer comparisons and integer operations.
string.nl:
Add string__hash predicate.
interpreter.nl:
Use disjunction in semidet preds.
options.nl:
Add --smart-indexing option (enabled by default).
switch_gen.nl:
**** Generate a hash table lookup for string switches. ****
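The general idea of compiling a string switch to a hash table lookup can be sketched in Python. Everything specific here is an assumption made for illustration (the hash function, the power-of-two table size, linear probing, and the function names); the code actually generated by switch_gen.nl may differ.

```python
def hash_string(s):
    """A simple deterministic string hash (an assumption for this sketch)."""
    h = 0
    for ch in s:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h

def build_switch_table(cases):
    """cases: dict mapping each string constant to its case label."""
    size = 1
    while size < 2 * len(cases):
        size *= 2  # power of two, at most half full
    table = [None] * size
    for key, label in cases.items():
        slot = hash_string(key) & (size - 1)
        while table[slot] is not None:  # linear probing on collision
            slot = (slot + 1) & (size - 1)
        table[slot] = (key, label)
    return table

def lookup(table, s, default_label):
    """Return the case label for s, or default_label if no case matches."""
    mask = len(table) - 1
    slot = hash_string(s) & mask
    while table[slot] is not None:
        key, label = table[slot]
        if key == s:
            return label
        slot = (slot + 1) & mask
    return default_label
```

The table is built once at compile time; at run time the switch hashes the string, probes the table, and does a full string comparison only against the candidates it finds, instead of comparing against every case in turn.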
llds.nl:
Separate procedures with newlines in .mod files.
opt_util.nl, peephole.nl:
Recognize semidet and nondet procedure epilogs. Branches to these
epilogs are replaced by the epilogs themselves.
peephole.nl:
Migrate if_val(Test, do_fail) before mkframe if possible.
value_number.nl:
It now computes livevals sets at starts of blocks and uses them
to build up a set of tables containing value number info. This
info is not yet used, but the tables are inserted into the generated
code if --peephole-value-number is used.
opt_debug{,.nu}.nl:
New files containing code to support debugging of peephole and
value_number.
Makefile.common, mercury_compile.dep:
Accommodate new files opt_debug{,.nu}.nl.
You may want to undo the changes to the compiler invocation
in Makefile.common.