Estimated hours taken: 4
bytecode*.m:
Almost to first draft.
optimize:
When --debug-opt is given, print each instruction sequence only
if it differs from the previous sequence.
vn_block:
Do not create parallels for backward jumps. Without this precaution,
pred-value-number may create incorrect code. For example, given the
code L1: r1 = detstackvar(1), ... goto L1, it may create a specialized
variant of L1 which assumes detstackvar(1) is in r1. This is true
the first time around, but false on later times.
With this fix, the compiler now passes bootcheck at -O5. (It still
causes misreporting of singleton variables, and my changes to binary
can't track it down. Arrrrgghh.)
options:
Add a (deliberately) undocumented option --vn-fudge <n>, to try to
make up for the inadequacy of the value numbering cost function.
value_number, vn_debug:
Changes to accommodate --vn-fudge.
Estimated hours taken: 5
peephole:
Fixed a bug that caused restores of succip to be put in the wrong
place, but only after predicate-wide value numbering.
opt_debug:
Added a couple of debugging predicates used in tracking down this bug.
value_number:
Fix a bug that left a livevals pseudo-op in the wrong place if a
single instruction sequence contained more than one such pseudo-op.
options:
Add --debug-opt. Rename --vndebug to --debug-vn.
Add --generate-bytecode.
optimize, vn_debug:
Use the new routines in opt_debug, and use the new/renamed options.
store_alloc:
Don't thread follow_vars through the module, since the follow_vars
information is not attached directly to branched structures. We
now also use the same slot to hold the store map computed by this
pass; this should allow the later deletion of the store map slot
from goal_infos.
follow_code:
Removed dead predicate.
livemap:
Added a comment.
Estimated hours taken: 6
arg_info:
Add support for the compact argument passing convention. The proper
handling of higher order calls etc is still missing.
globals:
Added a third global type, args_method, currently either "old" or
"compact".
passes_aux:
Moved some auxiliary predicates from mercury_compile and options
to passes_aux.
constraint, det_analysis, make_hlds, modules, optimize, undef_types:
Import and refer to passes_aux.
mercury_compile, handle_options:
Remove the predicates moved to passes_aux. Also move the option
postprocessing code to a new module, handle_options.
Another change is that we stop after syntax errors if the new option
--halt-at-syntax-errors is set.
handle_options:
New module for option postprocessing.
options:
Remove the option lookup predicates, which were obsolete.
Add new options --args, --halt-at-syntax-errors and --opt-level.
Add a special handler for --opt-level.
lookup_switch:
Call getopt, rather than options, to look up options.
value_number, vn_block:
Extended basic blocks with more than one incr_hp pose a problem for
value numbering when using boehm_gc, because value numbering coalesces
all the allocations into one. Previously we did not optimize such
sequences. I modified value numbering to divide up such blocks into
smaller blocks, each with at most one incr_hp, and optimize these.
At the moment, some of these blocks contain deeply nested field
refs, which value numbering is very slow to handle; code_exprn
should be modified to fix these.
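A minimal sketch of the block-splitting idea described above, written in Python
rather than Mercury. The tuple representation of instructions and the name
split_at_incr_hp are illustrative assumptions, not the compiler's actual data
structures or predicates.

```python
def split_at_incr_hp(instrs):
    """Split one block into pieces that each hold at most one incr_hp.

    Instructions are modelled as tuples whose first element is the
    opcode name (an assumption made for this sketch).
    """
    pieces, current, seen_alloc = [], [], False
    for instr in instrs:
        is_alloc = instr[0] == "incr_hp"
        # A second allocation would land in the same piece: start a new one.
        if is_alloc and seen_alloc:
            pieces.append(current)
            current, seen_alloc = [], False
        current.append(instr)
        seen_alloc = seen_alloc or is_alloc
    if current:
        pieces.append(current)
    return pieces


block = [
    ("assign", "r1"),
    ("incr_hp", "r2"),
    ("assign", "r3"),
    ("incr_hp", "r4"),
    ("incr_hp", "r5"),
]
pieces = split_at_incr_hp(block)
```

Each piece can then be handed to value numbering separately, so no piece ever
has two allocations to coalesce.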
value_number:
Rename usemap to useset, since this is more accurate.
Fixed a bug in --pred-value-number, which manifested itself as
the generation of duplicate labels.
labelopt:
Rename usemap to useset, since this is more accurate.
Estimated hours taken: 1.5
Undo dylan's changes in the names of some library entities,
by applying the following sed script
s/term_atom/term__atom/g
s/term_string/term__string/g
s/term_integer/term__integer/g
s/term_float/term__float/g
s/term_context/term__context/g
s/term_functor/term__functor/g
s/term_variable/term__variable/g
s/_term__/_term_/g
s/std_util__bool_/bool__/g
to all the `.m' and `.pp' files in the compiler and library directories.
The reason for undoing these changes was to minimize incompatibilities
with 0.4 (and besides, the changes were not a really good idea in the first
place).
I also moved `bool' to a separate module.
The main reason for that change is to ensure that the `__' prefix is
only used when it genuinely represents a module qualifier.
(That's what dylan's changes were trying to achieve, but `term__'
does genuinely represent a module qualifier.)
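To illustrate how the substitutions in the sed script above interact, here is a
small Python sketch that applies the same rules in the same order. The function
name apply_rules and the sample identifiers are made up for illustration; since
the patterns contain no regular expression metacharacters, plain string
replacement behaves like sed's s/.../.../g here.

```python
# The sed rules from the log entry, in the order they are applied.
RULES = [
    ("term_atom", "term__atom"),
    ("term_string", "term__string"),
    ("term_integer", "term__integer"),
    ("term_float", "term__float"),
    ("term_context", "term__context"),
    ("term_functor", "term__functor"),
    ("term_variable", "term__variable"),
    # Undo the rules above where `term' was only the tail of a longer name.
    ("_term__", "_term_"),
    ("std_util__bool_", "bool__"),
]


def apply_rules(text):
    """Apply every rule globally, one after another, as sed would."""
    for old, new in RULES:
        text = text.replace(old, new)
    return text


plain = apply_rules("term_atom(X)")
embedded = apply_rules("my_term_atom")        # hypothetical identifier
boolean = apply_rules("std_util__bool_and")   # hypothetical identifier
```

The rule `_term__' -> `_term_' is what keeps names such as the hypothetical
my_term_atom from ending up with a spurious module qualifier.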
compiler/*.m:
Apply sed script above;
where appropriate, add `bool' to the list of imported modules.
det_analysis:
Added some code to fix up disjunctions that have at most one solution.
We now transform a disjunction to an if-then-else only if the
disjunction is locally nondet. If the disjunction cannot fail, we
replace it with a disjunct that cannot fail and issue a warning;
we issue a warning in several other cases as well.
mercury_to_mercury:
Fix two duplicate fact bugs pointed out by the new det_analysis.
peephole:
Add a new optimization: a stack frame teardown followed by a
conditional branch to a label that builds a stack frame is now
replaced by code that starts with the conditional branch to the
code after the stack frame setup, and has the stack frame teardown
only in the fall through code. This optimization is applied only
after frameopt.
opt_util:
Export a previously internal predicate for use by peephole.
optimize, value_number:
Conform to the new interface of peephole.
frameopt:
Add some debugging code (now commented out) that helped in making
the above optimization.
prog_io:
Cosmetic changes.
options:
Add a new option, --optimize-delay-slot, on by default, that is
used in the next change, and another, --specialize, for later use.
Also fix an old oversight.
frameopt:
Now we fill delay slots only if the new option is on. This helps
on e.g. x86s, which don't HAVE delay slots.
optimize:
Call frameopt with the option value.
options, code_gen:
Add an option, --no-simple-neg, to disable the generation of
simplified code for simple negations, since sometimes the more
complex code is better (e.g. for queens) due to branch frequencies.
peephole, jumpopt:
Move the detection of tailcalls from peephole to jumpopt. This
allows us to avoid building some maps in peephole. The code in
jumpopt is also somewhat more general, but this is unlikely
to lead to better code.
opt_util:
Some changes to support the previous modifications. We also
allow framevars in code that looks for stackvars, since the
two kinds of variables can both occur in code that does commits.
optimize:
The main predicate of peephole has a new name, call it by that name.
Also remove Tom's comment asking for my inspection of his change.
value_number:
The main predicate of peephole has a new name, call it by that name.
Also loosen a too-tight sanity check.
quantification.m:
Make implicit quantification rename apart vars that
are local to distinct scopes. This will help in the
singleton variable warning pass once the latter has
been changed to work on the HLDS.
These changes also allow goals of the form:
.... X ....,
some [X] Goal
which were previously not allowed.
cse_detection.m:
A 1 line bugfix from Zoltan.
det_analysis.m:
Rather than redoing quantification, construct
a correct goal_info directly in det__disj_to_ite/3.
optimize.pp:
Fix a singleton variable. Zoltan, there is an
XXX for you to read and remove if the fix is
correct.
common.m, cse_detection.m, det_analysis.m,
follow_code.m, make_hlds.m, polymorphism.m,
unify_proc.m:
Fix the calls to implicitly_quantify_clause_body and
implicitly_quantify_goal.
TODO:
Update a couple of things.
parser.m:
Add a map(string, var) to the state so that varset
can be simplified.
varset.m:
Simplify the varset structure so that the binding
of names to variables is cheaper.
frameopt:
Make the teardown map bidirectional, and export it.
peephole:
Add a new pattern to handle cases generated by fulljump optimization.
This pattern uses the teardownmap, but it is disabled for the moment.
optimize:
Pass the teardown map where it is needed, and make sure we do a
peephole pass immediately after frameopt to use the teardownmap
while it is still valid.
jumpopt:
Rename a variable.
labelopt:
A block being eliminated may have the last remaining reference
to the label starting another block. Therefore on the last invocation
of labelopt, we iterate to a fixpoint before returning.
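The need for a fixpoint can be seen in a simplified Python sketch. It models a
procedure as a map from each block's label to the set of labels its code refers
to; this representation, the name eliminate_unreferenced, and the decision to
ignore fall-through are all assumptions made for illustration.

```python
def eliminate_unreferenced(blocks, entry):
    """Repeatedly drop blocks whose label nobody refers to.

    Returns the surviving blocks and the number of passes needed.
    Fall-through between blocks is deliberately ignored in this sketch.
    """
    live = dict(blocks)
    passes = 0
    while True:
        passes += 1
        referenced = {entry}
        for targets in live.values():
            referenced |= targets
        dead = [label for label in live if label not in referenced]
        if not dead:
            return live, passes
        for label in dead:
            del live[label]


# L2 is unreferenced; L3 is referenced only from L2.
blocks = {"entry": {"L1"}, "L1": set(), "L2": {"L3"}, "L3": set()}
live, passes = eliminate_unreferenced(blocks, "entry")
```

A single pass removes only L2; L3 becomes removable once L2 is gone, which is
why one pass is not enough.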
opt_debug:
Add a predicate to help debug bidirectional teardown maps.
opt_util:
Liberalized some optimizations, but the changes are disabled for the
moment.
value_number:
Pass an empty teardown map to peephole. Loosen the sanity check on
tags a bit.
vn_order:
If the new value of a location depends on its old value,
avoid creating a circularity in the preferred order relation.
Such circularity may be broken arbitrarily, even though we have
a clear preference.
vn_flush:
Modify the criteria for saving the old value stored in a location
about to be overwritten, in an effort to eliminate useless copies
of old values.
vn_util:
Tighten the requirement for classifying a use as a "real" use,
also in the effort to eliminate useless saves of old values.
Recognize some more expression patterns as yielding known
results. Move some functionality from vn_flush to vn_util,
since it is needed by the other modification.
typecheck:
Improved the format of the message about calls with wrong arity.
jumpopt, optimize:
A goto whose target is the predicate entry label is replaced by
the pointed-to code only if a flag is set; optimize sets the flag
only after value numbering and frameopt. This means the pointed-to
code is in better shape when it is "inlined".
peephole, opt_util:
When optimizing incr_sp/decr_sp pairs, allow a restoration of succip
between them to be optimized away. This works because the only way
this can happen is if the store of succip in its slot was promoted
before the incr_sp, and no calls may have been made in the meantime,
so the original copy is still in succip.
frameopt, optimize:
Postponed the check for whether succip is ever restored, since
peephole may affect the decision.
follow_code:
We now push code from the outside context into this context before
pushing code from this context into nested contexts, since this may
give us more code to push. I also removed redundant references to
ModuleInfo.
prog_io:
Small formatting change.
options, optimize:
Add a new option, --optimize-fulljumps, which defaults on.
jumpopt, opt_util:
If --optimize-fulljumps is set, replace unconditional gotos with
the instruction sequence they point to. This not only avoids a jump
at runtime, but also increases basic block length and makes value
numbering more effective.
peephole:
Fulljump optimization can replace a recursive tailcall with the
initial part of the code of the procedure. Therefore peephole now
looks for a decr_sp followed by an incr_sp, and removes such pairs
from the instruction sequence.
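A hedged Python sketch of the pair removal described above. Instructions are
modelled as tuples, and the sketch assumes that a pair may be dropped only when
both instructions adjust the stack by the same amount; that condition and the
name remove_sp_pairs are assumptions of this example, not a description of the
actual peephole pattern.

```python
def remove_sp_pairs(instrs):
    """Drop a decr_sp that is immediately followed by a matching incr_sp."""
    out = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if (
            nxt is not None
            and cur[0] == "decr_sp"
            and nxt[0] == "incr_sp"
            and cur[1] == nxt[1]  # assumed: same frame size on both sides
        ):
            i += 2  # skip both instructions of the pair
            continue
        out.append(cur)
        i += 1
    return out


code = [("assign", "r1"), ("decr_sp", 2), ("incr_sp", 2), ("assign", "r2")]
cleaned = remove_sp_pairs(code)
```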
frameopt:
Do not consider a decr_sp followed by an incr_sp to be a fatal error
(just in case peephole is switched off).
vn_block:
Fix a bug tickled by fulljump optimization: maxfr, curfr and succip
were not required to be made up to date before an if_val exited
the extended basic block.
vn_util:
Simplify some more patterns of vnrvals. The extra patterns are
involved in testing conditions that are known to be true or false.
These patterns can arise when fulljump optimization replaces a
recursive tailcall.
frameopt, opt_util:
Attempt to fill delay slots with the instruction after an if_val
in preference to the saving of the succip.
optimize:
Fix a typo in earlier change.
value_number:
Check that the last node in the order is a control node.
vn_order:
If two registers or stackvars can be generated in any order,
prefer to generate them in numerical sequence for neatness.
vn_debug:
Add routine for printing the initial and final ordering of
unrelated nodes.
code_gen.pp:
Put the comment about the contents of stack slots before the initial
label, since this way it will be preserved by optimizations.
cse_detection.m:
Extended the search to look for cses in if-then-elses and switches
as well as disjunctions. Removed InstmapDelta from preds in which it
was not being used.
det_analysis.m:
Make the diagnosis routines more robust. The changes here avoid
Philip's problems with lexical.m.
jumpopt.m:
Minor formatting changes.
livemap.m:
Avoid duplicating livevals instructions when optimizations are
repeated, since this can confuse some optimizations.
llds.m:
Minor documentation change.
make_hlds.m:
Minor formatting change.
mercury_compile.pp:
Do not map arguments to registers if any semantic errors have been
found.
middle_rec.m and code_aux.m:
Apply middle recursion only if tail recursion is not possible,
since tail recursion yields more efficient code.
opt_util.m:
Added a predicate to recognize constant conditions in if_vals.
Modified a predicate to make it better suited for frameopt.
optimize.pp:
Changed the way optimizations were repeated to allow better control.
Repeat peephole once more after frameopt, since the new frameopt
can benefit from this.
options.m:
Removed the --compile-to-c option, which was obsolete. Added an
option for predicate-wide value numbering, which is off by default.
Changed some of the default values of optimization flags to reduce
compilation time while holding the loss of speed of generated code
to a minimum.
peephole.m:
Look for if_vals whose conditions are constants, and eliminate the
if_val or turn it into a goto depending on the value of the constant.
Generalized the condition for optimizing incr_sp/decr_sp pairs.
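The constant-condition rewrite can be sketched in Python as follows. The tuple
form ("if_val", condition, label), the use of Python booleans for constant
conditions, and the name fold_constant_if_vals are assumptions made for this
illustration only.

```python
def fold_constant_if_vals(instrs):
    """Rewrite if_vals whose condition is a known constant.

    A constant-true if_val becomes a goto to its target;
    a constant-false if_val is deleted. Everything else is kept.
    """
    out = []
    for instr in instrs:
        if instr[0] == "if_val" and isinstance(instr[1], bool):
            if instr[1]:
                out.append(("goto", instr[2]))
            continue
        out.append(instr)
    return out


code = [
    ("if_val", True, "L1"),
    ("if_val", False, "L2"),
    ("if_val", "r1 == 0", "L3"),  # not a constant: left alone
]
folded = fold_constant_if_vals(code)
```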
value_number.m:
Added a prepass to separate primary tag tests in if-then-elses from
the test of the secondary tag, which requires dereferencing the
pointer.
Added sanity check routines to test two aspects of the generated code.
First, whether it produces the same values for the live variables as
the original code, and second, whether it has moved any dereferences
of a pointer before a test of the tag of that pointer. If either test
fails, we use the old instruction sequence.
vn_debug.m:
New messages to announce the failure of the sanity checks. They are
enabled by default, but of course can only appear if value numbering
is turned on (it is still off by default).
vn_flush.m:
Threaded a list of forbidden lvals (lvals that may not be assigned to)
through the flushing routines. When saving the old value of an lval
that is being assigned to, we use this list to avoid modifying any of
the values used on the right hand side of the assignment, even if the
saving of an old value results in assignment that requires another
save, and so on recursively.
When the flushing of a node_lval referred to a shared vn, the uses of
the access vns of the node_lvals were not being adjusted properly.
Now they are.
vn_order.m:
The ctrl_vn phase of the ordering was designed to ensure that all
nodes that need not come before a control node come after it. However,
nodes were created after this phase operated, causing leakage of some
value nodes in front of control nodes. Some of these led to pointer
dereferences before tag tests, causing bus errors. The ctrl_vn phase
is now last to avoid this problem.
vn_table.m:
Added an extra interface predicate to support the sanity checks in
value_number.
vn_util.m:
The transformation of c1-e2 into (0-e2)+c1 during vnrval simplification
could lead to an infinite loop in the compiler if c1 was zero. A test
for this case now prevents the loop.
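The loop arises because the rewritten expression contains 0-e2, which itself
matches the pattern c1-e2 with c1 equal to zero. A minimal Python sketch of the
guarded rewrite, using a made-up tuple encoding of expressions and a
hypothetical simplify function (not the actual vnrval simplifier):

```python
def simplify(expr):
    """Rewrite c1 - e2 as (0 - e2) + c1, except when c1 is zero."""
    if expr[0] == "sub" and expr[1][0] == "const":
        c1, e2 = expr[1][1], expr[2]
        # Without this test, 0 - e2 would be rewritten to (0 - e2) + 0,
        # whose left operand matches again, and so on without end.
        if c1 != 0:
            negated = simplify(("sub", ("const", 0), e2))
            return ("add", negated, ("const", c1))
    return expr


x = ("var", "x")
rewritten = simplify(("sub", ("const", 5), x))
unchanged = simplify(("sub", ("const", 0), x))
```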
mercury_compile:
Perform arg_info only if we are generating code.
llds:
Handle redo and fail by outputting a branch to their labels in the
runtime, since this is smaller than the code for the macro itself.
dupelim, options:
Added an extra optimization pass to eliminate duplicate blocks of
code. Reduces compiler size by half a percent.
frameopt:
Fix the problem with destroying stack frames and creating
them again later, accessing detstackvars that were earlier
nominally destroyed.
vn_livemap:
Renamed it to livemap since frameopt now uses it also.
value_number, vn_*:
Fixed some bugs. Reorganized the handling of blocks: they are now
put in at the last minute before llds writes out the code.
Made a start towards exploiting info about cheaper copies of
values.
optimize, options:
Made value_numbering an iterated optimization. Added a new
option to control how many times it is iterated together
with the other optimizations: jumpopt, peephole and labelopt.
llds, call_gen, code_gen, code_info, middle_rec, opt_debug:
Changed the type of the argument of livevals to plain set.
Warning: in more than a week I haven't been able to fully test this change,
due to kryten's flakiness and bugs upstream of the optimizer.
compiler/*:
Add copyright messages.
Change all occurrences of *.nl in comments to *.m.
compiler/mercury_compile.pp:
Change the output to the .dep files to use *.m rather than *.nl.
(NOTE: this means that `mmake' will not work any more if you
call your files *.nl!!!)
unify_gen:
Whenever we do a test of a variable against a non-constant functor,
we now try to turn it into a negated test on a constant functor.
This is possible if these two functors are the only ones.
code_aux:
Added an extra predicate to look up type definitions to make the
previous change easier.
llds, code_gen, opt_util, opt_debug, frameopt, jumpopt, peephole:
Added a boolean argument to do_succeed to say whether the nondet
frame should be discarded on success or not. The default is no,
but peephole has an optimization that tries to turn on this flag.
optimize, value_number, vn*:
Restructured the top level of value numbering as part of an effort
to identify blocks that could be optimized further given our knowledge
that the contents of e.g. stackvars are also in registers when we
jump to those blocks. Redid the interface between value_number and
frameopt to allow value_number to be iterated, which is necessary
to take advantage of the previously mentioned capability. Threaded
the I/O state through the relevant predicates; value numbering doesn't
use non-logical I/O any more.
Makefile.common:
Add new targets `mercury_compile.sicstus' (the Mercury compiler
compiled with Sicstus) and `mercury_compile.sicstus.debug'
(debugging version of the above).
*.nl:
Use Sicstus-compatible char and string escapes.
Avoid the use of explicit existential quantification.
Various other hacks to get things to parse correctly under Sicstus.
prog_io.nl:
Don't allow (A -> B) in DCGs, since NU-Prolog and Mercury give
it different semantics to Sicstus.
sp_builtin.nl, sp_lib.nl:
Split sp_builtin.nl into sp_builtin.nl and sp_lib.nl.
sp_conv.sed:
Add sed script which converts some character escapes so that
they work with Sicstus.
term_io.nl:
Remove term_io__prefix_op etc. since they aren't used anymore.
value_number, vn_util, opt_util, opt_debug:
Fixed a bug that allowed incr_hp to overwrite its target without saving
it. Reorganized the handling of incr_sp and decr_sp to make sure they
never get reordered with respect to control flow instructions.
llds:
Fixed the output of temp declarations for blocks.
frameopt:
Generalized the set up patterns accepted as starting a det procedure.
labelopt:
Added a source-level option to remove eliminated instructions instead
of turning them into comments, and made it the default.
optimize:
After value numbering, perform jump optimization as well as peepholing
and label optimization.
mercury_compile:
Added back an old garbage collection point.
optimize:
Added some new garbage collection points, and made the file into .pp.
value_number, vn_util, opt_debug:
Changed the way access vns are counted. Pushed noop vnlvals towards
the front of the flush order; this can improve the speed of the
generated code.
middle_rec:
Clarified a bit of code.