Estimated hours taken: 2
value_number:
Add two safety checks whose absence was bug. One of these replaces
an earlier, inadequate safety check in vn_block.
vn_block:
Remove that inadequate safety check, since now a more comprehensive
one is applied in value_number.
opt_util:
Opt_util__instr_labels now returns even those labels inside
code_addrs. This was needed for value_number.
labelopt:
Simplify the code based on the new capability of
opt_util__instr_labels.
dense_switch:
Break a too long line.
Estimated hours taken: 1
jumpopt:
Optimize the sequence
if (cond) L1
r1 = TRUE
stuff
proceed
...
L1: r1 = FALSE
same stuff
proceed
into
r1 = not cond
stuff
proceed
r1 = TRUE
stuff
proceed
...
L1: r1 = FALSE
same stuff
proceed
Dead code elimination will get rid of the code after the first proceed,
and probably of the code at L1 as well.
The optimization also applies if the L1 branch sets r1 to TRUE and
the fallthrough sets it to FALSE; in that case we don't negate
the condition.
opt_util:
Export a predicate for jumpopt.
Estimated hours taken: 12
value_number, opt_util:
Fix a bug triggered the conjunction of (1) value numbering being
repeated in -O5 (2) middle recursion optimization and (3) the
current code of modules.m. The problem was that although value
numbering was producing correct code, the livevals annotations
in the generated code were left unchange although they were
no longer correct.
The fix is a new predicate in opt_util to update the annotations
and to call it in value numbering.
vn_util:
Fix the bug reported by Tom in compiling scene.m: simplify
several kinds of patterns involving floats. These assume that
floats obey the laws of reals. Later we will have to add a mechanism
to prevent such simplication and reordering. (We already assume
that integers obey the laws of whole numbers.)
opt_debug, vn_debug:
Make the format of debugging output more suitable.
Estimated hours taken: 2.5
Mangle function names differently, so that we don't get
symbol name clashes between a function of arity N and
a predicate of arity N+1. (This fixes a bug which resulted
in error messages from the assembler.)
compiler/llds.m:
Add a new pred_or_func field to the proc/4 functor in
the proc_label type.
compiler/code_util.m:
Initialize this new field.
compiler/llds.m:
Use this new field when printing out labels.
compiler/{opt_debug.m,opt_util.m}:
Ignore this new field.
Estimated hours taken: 15
hlds_data:
Rename address_const to code_addr_const, and add base_type_info_const
as a new alternative in cons_id, and make corresponding changes
to cons_tag.
Make hlds_type__defn an abstract type.
llds:
Rename address_const to code_addr_const, and add data_addr_const
as a new alternative in rval_const.
Change type "label" to have four alternatives, not three:
local/2 (for internal labels), c_local (local to a C module),
local/1 (local a Mercury module but not necessarily to a C module,
and exported.
llds_out:
Keep track of the things declared previously, and don't declare them
again unnecessarily. Associate indentation with the following item
rather than the previous item (the influence of 244); this results
in braces being put in different places than previously, but should be
easier to maintain. Handle the new forms of addresses and labels.
Refer to c_local labels as STATIC when not using --split-c-files.
code_info:
Use a presently junk field to store a cell counter, which is used
to allocate distinguishing numbers to create'd cells. Previously
we used the label counter, which meant that label numbers changed
when we optimized away some creates. Handle the new forms of
addresses and labels.
exprn_aux:
Handle the new forms of addresses and labels. We are now more
precise in figuring out what label address forms will be considered
constants by the C compilers.
others:
Changes to handle the new forms of addresses and labels, and/or to
access hlds_type__defn as an abstract type.
Estimated hours taken: 2
llds.m:
Add a boolean argument to the create rval, which should be set to true
if the cell created must have a unique reference.
vn_type.m:
Add a corresponding argument to vn_create.
others:
Fix references to creates and vn_creates.
Estimated hours taken: 4
passes_aux:
Flesh out the code already here for traversing module_infos,
making it suitable to handle all the passes of the back end.
mercury_compile:
Use the traversal code in passes_aux to invoke the back end passes
over each procvedure in turn. Print a one-line message for each
predicate if -v is given (this fixes a long-standing bug).
excess.m, follow_code.m, follow_vars.m, live_vars.m, lveness.m, store_alloc.m:
Remove the code to traverse module_infos, since it is now unnecessary.
export.m:
Remove an unused argument from export__produce_header_file_2.
others:
Move imports from interfaces to implementations, or in some cases
remove them altogether.
Estimated hours taken: 2
Add support for det stack trace dumps in debugging grades, so programmers
can find out which predicates are causing det stack overflows.
llds.m:
Add an extra argument to the llds instruction incr_sp.
This argument says what predicate the stack frame belongs to.
vn_type.m:
Add a corresponding argument to the control node vn_incr_sp.
the other files:
Handle the extra argument of incr_sp or vn_incr_sp.
Estimated hours taken: 45
Module qualification of types, insts and modes.
Added a new interface file - <module>.int3. This contains the
short interface qualified as much as possible given the information
in the current module.
When producing the .int and .int2 files for a module, the compiler uses
the information in the .int3 files of modules imported in the interface
to fully module qualify all items. The .int2 file is just a fully
qualified version of the .int3 file. The .int3 file cannot be overwritten
by the fully qualified version in the .int2 file because then mmake would
not be able to tell when the interface files that depend on that .int3
file really need updating.
The --warn-interface-imports option can be used to check whether
a module imported in the interface really needs to be imported in
the interface.
compiler/module_qual.m
Module qualify all types, insts and modes. Also checks for modules
imported in the interface of a module that do not need to be.
compiler/modules.m
The .int file for a module now depends on the .int3 files of imported
modules. Added code to generate the make rule for the .int file in the
.d file. There is now a file .date2 which records the last time the
.int2 file was updated.
The .int3 files are made using the --make-short-interface option
introduced a few weeks ago.
compiler/options.m
Added option --warn-interface-imports to enable warning about interface
imports which need not be in the interface. This is off by default
because a lot of modules in the library import list.m when they only
need the type list, which is defined in mercury_builtin.m.
Removed option --builtin-module, since the mercury_builtin name is wired
into the compiler in a large number of places.
compiler/prog_util.m
Added a predicates construct_qualified_term/3 and construct_qualfied_term/4
which take a sym_name, a list of argument term and a context for the /4
version and give a :/2 term.
compiler/type_util.m
Modified type_to_type_id to handle qualified types. Also added predicates
construct_type/3 and construct_type/4 which take a sym_name and a list of
types and return a type by calling prog_util:construct_qualified_term.
compiler/modes.m
On the first iteration of mode analysis, module qualify the modes of
lambda expressions.
compiler/mode_info.m
Added field to mode_info used to decide whether or not to module qualify
lambda expressions.
compiler/mode_errors.m
Added dummy mode error for when module qualification fails so that mode
analysis will stop.
Added code to strip mercury_builtin qualifiers from error messages to
improve readability.
compiler/typecheck.m
Strip builtin qualifiers from error messages.
compiler/llds.m
compiler/llds_out.m
compiler/opt_util.m
compiler/opt_debug.m
Change the format of labels produced for the predicates to use the
qualified version of the type name.
compiler/mercury_compile.pp
Call module_qual__module_qualify_items and make_short_interface.
Remove references to undef_modes.m and undef_types.m
compiler/undef_modes.m
compiler/undef_types.m
Removed, since their functionality is now in module_qual.m.
compiler/prog_io.m
Changed to qualify the subjects of type, mode and inst declarations.
compiler/*.m
Changes to stop various parts of the compiler from throwing away
module qualifiers.
Qualified various mercury_builtin builtins, e.g. in, out, term etc.
where they are wired in to the compiler.
compiler/hlds_data.m
The mode_table and user_inst_table are now abstract types each
storing the {mode | inst}_id to hlds__{mode | inst}_defn maps
and a list of mode_ids or inst_ids. This was done to improve the
efficiency of module qualifying the modes of lambda expressions
during mode analysis.
module_info_optimize/2 now sorts the lists of ids.
The hlds_module interface to the mode and inst tables has not changed.
compiler/hlds_module.m
Added yet another predicate to search the predicate table.
predicate_table_search_pf_sym_arity searches for predicates or
functions matching the given sym_name, arity and pred_or_func.
compiler/higher_order.m
Changed calls to solutions/2 to list__filter/3. Eliminated unnecessary
requantification of goals.
compiler/unused_args.m
Improved abstraction slightly.
Estimated hours taken: 30+
arg_info:
Fix allocation to work properly for --args compact.
bytecode*:
Handle complex deconstruction unifications. Not really tested because
I can't find a test case.
bytecode_gen, call_gen, code_util:
Use the new method to handle builtin predicates/functions. We now
handle reverse mode arithmetic and unary plus/minus as builtins.
code_gen, code_init, follow_vars, hlds_pred:
Put back the initial follow_vars field of the proc_info, since this
may allow the code generator to emit better code at the starts of
of predicates.
inlining:
Don't inline recursive predicates.
goals_util:
Add a predicate to find out if a goal calls a particular predicate.
Used in inlining to find out if a predicate is recursive.
unused_args:
Remove code that used to set the mode of unused args to free->free.
Since this changes the arg from top_in to top_unused *without* code
in other modules being aware of the change, this screws up --args
compact.
llds, llds_out, garbage_out:
Prepare for the move to the new type_info structure by adding a new
"module" type for defining structures holding type_infos. Not
currently generated or output.
llds, opt_debug, opt_util, vn_type, vn_cost, vn_temploc:
Change the argument of temp to be a reg, not an int, allowing
floating point temporaries.
vn_type:
Add information about the number of floating point registers and
temporaries to the parameter structure (these are currently unused).
llds, dupelim, frameopt, livemap, middle_rec, value_number, vn_filter,
vn_verify:
Add an extra field to blocks giving the number of float temporaries.
options:
Add parameters to configure the number of floating point registers
and temporaries.
mercury_compile:
Add an extra excess assign phase at the start of the middle pass.
This should reduce the size of the code manipulated by the other
phases, and gives more accurate size information to inlining.
(The excess assign phase before code generation is I think still
needed since optimizations can introduce such assignments.)
value_number:
Optimize code sequences before and after assignments to curfr
separately, since such assignments change the meaning of framevars.
This fixes the bug that caused singleton variable warnings to contain
garbage.
vn_block, vn_flush, vn_order, vn_util:
Add special handling of assignments to curfr. This is probably
unnecessary after my change to value_number, and will be removed
again shortly :-(
vn_flush:
Improve the code generated by value numbering (1) by computing values
into the place that needs them in some special circumstances, and
(2) by fixing a bug that did not consider special registers to be
as fast as r1 etc.
vn_util:
Improve the code generated by value numbering by removing duplicates
from the list of uses of a value before trying to find out if there
is more than one use.
simplify:
Avoid overzealous optimization of main --> { ..., error(...) }.
handle_options:
Fix an error message.
code_aux:
Break an excessive long line.
Estimated hours taken: 1.5
Split llds into two parts. llds.m defines the data types, while llds_out.m
has the predicates for printing the code.
Removed the call_closure instruction. Instead, we use calls to the
system-defined addresses do_call_{det,semidet,nondet}_closure. This is
how call_closure was implemented already. The advantage of the new
implementation is that it allows jump optimization of what used to be
call_closures, without new code in jumpopt.
Estimated hours taken: 20
vn_block:
Fix a typo which reflected a fundamental design error. When finding
cheaper copies of live lvals, for use in creating specialized copies
(parallels) of blocks jumped to from the current location, we used
to use the map reflecting the contents of lvals at the start of the
block, not at the point of the jump.
--pred-value-number, which uses the information computed by the
buggy predicate, actually bootstrapped some time ago despite
this fundamental bug!
value_number:
Fix a bug in the creation of parallel code sequences for computed
gotos. Add some more opprtunities for printing diagnostics.
Move code concerning final verification to vn_verify.
vn_verify:
Move the remaining code concerned with final verification from
value_number to vn_verify.
peephole:
Add a new pattern, which transforms the sequence
incr_sp N; goto L2; L1; incr_sp N; L2
into just
L1; incr_sp N; L2
The pattern is of course more broadly applicable, but I have seen
it only when it involves a single incr_sp between the two labels.
(The longer pattern can be introduced by frameopt.)
opt_util:
Look inside blocks when checking whether an instruction can fall
through. This improves the performance of labelopt.
vn_table:
Make the type vn_table abstract; add, export and use access functions.
vn_util:
Remove a noop predicate, since now it won't ever be made to do
anything.
vn_cost:
Refine debugging output.
vn_debug:
Add some more debugging routines.
opt_debug:
Add some more debugging routines.
det_analysis:
Remove an unused argument.
labelopt:
Formatting change.
A Constraint Solver Interface For Mercury
<thunderous applause>
Estimated hours taken: 1 summer studentship
This is the implementation of a fairly general constraint solver interface. If
using a library grade *.cnstr, we emit C instructions to keep track of the
solver's implicit state. This is done by storing and restoring 'tickets' -
abstract handles on the solver's state.
We emit a store_ticket() macro:
-when entering the first disjunct of a disjunction
-when entering the condition of an if-then-else
We emit a restore_ticket() macro:
-when entering a disjunct other than the first of a disjunction
-when entering the else part of an if-then-else
We emit a discard_ticket() macro:
-after the restore_ticket() in the final disjunct of a disjunction
-at the start of the 'then' part of an if-then-else
The rules for emitting the macros is slightly more complicated than that shown
above for if-then-elses (determinism of the parts must be taken into account).
compiler/code_info.m:
Get an llds store_ticket/restore_ticket etc. instruction
compiler/disj_gen.m:
Emit ticket macros in the appropriate places in a disjunction.
compiler/dupelim.m:
Handle the new llds instruction.
compiler/frameopt.m:
Handle the new llds instruction.
compiler/handle_options.m:
If the grade is *.cnstr, set the constraints option on.
compiler/ite_gen.m:
Emit ticket macros in the appopriate places in an if-then-else.
compiler/livemap.m:
Handle the new llds instruction.
compiler/llds.m:
Output the ticket macros.
compiler/make_hlds.m:
An irrelevant tidy-up.
compiler/mercury_compile.pp:
If the grade is *.cnstr, pass -DCONSTRAINTS to mgnuc
compiler/middle_rec.m:
Handle the new llds instruction.
compiler/opt_*.m:
Handle the new llds instruction.
compiler/options.m:
Introduce a new boolean option 'constraints'.
compiler/shapes.m:
Output a new shape - 'ticket'.
compiler/unify_proc.m:
Handle the new llds instruction.
compiler/v*.m:
Handle the new llds instruction.
Estimated hours taken: 1.5
Undo dylan's changes in the names of some library entities,
by applying the following sed script
s/term_atom/term__atom/g
s/term_string/term__string/g
s/term_integer/term__integer/g
s/term_float/term__float/g
s/term_context/term__context/g
s/term_functor/term__functor/g
s/term_variable/term__variable/g
s/_term__/_term_/g
s/std_util__bool_/bool__/g
to all the `.m' and `.pp' files in the compiler and library directories.
The reason for undoing these changes was to minimize incompatibilities
with 0.4 (and besides, the changes were not a really good idea in the first
place).
I also moved `bool' to a separate module.
The main reason for that change is to ensure that the `__' prefix is
only used when it genuinely represents a module qualifier.
(That's what dylan's changes were trying to acheive, but `term__'
does genuinely represent a module qualifier.)
compiler/*.m:
Apply sed script above;
where appropriate, add `bool' to the list of imported modules.
Estimated hours taken: _2___
Change names with badly placed double underscores (ie where the part of
a name before a double underscore is not the same as the module name.)
Reflect changes in the library interface.
compiler/*:
Use the newer, more correct form of the term and bool names.
Predicates "bool__" are now "std_util__bool" and labels of
the term ADT are now "term_" instead of "term__".
compiler/vn*.m:
change all names "vn__*" to a correct module prefix. All the
names remain qualified.
compiler/hlds.m:
s/\<is_builtin__/hlds__is_builtin_/g
s/\<dependency_info__/hlds__dependency_info_/g
compiler/unify_proc.m:
s/\<unify_proc_info__/unify_proc__info_/g
compiler/transform.m:
s/\<reschedule__conj/transform__reschedule_conj/g
det_analysis, det_report:
Split the old det_analysis module, which was getting too big,
by moving the error diagnosis predicates to a new module.
value_number:
Convert each if statement that contains one of the boolean operators
{and, or, not} at the top level to eliminate the operator, introducing
additional if statements if necessary. The reason that this is a good
idea is that
if_val(tag(r1) == 1 && field(1, r1, N) = X)
get transformed into two ifs, and the field reference can be extracted
as a common subexpression in an assignment between the two ifs, after
the primary tag has been tested. This is necessary to avoid an
unaligned memory reference. Before this change, we simply did not
optimize code sequences containing such ifs.
vn_order:
Prepare for an optimization (to come later this week) whereby if
a block contains multiple exit points with inconsistent bindings,
we can optimize the front part separately as well as the back part.
vn_debug:
Added a message to help me find the most profitable way to do the
above change.
opt_util, frameopt:
Moved some code for dealing with det procedure prologues from
frameopt to opt_util, since now value_number needs its also.
options:
Make tag_switch apply in more cases.
det_analysis:
Added some code to fix up disjunctions that have at most one solution.
We now transform a disjunction to an if-then-else only if the
disjunction is locally nondet. If the disjunction cannot fail, we
replace it with a disjunct that cannot fail and issue a warning;
we issue a warning in several other cases as well.
mercury_to_mercury:
Fix two duplicate fact bugs pointed out by the new det_analysis.
peephole:
Add a new optimization: a stack frame teardown followed by a
conditional branch to a label that builds a stack frame is now
replaced by code that starts with the conditional branch to the
code after the stack frame setup, and has the stack frame teardown
only in the fall through code. This optimization is applied only
after frameopt.
opt_util:
Export a previously internal predicate for use by peephole.
optimize, value_number:
Conform to the new interface of peephole.
frameopt:
Add some debugging code (now commented out) that helped in making
the above optimization.
prog_io:
Cosmetic changes.
The changes made allow declarations of the form:
:- pragma(c_code, predname(Varname1::mode1, Varname2::mode2, ...),
"Some C code to execute instead of a mercury clause;").
There are still a couple of minor problems to be fixed in the near future:
If there is a regular clause given as well as a pragma(c_code, ...) dec, it
is not handled well, and variables names '_' are not handled well.
prog_io.m:
parse the pragma(c_code, ...) dec.
hlds.m:
define a new hlds__goal_expr 'pragma_c_code'.
make_hlds.m:
insert the pragma(c_code, ...) dec. as a pragma_c_code into the hlds.
det_analysis.m:
infer that pragma_c_code goals are det.
modes.m:
convince the mode checker that the correct pragma variables are bound
etc.
quantification.m:
quantify the variables in the pragma(c_code, ...) dec.
code_gen.pp:
convert pragma_c_code into pragma_c (in the llds).
llds.m:
define a new instr, pragma_c. Output the pragma_c
hlds_out.m:
mercury_to_mercury.m:
mercury_to_goedel.m:
spit out pragma(c_code, ...) decs properly
*.m: handle the new pragma_c_code in the hlds or the new pragma_c in the llds
llds:
Optionally generate while (1) loops instead of short backward branches.
This is faster in the absence of fast jumps.
options:
Add a new option, --no-emit-c-loops.
middle_rec:
We now check if the LLDS code after the recursive call is empty.
If yes, we don't generate the downward loop.
code_aux:
Minor cleanup associated with previous change.
frameopt:
Instead of blindly assuming that any code before an if_val will be
able to fill the delay slot, we check whether it computes a value
that is used in the condition. We now also allow a slightly wider
range of user instructions to fill delay slots.
opt_util:
Some new preds to support the new funcionality in frameopt.
tag_switch:
Compute the tag of the switched-on value into a register at the
start, instead of computing it in each if_val.
jumpopt:
Added last call optimization for nondet predicates.
llds:
Added a new lval type to represent the succip slot of nondet
stack frames.
other files:
Changes required by the change to llds (there is a minor unrelated
change in vn_cost as well).
Tyson: please check my changes to code_info__get_shape_num and
garbage_out__write_liveval.
instructions, and the last argument from local labels. All these were
placeholders for info put in there by prof.m and used when emitting C
code.
The set of labels that serve as return points are now calculated in llds.m
just before each procedure has its C code generated. This set is passed to
output_instruction along with the label at the start of the procedure.
options, code_gen:
Add an option, --no-simple-neg, to disable the generation of
simplified code for simple negations, since sometimes the more
complex code is better (e.g. for queens) due to branch frequencies.
peephole, jumpopt:
Move the detection of tailcalls from peephole to jumpopt. This
allows us to avoid building some maps in peephole. The code in
jumpopt is also somewhat more general, but this is unlikely
to lead to better code.
opt_util:
Some changes to support the previous modifications. We also
allow framevars in code that looks for stackvars, since the
two kinds of variables can both occur in code that does commits.
optimize:
The main predicate of peephole has a new name, call it by that name.
Also remove Tom's comment asking for my inspection of his change.
value_number:
The main predicate of peephole has a new name, call it by that name.
Also loosen a too-tight sanity check.
frameopt:
Look inside blocks introduced by value numbering when looking
restorations of succip.
value_number, opt_util:
If we are using conservative garbage collection, disable value
numbering for blocks that allocate more than one cell on the heap.
This allows value numbering of most blocks to work in the absence
of -DALL_INTERIOR_POINTERS.
all other source files:
Clean up "blank" lines that nevertheless contain space or tab
characters.
opt_util:
In between incr_sp and decr_sp instructions that are being eliminated,
allow assignments to stackvars as long as the stackvars are not used
as input. These assignments are also eliminated.
This set of changes includes most of the work necessary for
mode and determinism checking of higher-order predicates.
prog_io.m:
Change the syntax for lambda expressions: they need
to have a determinism declaration. Lambda
expressions must now look like this:
lambda([X::in, Y::out] is det, ...goal...).
^^^^^^
Note that both the modes and the determinism are mandatory,
not optional.
hlds.m:
Insert a determinism field in the lambda_goal structure.
hlds_out.m, inlining.m, make_hlds.m, modes.m, polymorphism.m, quantification.m,
switch_detection.m, typecheck.m:
Modified to use lambda_goal/4 rather than lambda_goal/3.
prog_io.m:
Add a new field to the `ground' inst, of type `maybe(pred_inst_info)'.
We use this to store the modes and determinism of higher-order
predicate terms.
code_info.m, inst_match.m, mercury_to_mercury.m, mode_util.m, modes.m,
polymorphism.m, shapes.m, undef_modes.m:
Modified to handle higher-order pred modes:
use ground/2 rather than ground/1.
(Note that modes.m still requires a bit more work on this.)
llds.m:
Add a new field to the call_closure/3 instruction to hold the
caller address for use with profiling, since the C macros
require a caller address.
dup_elim.m, frame_opt.m, garbage_out.m, live_map.m, middle_rec.m, opt_debug.m,
opt_util.m, value_number.m, vn_*.m:
Modified to use call_closure/4 rather than call_closure/3.
mercury_to_mercury.m:
Export mercury_output_det for use by hlds_out.m.
frameopt:
Make the teardown map bidirectional, and export it.
peephole:
Add a new pattern to handle cases generated by fulljump optimization.
This pattern uses the teardownmap, but it is disabled for the moment.
optimize:
Pass the teardown map where it is needed, and make sure we do a
peephole pass immediately after frameopt to use the teardownmap
while it is still valid.
jumpopt:
Rename a variable.
labelopt:
A block being eliminated may have the last remaining reference
to the label starting another block. Therefore on the last invocation
of labelopt, we iterate to a fixpoint before returning.
opt_debug:
Add a predicate to help debug bidirectional teardown maps.
opt_util:
Liberalized some optimizations, but the changes are disabled for the
moment.
value_number:
Pass an empty teardown map to peephole. Loosen the sanity check on
tags a bit.
vn_order:
If the new value of a location depends on its old value,
avoid creating a circularity in the preferred order relation.
Such circularity may be broken arbitrarily, even though we have
a clear preference.
vn_flush:
Modify the criteria for saving the old value stored in a location
about to be overwritten, in an effort to eliminate useless copies
of old values.
vn_util:
Tighten the requirement for classifying a use as a "real" use,
also in the effort to eliminate useless saves of old values.
Recognize some more expression patterns as yieldsing known
results. Move some functionality from vn_flush to vn_util,
since it is needed by the other modification.
-------------------------------------------------------
Implement unique modes. We do not handle local aliasing yet, so this
is still not very useful, except for io__state. Destructive update is
not yet implemented. Also note that this really only implements
"mostly unique" variables that may be non-unique on backtracking - we
don't check that you don't backtrack over I/O, for example.
prog_io.m, mode_util.m, modes.m, inst_match.m:
Major changes to Handle unique modes.
mercury_to_mercury.m, polymorphism.m, prog_out.m, undef_modes.m:
Use `ground(Uniqueness)' rather than just `ground'.
compiler/*.m:
Fix compile errors now that unique modes are enforced: add a
few calls to copy/2, and comment out lots of unique mode
declarations that caused problems.
typecheck.m, mode_info.m:
Hack around the use of unique modes, which doesn't work
because we don't allow local aliasing yet: make the insts
`uniq_type_info' and `uniq_mode_info' not unique at all,
and add a call to copy/2 when extracting the io_state from
type_info or mode_info.
-------------------------------------------------------
Plus a couple of unrelated changes:
hlds.m:
Change the modes for the special predicates from `ground -> ground'
to `in', so that any error messages that show those modes
come out looking nicer.
Add a new shared_inst_table for shared versions of user-defined
insts.
mercury_to_goedel.m:
Use string__is_alnum_or_underscore.
typecheck:
Improved the format of the message about calls with wrong arity.
jumpopt, optimize:
A goto whose target is the predicate entry label is replaced by
the pointed-to code only if a flag is set; optimize sets the flag
only after value numbering and frameopt. This means the pointed-to
code is in better shape when it is "inlined".
peephole, opt_util:
When optimizing incr_sp/decr_sp pairs, allow a restoration of succip
between them to be optimized away. This works because the only way
this can happen is if the store of succip in its slot was promoted
before the incr_sp, and no calls may have been in the meantime,
so the original copy is still in succip.
frameopt, optimize:
Postponed the check for whether succip is ever restored, since
peephole may affect the decision.
follow_code:
We now push code from the outside context into this context before
pushing code from this context into nested contexts, since this may
give us more code to push. I also removed redundant references to
ModuleInfo.
prog_io:
Small formatting change.
options, optimize:
Add a new option, --optimize-fulljumps, which defaults on.
jumpopt, opt_util:
If --optimize-fulljumps is set, replace unconditional gotos with
the instruction sequence they point to. This not only avoids a jump
at runtime, but also increases basic block length and makes value
numbering more effective.
peephole:
Fulljump optimization can replace a recursive tailcall with the
initial part of the code of the procedure. Therefore peephole now
looks for a decr_sp followed by an incr_sp, and removes such pairs
from the instruction sequence.
frameopt:
Do not consider a decr_sp followed by an incr_sp to be a fatal error
(just in case peephole is switched off).
vn_block:
Fix a big tickled by fulljump optimization: maxfr, curfr and succip
were not required to be made up to date before an if_val exited
the extended basic block.
vn_util:
Simplify some more patterns of vnrvals. The extra patterns are
involved in testing conditions that are known to be true or false.
These patterns can arise when fulljump optimization replaces a
recursive tailcall.
code*.m & *gen.m:
Implement an improved method of handling negated contexts.
The new method avoids saving things onto the stack before
an if-then-else or negation if it can.
Also, fix the implementation of nondet if-then-else so that
it does the soft cut properly.
mercury_compile:
Sort the list of interface files before printing them to a .d file.
opt_util, peephole:
Fix a bug tickled by value numbering. Some sequences of code were
recognized as having no access to nondet stack control slots even
in the presence of such accesses, which lead to the incorrect
introduction of succeed_discards.
value_number:
Loosen the value correspondence sanity check, which was failing
needlessly, and tighten the tag sanity check, which was passing
incorrect code.
Do not try value numbering on blocks containing structures such as
"if (tag(x) == X && field(X, x, X) == X) goto X", since these will
definitely lead to tag sanity check violations.
vn_flush:
If a shared node has no uses left when flushed, leave it be.
When generating a mkframe, reflect its update of the top redoip slot
in the data structures.
vn_order:
Some hacks to get the relmaps partway to where I want them. This
code needs cleaning up.
vn_debug:
New debugging routines to support my changes to vn_order.
vn_type:
Deleted the vn_modframe vn_instr, since its role has been taken over
by assignments to redoip(maxfr).
opt_debug:
Reflect the change to vn_type, print address constants in vn_rvals,
and fix a typo.
vn_block, vn_util:
Reflect the change to vn_type.
frameopt, opt_util:
Attempt to fill delay slots with the instruction after an if_val
in preference to the saving of the succip.
optimize:
Fix a typo in earlier change.
value_number:
Check that the last node in the order is a control node.
vn_order:
If two registers or stackvars can be generated in any order,
prefer to generate them in numerical sequence for neatness.
vn_debug:
Add routine for printing the initial and final ordering of
unrelated nodes.
code_gen.pp:
Put the comment about the contents of stack slots before the initial
label, since this way it will be preserved by optimizations.
cse_detection.m:
Extended the search to look for cses in if-then-elses and switches
as well as disjunctions. Removed InstmapDelta from preds in which it
was not being used.
det_analysis.m:
Make the diagnosis routines more robust. The changes here avoid the
Philip's problems with lexical.m.
jumpopt.m:
Minor formatting changes.
livemap.m:
Avoid duplicating livevals instructions when optimizations are
repeated, since this can confuse some optimizations.
llds.m:
Minor documentation change.
make_hlds.m:
Minor formatting change.
mercury_compile.pp:
Do not map arguments to registers if any semantic errors have been
found.
middle_rec.m and code_aux.m:
Apply middle recursion only if tail recursion is not possible,
since tail recursion yields more efficient code.
opt_util.m:
Added a predicate to recognize constant conditions in if_vals.
Modified a predicate to make it better suited for frameopt.
optimize.pp:
Changed the way optimizations were repeated to allow better control.
Repeat peephole once more after frameopt, since the new frameopt
can benefit from this.
options.m:
Removed the --compile-to-c option, which was obsolete. Added an
option for predicate-wide value numbering, which is off by default.
Changed some of the default values of optimization flags to reduce
compilation time while holding the loss of speed of generated code
to a minimum.
peephole.m:
Look for if_vals whose conditions are constants, and eliminate the
if_val or turn it into a goto depending on the value of the constant.
Generalized the condition for optimizing incr_sp/decr_sp pairs.
value_number.m:
Added a prepass to separate primary tag tests in if-then-elses from
the test of the secondary tag, which requires dereferencing the
pointer.
Added sanity check routines to test two aspects of the generated code.
First, whether it produces the same values for the live variables as
the original code, and second, whether it has moved any dereferences
of a pointer before a test of the tag of that pointer. If either test
fails, we use the old instruction sequence.
vn_debug.m:
New messages to announce the failure of the sanity checks. They are
enabled by default, but of course can only appear if value numbering
is turned on (it is still off by default).
vn_flush.m:
Threaded a list of forbidden lvals (lvals that may not be assigned to)
through the flushing routines. When saving the old value of an lval
that is being assigned to, we use this list to avoid modifying any of
the values used on the right hand side of the assignment, even if the
saving of an old value results in assignment that requires another
save, and so on recursively.
When the flushing of a node_lval referred to a shared vn, the uses of
the access vns of the node_lvals were not being adjusted properly.
Now they are.
vn_order.m:
The ctrl_vn phase of the ordering was designed to ensure that all
nodes that need not come before a control node come after it. However,
nodes were created after this phase operated, causing leakage of some
value nodes in front of control nodes. Some of these led to pointer
dereferences before tag tests, causing bus errors. The ctrl_vn phase
is now last to avoid this problem.
vn_table.m:
Added an extra interface predicate to support the sanity checks in
value_number.
vn_util.m:
The transformation of c1-e2 into (0-e2)+c1 during vnrval simplification
could lead to an infinite loop in the compiler if c1 was zero. A test
for this case now prevents the loop.
switch_detection:
Detect partial switches, i.e. disjunctions in which not all
disjuncts form part of the switch. We give preference to full
switches, and failing that, to partial switches with the most arms.
peephole, opt_util:
Fixed the code for the introduction of succeed_discard.
code_gen:
Fixed spelling error in error message.
code_info:
Made error message somewhat more informative.
cse_detection:
Removed debugging code; we now always repeat cse detection after
finding some cses.
det_analysis:
Added some comments.
value_number, vn_debug, vn_flush:
Changes to make debugging easier.
code_info.m:
Bug fix: change generate_pre_commit and generate_commit so that
the values which need to be saved and restored are always pushed
onto the det stack, even in nondet predicates. The reason is
that if the committed goal fails, curfr is not valid, so we
can't restore the fields from the nondet stack.
(This way may well be more efficient anyway.)
disj_gen.m, ite_gen.m:
Handle the case when the current failure continuation is unknown
on entry to the disjunction or nondet if-then-else by creating
a new frame on the nondet stack. (Originally we just aborted
in this case; recently we "fixed" this, but it turned out that
the fix was not correct, for the same reason as the above-mentioned
bug in pre_commit/commit.
llds.m:
Add succfr/1 and prevfr/1 to the rval type in llds.m,
since they were needed by the above bug fixes.
(This caused dozens of changes elsewhere to handle the
new types.)
Also fix a trivial bug that I recently introduced which
prevented --mod-comments from working.
live_vars.m:
Fix bug in allocation of stack slots for nondet code.
(This is the one that caused the bug that ksiew and I found
when writing a calculator program.)
peephole.m:
Disable the succeed_discard() optimization, since it
causes incorrect code to be generated. It was replacing
modframe(do_fail) ... succeed() with
modframe(do_fail) ... succeed_discard() even when there were
instructions such as mkframe() in between.
modes.m, hlds.m:
When modechecking switches, record the binding of the switch variable
as we enter each case, so that we get the determinism analysis
right.
mercury_compile.pp:
Make sure that we set the exit status to be non-zero if we
find any errors.
typecheck.m, modes.m, undef_types.m, undef_modes.m:
Don't invoke type-checking if there are undefined types.
Don't invoke mode-checking if there are undefined modes.
This avoids the problem of the compiler aborting with an
internal error if there are undefined types/modes.
compiler/*:
Add copyright messages.
Change all occurences of *.nl in comments to *.m.
compiler/mercury_compile.pp:
Change the output to the .dep files to use *.m rather than *.nl.
(NOTE: this means that `mmake' will not work any more if you
call your files *.nl!!!)
opt_util.m, exprn_aux.m:
label_opt was erroneously optimizing away code which was not dead,
because it was not considering references to the code via
address_consts in rvals. So I've changed opt_util__instr_labels
to return those labels too, and added some new utility preds in
exprn_aux to deal with that.
exprn_aux.m:
Remove expr_aux__expr_is_constant/1, since it was not used
(and it was broken anyway).
io.nl:
Introduced a new predicate which ignore's any whitespace in the input.
Needs to have all the whitespace character's added to it.
*.nl and *.pp:
Changed the implementation of time profiling. Now during a compile,
the compiler identifies all the internal labels which can be accessed
externally, and marks them. At the moment, these are the continuation
labels of calls and the next disjunct in nondet disjunctions. Then
at the .mod output, it places a macro 'update_prof_current_proc' to
restore the profiling counter.
unify_gen:
Whenever we do a test of a variable against a non-constant functor,
we now try to turn it into a negated test on a constant functor.
This is possible if these two functors are the only ones.
code_aux:
Added an extra predicate to look up type definitions to make the
previous change easier.
llds, code_gen, opt_util, opt_debug, frameopt, jumpopt, peephole:
Added a boolean argument to do_succeed to say whether the nondet
frame should be discarded on success or not. The default is no,
but peephole has an optimization that tries to turn on this flag.
optimize, value_number, vn*:
Restructured the top level of value numbering as part of an effort
to identify blocks that could be optimized further given our knowledge
that the contents of e.g. stackvars is also in registers when we
jump to those blocks. Redone the interface between value_number and
frameopt to allow value_number to be iterated, which is necessary
to take advantage of the previously mentioned capability. Threated
the I/O state through the relevant predicates; value numbering doesn't
use non-logical I/O any more.
llds.nl:
Introduced an extra argument to the LLDS goto. It is the label
address of the Caller and is used for the profiling of tailcall's.
*.nl and *.pp:
Propagated the extra argument to all the appropiate files.
llds, code_info, opt_*, vn*:
Replaced curredoip with redoip(rval) to make references to other
redoips more efficient. Also, by turning modframe(L) into
redoip(curfr) = const(address_const(L)), value_number can now
optimize hijacking code better.
vn*:
If a disagreement on the desired value of an lvalue prevents value
number, try again after skipping to the first control point, since
this may cure the problem.
peephole, opt_util:
Now looking for successive modframes to optimize out.
disj_gen:
Put deterministic alternatives before others, mainly to make
the back mode of append easier to explain in the paper. :-(
mode_util:
Fixed scope error.
garbage_out:
Fixed some spelling and formatting errors.
code_info.nl:
Fix bug in choice-point hijacking.
llds.nl, opt_util.nl:
Add `curfr' register, needed for the above fix.
(XXX also need to add this in opt_debug.nl and vn*.nl.)
value_number, vn_util, opt_util, opt_debug:
Fixed a bug with allowed incr_hp to overwrite its target without saving
it. Reorganized the handling of incr_sp and decr_sp to make sure they
never get reordered with respect to control flow instructions.
llds:
Fixed the output of temp declarations for blocks.
frameopt:
Generalized the set up patterns accepted as starting a det procedure.
labelopt:
Added a source-level option to remove eliminated instructions instead
of turning them into comments, and made it the default.
optimize:
After value numbering, perform jump optimization as well as peepholing
and label optimization.