Estimated hours taken: 1
options:
Divide --inlining into --inline-simple, for inlining all procedures
with simple definitions (the current practice), --inline-single-use
for inlining all procedures called exactly once, and --inline-threshold
for specifying an upper bound on the product of the number of calls
and the size of the procedure definition (roughly the number of
connectives).
The --inline-single-use option is off by default until the problem with
parse_dcg_goal_2 is fixed.
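As a rough illustration of how the three options might combine, here is a minimal Python sketch. The function name, its parameters, the way the three conditions are or-ed together, and the inclusive comparison are all assumptions made for illustration; only the rule that the threshold bounds the product of call count and definition size comes from the description above. This is not the code in inlining.m.

```python
def should_inline(is_simple, num_calls, goal_size,
                  inline_simple=True, inline_single_use=False,
                  inline_threshold=0):
    """Hypothetical decision function for the three inlining options."""
    # --inline-simple: procedures with simple definitions.
    if inline_simple and is_simple:
        return True
    # --inline-single-use: procedures called exactly once (off by default).
    if inline_single_use and num_calls == 1:
        return True
    # --inline-threshold: upper bound on (number of calls) * (goal size).
    # Treating the bound as inclusive is an assumption.
    return num_calls > 0 and num_calls * goal_size <= inline_threshold
```

For example, with a threshold of 12, a procedure of size 4 called three times would qualify, while one of size 5 called three times would not.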
inlining:
Implement the new options.
goal_util:
Added a predicate for computing the size of a goal.
mercury_compile:
Call inlining if any one of three options is set.
call_gen:
Remove an obsolete comment (all of three hours old :-)
Estimated hours taken: 8
Bug fixes for higher_order.m and unused_args.m
NEWS
Removed the message about bugs in unused_args.m and higher_order.m
compiler/options.m
Re-enabled higher_order and unused_args.
compiler/unused_args.m
Fixed so that this now handles partially instantiated
deconstructions correctly.
compiler/higher_order.m
Two bug fixes:
Specialization of types for specialized versions of predicates.
Fixed handling of curried arguments.
compiler/inlining.m, compiler/type_util.m:
Moved inlining:apply_substitution_to_type_map and
inlining:apply_rec_substitution_to_type_map to type_util.m
for use in the higher_order.m bug fix.
library/varset.m
Added predicate varset__new_vars which returns a list of new
variables.
library/term.m
Added predicates term__apply_variable_renaming(_to_list)
to apply a variable renaming (map(var, var)) to a term
or list of terms.
library/map.m
Added map__det_insert_from_corresponding_lists to insert
multiple key-value pairs into a map.
tests/valid/{Mmake, higher_order2.m, higher_order3.m, unused_args_test2.m}
Tests for the bug fixes.
Estimated hours taken: 8
options.m:
Rename branch_delay_slot to have_delay_slot.
Set optimize_delay_slot in -O2 only if have_delay_slot was set earlier.
This is possible now because the default optimization level is now
set in mc.
mercury_compile:
Change verbose output a bit to be more consistent.
dead_proc_elim:
Export the predicates that will eventually be needed by inlining.m.
inlining.m:
Use the information about the number of times each procedure is called
to inline local nonrecursive procedures that are called exactly once.
EXCEPT that this is turned off at the moment, since the inlining of
parse_dcg_goal_2 in prog_io, which this change enables, causes the
compiler to emit incorrect code.
prog_io:
Moved the data type definitions to prog_data. (Even though prog_io.m
is ten times the size of prog_data.m, the sizes of the .c files are
not too dissimilar.)
Estimated hours taken: 1.5
dead_proc_elim:
Count the number of references to each predicate if that predicate
is a candidate for inlining.
options:
Enable --optimize-delay-slots for -O2 only if the machine architecture
actually has branch delay slots.
inlining, modules:
Fix the copyright notice.
Estimated hours taken: 3
options:
Add a new option, --branch-delay-slot, intended for use by mc on
the basis of the configuration script. It says whether the machine
architecture has delay slots on branches.
The setting of this option should affect whether we set
--optimize-delay-slots at -O2, but this doesn't work yet.
hlds_goal:
Add an extra field to hold follow_vars information to disjunctions,
switches and if-then-elses. I intend to use this information to
generate better code.
*.m:
Changes to accommodate the extra field.
Estimated hours taken: 16
options:
Replace the word_size option with the two options bits_per_word and
bytes_per_word. The former is needed by lookup_switch, the latter by
value numbering.
lookup_switch:
Use the new option instead of word_size.
vn_type, vn_cost, vn_block, value_number:
Add a new type, vn_params, containing information such as the number
of bytes per word (from the option) and cost parameters. Use these
cost parameters to make more realistic decisions.
vn_filter:
New module to filter out unnecessary uses of temporary variables,
which gcc does unnecessarily badly on.
value_number, vn_verify:
Move verification completely to vn_verify. Tighten the verification
rules relating to tags where it concerns code sequences in which
the tag of an rval is taken in a statement before an if_val, but
loosen them to avoid spurious rejections of code sequences containing
arithmetic comparisons. Fix some missing cases from semidet switches
that may have led to overly conservative decisions.
value_number, vn_order:
Vn_order was making an overly conservative assumption about where
to split an extended basic block if it couldn't be optimized together.
Move the decision to value_number and try to make it better. The new
heuristic is not enabled yet.
vn_debug:
Change the conditions under which one type of message is printed.
vn_flush:
Wrap some too long lines.
llds:
Fix a bug that would prevent profiling from working correctly on
value numbered code: we weren't scanning instructions inside blocks
when looking for return addresses.
peephole:
Enable an optimization previously left disabled by accident.
switch_detection, tag_switch:
Eliminate an unused argument.
Estimated hours taken: 3
compiler/options.m:
Changed to accommodate the recent change to getopt.m.
Added some new options:
--reorder-conj, --reorder-disj, --fully-strict;
--strict-sequential (== previous three);
--reclaim-heap-on-failure (== reclaim-heap-on-semidet-failure
plus reclaim-heap-on-nondet-failure);
--everything-in-one-c-function (== --procs-per-c-function 0)
Reorganized the handling of --opt-level to avoid duplication in
the table of optimization levels. Added new optimization levels -1
and 6, documented the meaning of each option level, and changed
the options set by the various optimization levels.
XXX TODO: update the user documentation to reflect the above changes.
Estimated hours taken: 0.25
compiler/options.m:
Re-enable excess_assign by default. The problem with it (it
broke the C interface) was fixed a long time ago.
Estimated hours taken: 2
Do some more work on improving floating-point performance:
emit boxed floating point constants as static ground terms.
options.m:
Add new option --unboxed-float.
exprn_aux.m
Add --unboxed-float to the `exprn_opts' that affect whether
or not things can be static constants. If --unboxed-float
is not set, and --static-ground-terms is, then consider
float_consts to be constant.
code_exprn.m, lookup_switch.m:
Trivial changes to handle new arity of exprn_opts type.
llds.m:
If --unboxed-float is not set, and --static-ground-terms is, then
output `static const Float mercury_float_const_...' declarations
for float_consts.
Estimated hours taken: 6
NEWS:
Documented the changed interfaces to list, std_util and graph.
configure.in:
Added the number of bytes per word (calculated as sizeof(void *))
as a configuration variable.
compiler/goal_util.m:
Add an optional sanity check for ensuring that all variables in a goal
get renamed in goal_util__rename_vars_in_goal[s].
Also fixed a bug in goal_util__create_variables which was giving
wrong names to some variables (which led to very confusing hlds dumps).
compiler/excess.m:
in the calls to goal_util__rename_vars_in_goals add the bool which
indicates that we do not want to do the sanity checking operation of
making sure that *all* variables get renamed.
compiler/inlining.m:
in the calls to goal_util__rename_vars_in_goals add the bool which
indicates that we do want to do the sanity checking operation of
making sure that *all* variables get renamed.
Also fixed up calls to goal_util__create_variables for the bug fix
described above.
compiler/quantification.m:
in the calls to goal_util__rename_vars_in_goals add the bool which
indicates that we do not want to do the sanity checking operation of
making sure that *all* variables get renamed.
Also fixed up calls to goal_util__create_variables for the bug fix
described above.
compiler/lookup_switch.m:
changed lookup_switch to use a configuration option "word_size" to
find out the number of bytes (and hence the number of bits) per
word, rather than having a magic number.
compiler/options.m:
added "word_size" for the number of bytes per word. Defaults to 4,
but my next checkin will add a configuration parameter to mc.in.
Don't port to any 16 bit machines in the next couple of days. ;-)
also changed req_density to dense_switch_req_density and added
lookup_switch_req_density for the minimum density of lookup switches.
compiler/switch_gen.m:
changed req_density to dense_switch_req_density and
lookup_switch_req_density appropriately.
library/graph.m:
Add lots of comments.
Fix the interface to make it more consistent.
Fixed some bugs.
library/list.m:
Added some HO stuff from philip:
list__filter/3, list__filter/4
list__filter_map, list__sort/3 (takes a cmp predicate).
Moved the HO interface stuff into the interface at the
top of the file.
Removed list__map_maybe/3.
library/std_util.m:
added a pair/3 predicate from philip for avoiding type ambiguities
when using -/2.
added maybe_pred/3.
doc/user_guide.texi:
added documentation for the changes to the command line options.
A Constraint Solver Interface For Mercury
<thunderous applause>
Estimated hours taken: 1 summer studentship
This is the implementation of a fairly general constraint solver interface. If
using a library grade *.cnstr, we emit C instructions to keep track of the
solver's implicit state. This is done by storing and restoring 'tickets' -
abstract handles on the solver's state.
We emit a store_ticket() macro:
-when entering the first disjunct of a disjunction
-when entering the condition of an if-then-else
We emit a restore_ticket() macro:
-when entering a disjunct other than the first of a disjunction
-when entering the else part of an if-then-else
We emit a discard_ticket() macro:
-after the restore_ticket() in the final disjunct of a disjunction
-at the start of the 'then' part of an if-then-else
The rules for emitting the macros are slightly more complicated than those shown
above for if-then-elses (determinism of the parts must be taken into account).
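The disjunction rules above can be sketched as follows. This is a minimal Python illustration, not the code in disj_gen.m: the function name and the list-of-lists return shape are hypothetical, and the sketch assumes a disjunction with at least two disjuncts. If-then-elses are left out because, as noted, their rules also depend on determinism.

```python
def disjunction_ticket_ops(num_disjuncts):
    """Return, per disjunct, the ticket macros emitted on entering it."""
    if num_disjuncts < 2:
        raise ValueError("sketch assumes at least two disjuncts")
    # First disjunct: save the solver state.
    ops = [["store_ticket"]]
    for index in range(1, num_disjuncts):
        # Any later disjunct: restore the saved solver state.
        entry = ["restore_ticket"]
        if index == num_disjuncts - 1:
            # Final disjunct: the ticket is no longer needed.
            entry.append("discard_ticket")
        ops.append(entry)
    return ops
```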
compiler/code_info.m:
Get an llds store_ticket/restore_ticket etc. instruction
compiler/disj_gen.m:
Emit ticket macros in the appropriate places in a disjunction.
compiler/dupelim.m:
Handle the new llds instruction.
compiler/frameopt.m:
Handle the new llds instruction.
compiler/handle_options.m:
If the grade is *.cnstr, set the constraints option on.
compiler/ite_gen.m:
Emit ticket macros in the appropriate places in an if-then-else.
compiler/livemap.m:
Handle the new llds instruction.
compiler/llds.m:
Output the ticket macros.
compiler/make_hlds.m:
An irrelevant tidy-up.
compiler/mercury_compile.pp:
If the grade is *.cnstr, pass -DCONSTRAINTS to mgnuc
compiler/middle_rec.m:
Handle the new llds instruction.
compiler/opt_*.m:
Handle the new llds instruction.
compiler/options.m:
Introduce a new boolean option 'constraints'.
compiler/shapes.m:
Output a new shape - 'ticket'.
compiler/unify_proc.m:
Handle the new llds instruction.
compiler/v*.m:
Handle the new llds instruction.
Estimated hours taken: 8
mercury_compile:
Fix the pass structure, and start using a loose sequence of stage
numbers, to make it easier to add new stages without having to fiddle
stage numbers.
THIS DOES MEAN THAT ALL STAGE NUMBERS HAVE CHANGED NOW.
The stage number assignment scheme assigns 1 to 25 to the front end,
26 to 50 to the middle passes, and 51 to 99 to the back end.
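The numbering scheme can be written down as a small lookup. This Python sketch only restates the ranges given above; the function name and the returned strings are hypothetical.

```python
def stage_group(stage):
    """Map a stage number to the part of the compiler it belongs to."""
    if 1 <= stage <= 25:
        return "front end"
    if 26 <= stage <= 50:
        return "middle passes"
    if 51 <= stage <= 99:
        return "back end"
    raise ValueError("stage number out of range: %d" % stage)
```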
hlds:
We had two types that combined a pred_id and a proc_id. One,
pred_proc_id, used a simple pair; the other, procedure_id, had a better
definition using a specific function symbol but was not used
anywhere else. I standardized on the name pred_proc_id, but using the
definition with a dedicated function symbol (proc).
I also defined a type pred_proc_list as a list of pred_proc_id.
To prepare for memoing, I added a new field to pred_info, which is
a list of markers, each requesting a specific transformation on the
predicate or indicating that the transformation has been done.
The inline request is now represented using such a marker. However,
the interface is backwards compatible.
constraint, dead_proc_elim, dependency_graph, det_analysis, det_report,
higher_order, unused_args:
Changes to conform to the new definition of pred_proc_id.
In two places removed definitions of predproclist, whose
equivalent pred_proc_list is now defined in hlds.m.
hlds_out, make_hlds, mercury_to_mercury, prog_io:
Add code to handle memo pragma declarations, using whenever possible
a version of the existing code for handling inline requests, but
generalized for handling any pragma that sets a marker.
switch_detection:
Rename the type cases_list to sorted_cases_list. This avoids a
name clash that creates a duplicate label and therefore screws up
the profiler, and is a better name anyway.
options:
Add a new option, --opt-space, that turns on optimizations that save
space and turns off optimizations that squander space.
handle_options:
Pass the special option handler to getopt.
frameopt:
For each labelled code sequence that tears down the stack frame but
does not use it, we used to create a parallel code sequence that omits
the teardown code, for use by gotos from locations that did not have
a stack frame. However, peepholing may discover that it is better
not to tear down the stack frame at the site of the goto after all,
so we need the original code sequence as well.
The current change fixes a bug that occurs if the original code
sequence is modified by another part of frameopt to omit teardown
code. In such cases, which are produced by --pred-value-number,
peepholing redirects a goto to a code sequence that it thinks tears
down the stack frame, but actually doesn't.
With this change, --pred-value-number now works.
llds:
Fix typos in a comment.
Estimated hours taken: 0.25
compiler/options.m:
Disable --optimize-higher-order, since it is buggy.
Ensure that neither it nor --optimize-unused-args, which is
also buggy, is enabled by any --optimization-level.
Move the --high-level-c option to the "compilation model options"
section.
Estimated hours taken: 3
options.m:
Add a new option category "Link options".
Split the "Optimization options" category into sub-categories
"High-level (HLDS->HLDS)", "Medium-level (HLDS->LLDS)",
"Low-level (LLDS->LLDS)", and "Output-level (LLDS->C)" options.
Reorganize the placing of options so that each option is
in the correct category.
Add more verbose synonyms for some options: `--arg-convention'
for `--args', and `--optimization-level' for `--opt-level'.
Comment out the unused `--specialize' and `--optimize-copyprop'
options.
Change the short option version of `--line-numbers' from `-l'
to `-n'.
Rename `--optimize' as `--llds-optimize'.
options.m, mercury_compile.pp:
Rename `--optimize-dead' as `--optimize-dead-procs'.
Change the `--cflags' and `--link-flags' options from
`string' to `accumulating'.
Add new options `-l' (`--library'), `-L' (`--library-directory'),
and `--link-object'.
Estimated hours taken: 1.5
code_exprn:
Distribute the initial comments among the declarations of the exported
predicates. This makes it much less likely that the declarations will
be modified without changes in the comments. Since this has happened
in the past, some predicates are now without comments.
Changed code_exprn__place_var to prefer to get even a constant term
from a location if it has been produced before, and factor out some
code that is shared between the handling of cached and evaled
expressions.
code_exprn, code_info:
Removed an unnecessary argument from code_exprn__get_varlocs.
dead_proc_elim:
Changed the predicate name prefix from dead__ to dead_proc_elim__
to conform to notes/CODING_STANDARDS.
handle_options:
Remove an inappropriate comment.
jumpopt:
Filter out redundant livevals whether --optimize-fulljumps is given
or not. (I thought they aren't created if the option isn't given,
but they are.)
options:
Change the meaning of -O from --c-optimize to --opt-level.
Disabled unused args until the bug is fixed.
Estimated hours taken: 2
code_exprn:
When we are processing the flushing of create expressions, make sure
the Lval we are creating into isn't a field reference. This avoids
deep field of field of field of ... nesting. It does introduce
references to high register numbers, but this is a lesser evil,
and Tom and I plan to fix this anyway.
arg_info, globals, options:
Change --args old to --args simple.
options:
Make some help messages more specific.
code_aux, code_exprn, code_info, det_report, make_hlds, mercury_to_goedel,
prog_io, typecheck:
Changes to accommodate the move from varset__lookup_name
to varset__search_name.
Estimated hours taken: 2
exprn_aux:
Both code_exprn and lookup_switch had code to check whether an
expression is constant or not. Some of the code is different
due to different handling of variables in rvals, but exprn_aux
now contains the common subset.
This common subset used to treat some address constants incorrectly,
simply by not considering them; they are now considered and treated
properly.
code_exprn, lookup_switch, exprn_aux:
Remove redundant option lookups in the process of checking for
constant expressions.
code_exprn:
Other minor cleanups, including removal of a block of code Tom
says was "deep magic" (but which turns out to be unnecessary).
code_info:
Removed some dead code.
options:
Added real support for --opt-level, in the form of a table of
default values of options for each optimization level between
0 and 5 (both inclusive). This needs a new form of documentation.
How do you do tables in texinfo?
Estimated hours taken: 6
arg_info:
Add support for the compact argument passing convention. The proper
handling of higher order calls etc is still missing.
globals:
Added a third global type, args_method, currently either "old" or
"compact".
passes_aux:
Moved some auxiliary predicates from mercury_compile and options
to passes_aux.
constraint, det_analysis, make_hlds, modules, optimize, undef_types:
Import and refer to passes_aux.
mercury_compile, handle_options:
Remove the predicates moved to passes_aux. Also move the option
postprocessing code to a new module, handle_options.
Another change is that we stop after syntax errors if the new option
--halt-at-syntax-errors is set.
handle_options:
New module for option postprocessing.
options:
Remove the option lookup predicates, which were obsolete.
Add new options --args, --halt-at-syntax-errors and --opt-level.
Add a special handler for --opt-level.
lookup_switch:
Call getopt to look up options, not options.
value_number, vn_block:
Extended basic blocks with more than one incr_hp pose a problem for
value numbering when using boehm_gc, because value numbering coalesces
all the allocations into one. Previously we did not optimize such
sequences. I modified value numbering to divide up such blocks into
smaller blocks, each with at most one incr_hp, and optimize these.
At the moment, some of these blocks contain deeply nested field
refs, which value numbering is very slow to handle; code_exprn
should be modified to fix these.
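The block-splitting idea can be sketched in Python as below. The instruction representation, the function name and the choice to cut immediately before each further incr_hp are assumptions for illustration; value_number and vn_block may choose the split points differently.

```python
def split_at_incr_hp(instrs, is_incr_hp):
    """Split a block so that each piece holds at most one incr_hp."""
    blocks = []
    current = []
    seen_alloc = False
    for instr in instrs:
        if is_incr_hp(instr) and seen_alloc:
            # A second allocation would land in this piece: cut here.
            blocks.append(current)
            current = []
            seen_alloc = False
        current.append(instr)
        seen_alloc = seen_alloc or is_incr_hp(instr)
    if current:
        blocks.append(current)
    return blocks
```

Each resulting piece can then be handed to value numbering on its own, so no two allocations are ever coalesced.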
value_number:
Rename usemap to useset, since this is more accurate.
Fixed a bug in --pred-value-number, which manifested itself as
the generation of duplicate labels.
labelopt:
Rename usemap to useset, since this is more accurate.
Estimated hours taken: 5.0
Added a new kind of switch generation which generates array lookups
for dense switches that output constants.
compiler/code_gen.pp:
The interface to switch_gen__generate_switch changed - we now pass
the goal-info which gets used in lookup_switch.
compiler/dense_switch.m:
export dense_switch__calc_density/3 and dense_switch__type_range/5
which get used by lookup_switch.
compiler/options.m:
Added a new option "lookup-switch-size" which is the minimum number
of cases that should be in a switch before we turn it into a lookup
table. Currently, it defaults to 4 which is the same value as used
for the "dense-switch-size" option. Some experimentation may show
a better value.
Also fixed the option names for "dense-switch-size" and
"string-switch-size" which were "...switch_size".
compiler/switch_gen.m:
switch_gen__generate_switch/Lots now takes the hlds__goal_info as
one of its arguments, because the goal-info is needed by lookup-
switches.
Also, in switch_gen__generate_switch/Lots, check to see if a switch
can be turned into a dense lookup table, and turn it into one if it
can.
compiler/lookup_switch.m:
A new module that turns switches into lookup tables if the outputs
of the switch are all constants. It does this by generating code
for each of the cases and checking that no code actually got generated
and that all the outputs were constants. The result is that for many
predicates like char_to_int/2, etc instead of a computed goto with
lots and lots of silly trivial cases, we get a simple lookup. This
is good.
There is a case where it may not be a win - if the cost of the range
check and the bitvector lookup outweighs the cost of the jumps that
would otherwise take place.
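To make the range check and bitvector lookup concrete, here is a minimal Python sketch of a lookup table for a switch whose outputs are all constants. The names, the percentage density formula and the default density of 25 are assumptions for illustration; only the default minimum size of 4 comes from the text above. None stands for switch failure, so the sketch cannot represent a case whose constant is itself None.

```python
def build_lookup_table(cases, min_size=4, req_density=25):
    """cases maps integer keys to constants; return a table or None."""
    if not cases or len(cases) < min_size:
        return None
    low, high = min(cases), max(cases)
    span = high - low + 1
    # Density as a percentage of occupied slots (assumed formula).
    if len(cases) * 100 // span < req_density:
        return None
    present = 0                      # bitvector: one bit per slot
    table = [None] * span
    for key, value in cases.items():
        present |= 1 << (key - low)
        table[key - low] = value
    return low, present, table

def lookup(switch, key):
    """Return the constant for key, or None if the switch fails."""
    low, present, table = switch
    offset = key - low
    if offset < 0 or offset >= len(table):   # range check
        return None
    if not (present >> offset) & 1:          # bitvector check for gaps
        return None
    return table[offset]
```

The two tests in lookup are the overhead that the last paragraph above refers to: for a very small switch they may cost more than the jumps they replace.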
Estimated hours taken: 7
dead_proc_elim:
A new pass to eliminate any procedures not reachable from the
exported modes of the exported predicates, either via calls or
by being mentioned in a higher order construct. Useful after
inlining and specialization passes have created orphan procedures.
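The reachability computation can be sketched as a standard worklist traversal. In this Python illustration the procedure names, the `references` map (covering both calls and mentions in higher order constructs) and the function names are hypothetical; it shows the general technique, not the data structures used in dead_proc_elim.

```python
def reachable_procs(exported, references):
    """Return the set of procedures reachable from the exported ones."""
    live = set(exported)
    worklist = list(exported)
    while worklist:
        proc = worklist.pop()
        for target in references.get(proc, ()):
            if target not in live:
                live.add(target)
                worklist.append(target)
    return live

def eliminate_dead(all_procs, exported, references):
    """Keep only the procedures that are still reachable, in order."""
    live = reachable_procs(exported, references)
    return [proc for proc in all_procs if proc in live]
```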
options:
Added new option for enabling dead procedure elimination.
mercury_compile:
Call dead procedure elimination if its option is set.
This change displaces dump numbers after 13.
Also cleaned up the format of some of the verbose messages,
and removed some old, commented out code that isn't useful anymore.
Further work on the pass structure is required.
unused_args:
Cleaned up the format of some of the verbose messages.
hlds_out:
Print out the pred_id and status of each predicate; useful in looking
at what dead_proc_elim is doing.
hlds:
Added a predicate to return the exported procs of a pred.
Removed the randomization of pred_id's, which hasn't been required
for a long time.
Exported the type pred_id for use by hlds_out. (This last may be
a temporary change.)
llds:
Make the extern declarations to the various bunch functions into
ANSI prototypes.
det_analysis:
Changes to make the cc component of at_most_many_cc "sticky" in the
various tables, i.e. if a goal is in a one solution context, any
conjunction, disjunction or switch containing it is also in a one
solution context. This need not propagate beyond quantification.
switch_detection:
Undid the unnecessary export of a predicate.
Updated a comment.
Estimated hours taken: 200
Added file unused_args.m which warns about and removes unused arguments,
e.g. type_infos, from predicates. Added compiler options:
warn-unused-args - default on
optimize-unused-args - default on
Added file higher_order.m which optimizes calls to higher-order predicates
where the higher-order arguments are known. Added compiler options:
optimize-higher-order - default on
compiler/unused_args.m
New file.
compiler/higher_order.m
New file.
compiler/code_util.m
code_util__make_proc_label/4 - adjusted this to take into account
the fact that unused_args.m can remove arguments from special preds.
compiler/liveness.m
A fix to a bug in the way this handles switches.
Previously, if none of the cases of a switch contained the
switched-on variable, the variable was put in the pre-death set
and was clobbered before the switch. Without unused_args removing
unused variables local to the switch, this never occurred.
compiler/mercury_compile.pp
Added code to call unused_args.m.
Added hlds dump stage 12 unused-args.
compiler/quantification.m
Similar problem to that in liveness.m above.
Estimated hours taken: 6
options.m, llds.m, mercury_compile.pp:
Add new option `--linker-delete-unused-code'.
If this option is enabled, generate a bunch of C files,
one C file per LLDS module, in a `<module>.dir' directory
rather than just generating a single C file.
At link time, build one big library from all the `.o' files
in all the `<module>.dir' directories using `ar', and link
the `foo_init.o' against this library. Doing this means that
the linker will not link in unreferenced code.
mercury_compile.pp:
Fix some bugs which meant that `mc foo/bar.m' didn't work.
Estimated hours taken: 1.5
Undo dylan's changes in the names of some library entities,
by applying the following sed script
s/term_atom/term__atom/g
s/term_string/term__string/g
s/term_integer/term__integer/g
s/term_float/term__float/g
s/term_context/term__context/g
s/term_functor/term__functor/g
s/term_variable/term__variable/g
s/_term__/_term_/g
s/std_util__bool_/bool__/g
to all the `.m' and `.pp' files in the compiler and library directories.
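Because the rules are applied in order, the effect of the script on a single line can be reproduced with plain string replacement. The Python sketch below mirrors the substitutions listed above; the function name is hypothetical, and the comment on the `_term__` rule is a reading of what the script does, not a statement of intent from the original change.

```python
# The substitutions from the sed script, in the same order.
SUBSTITUTIONS = [
    ("term_atom", "term__atom"),
    ("term_string", "term__string"),
    ("term_integer", "term__integer"),
    ("term_float", "term__float"),
    ("term_context", "term__context"),
    ("term_functor", "term__functor"),
    ("term_variable", "term__variable"),
    # Reverts the rules above where the match followed an underscore.
    ("_term__", "_term_"),
    ("std_util__bool_", "bool__"),
]

def undo_renaming(line):
    """Apply every substitution globally, one after another."""
    for old, new in SUBSTITUTIONS:
        line = line.replace(old, new)
    return line
```

All the patterns are literal strings, so `str.replace` behaves like the corresponding `s/old/new/g` command.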
The reason for undoing these changes was to minimize incompatibilities
with 0.4 (and besides, the changes were not a really good idea in the first
place).
I also moved `bool' to a separate module.
The main reason for that change is to ensure that the `__' prefix is
only used when it genuinely represents a module qualifier.
(That's what dylan's changes were trying to achieve, but `term__'
does genuinely represent a module qualifier.)
compiler/*.m:
Apply sed script above;
where appropriate, add `bool' to the list of imported modules.
mercury_compile.pp:
Invoke the unique_modes.m pass after determinism analysis.
getopt.m, options.m:
Change getopt.m to use higher-order predicates, so that
it doesn't depend on options.m, and move it from the compiler
directory to the library directory.
mercury_compile.pp, hlds_out.m, mercury_to_c.m:
Minor changes to work with the new getopt.
Estimated hours taken: 1
options.m:
Correct a couple of mistakes in the help message,
and add a little bit more documentation for a couple of options.
mercury_compile.pp:
The first HLDS dump stage should be numbered 1, not 0,
since the second stage is numbered 2.
getopt.m:
Allow short options to take an argument without any intervening
space, e.g. `-I../library' rather than `-I ../library'.
Improve error messages - all error messages now include the name
of the offending option.
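Python's standard getopt module happens to follow the same two conventions, so it can serve as an illustration. This is not Mercury's getopt.m and its interface differs; the helper names below are hypothetical, and `-I` is simply the option used in the example above.

```python
import getopt

def parse_include_dirs(argv):
    """Collect the arguments of -I options; return them and the rest."""
    opts, rest = getopt.getopt(argv, "I:")
    return [value for _flag, value in opts], rest

def offending_option(argv):
    """Return the name of the first unrecognised option, or None."""
    try:
        getopt.getopt(argv, "I:")
    except getopt.GetoptError as err:
        # The exception records which option caused the error.
        return err.opt
    return None
```

Both `["-I../library", "foo.m"]` and `["-I", "../library", "foo.m"]` parse to the same result.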
Estimated hours taken: 2
Make sure that `mc -c' passes the correct machine- and grade-specific
options to the C compiler.
configure.in:
Set CFLAGS_FOR_REGS and CFLAGS_FOR_GOTOS to any special
gcc options required for the use of gcc global registers and gcc
nonlocal gotos respectively.
scripts/mc.in:
Pass the values of CFLAGS_FOR_REGS and CFLAGS_FOR_GOTOS to
mercury_compile.
compiler/options.m:
Add new options for passing CFLAGS_FOR_REGS and CFLAGS_FOR_GOTOS.
compiler/mercury_compile.pp:
Pass the CFLAGS_FOR_REGS and/or CFLAGS_FOR_GOTOS to the C compiler,
if appropriate.
Estimated hours taken: 4
Changed the way configuration parameters are handled so that we
can avoid bootstrapping problems. Instead of getting configuration
parameters from `conf.m.in', they are now passed via the `mc' script.
Also renamed the `num_real_regs' option to `num_real_r_regs',
to avoid confusion with the NUM_REAL_REGS macro set in runtime/machdeps/*.h
(which has a different meaning).
compiler/conf.m.in:
Removed this module, which used to define the old
conf__low_tag_bits/1 predicate.
compiler/Mmake:
Removed references to conf.m*.
compiler/options.m:
Added conf_low_tag_bits option, to replace the old
conf__low_tag_bits/1 predicate.
Rename num_real_regs option as num_real_r_regs.
compiler/tag_switch.m:
Rename num_real_regs option as num_real_r_regs.
compiler/mercury_compile.pp:
Use the conf_low_tag_bits option rather than calling
the old conf__low_tag_bits/1 predicate.
Estimated hours taken: 0.1
compiler/cse_detection.m:
Rename a function symbol.
compiler/middle_rec.m:
Fix earlier overhasty commit.
compiler/options.m:
Change help message for --num-real-regs.
excess:
A new pass to remove unnecessary assignment unifications.
mercury_compile:
Call the new excess assignment module.
options:
Add a new option, excess_assign, to control the new optimization.
Add another, num-real-regs, to specify how many of r1, r2 etc are
actually real registers. The default is now set to 5 for kryten;
later it should be supplied by the mc script, with a value determined
at configuration time.
tag_switch:
Use num-real-regs to figure out whether it is likely to be worthwhile
to eliminate the common subexpression of taking the primary tag of
a variable. Also fix an old performance bug: the test for when a
jump table is worthwhile was reversed.
value_number, vn_block:
Do value numbering on extended basic blocks, not basic blocks.
vn_debug:
Modify an information message.
labelopt:
Clean up and export an internal predicate for value numbering. Replace
bintree_set with set.
middle_rec:
Prepare for the generalization of middle recursion optimization
to include predicates with an if-then-else structure.
cse_detection:
Fix a bug: when hoisting a common deconstruction X = f(Yi), create
new variables for the Yi. This avoids problems with any of the Yis
appearing in other branches of the code.
goal_util:
Add a new predicate for use by cse_detection.
common:
Fix a bug: recompute instmap deltas, since they may be affected by the
optimization of common structures.
code_info:
Make an error message more explicit.
det_analysis:
Restrict import list to the needed modules.
*.m:
Import assoc_list.
1. A bug fix and a new warning for quantification
quantification.m:
Fix bug in renaming apart of lambda goals: it used to
sort the lambda variables. Improve efficiency slightly.
Add some comments for quantification__rename_apart.
Use more informative variable names in a couple of places.
quantification.m, make_hlds.m, mercury_to_mercury.pp, options.m:
Add code to quantification.m to detect variables with
overlapping scopes, and pass back a list of warnings.
Add code to make_hlds.m to print out the warnings, if
the new option warn_overlapping_scopes was enabled (as
it is by default). Add code to options.m and
mercury_to_mercury.pp to handle the new warning
option.
common.m, cse_detection.m, follow_code.m, unify_proc.m:
Add an extra argument `_Warnings' to calls to
`implicitly_quantify_clause_body', as required by the
above change.
goal_util.m:
Export the predicate goal_util__rename_var_list for use by
quantification.m.
2. A (very incomplete) start to a new backend which will generate
high-level, debuggable C code.
options.m:
Add new option --high-level-C.
mercury_to_mercury.pp:
Handle new option.
mercury_to_c.m:
New file.
llds.m, hlds_out.m:
Export predicates for use by mercury_to_c.m.
det_analysis, det_report:
Split the old det_analysis module, which was getting too big,
by moving the error diagnosis predicates to a new module.
value_number:
Convert each if statement that contains one of the boolean operators
{and, or, not} at the top level to eliminate the operator, introducing
additional if statements if necessary. The reason that this is a good
idea is that
if_val(tag(r1) == 1 && field(1, r1, N) == X)
gets transformed into two ifs, and the field reference can be extracted
as a common subexpression in an assignment between the two ifs, after
the primary tag has been tested. This is necessary to avoid an
unaligned memory reference. Before this change, we simply did not
optimize code sequences containing such ifs.
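The conversion resembles ordinary short-circuit code generation, which the Python sketch below illustrates. The tuple encoding of conditions, the instruction names `if`, `if_not` and `label`, and the generated label names are all hypothetical. In particular, `if_not` stands in for however a negated primitive test would be expressed (for example by flipping a comparison operator); this is not the value_number implementation.

```python
import itertools

def fresh_labels():
    """Yield label names skip0, skip1, ... (hypothetical naming)."""
    return ("skip%d" % n for n in itertools.count())

def lower_if(cond, target, fresh):
    """Expand `if cond goto target` so no test contains and/or/not.

    cond is a string (primitive test) or a tuple:
    ("and", a, b), ("or", a, b) or ("not", a).
    """
    kind = cond[0] if isinstance(cond, tuple) else "atom"
    if kind == "and":
        # If the first test fails, skip the second; else test the second.
        skip = next(fresh)
        return (lower_if(("not", cond[1]), skip, fresh)
                + lower_if(cond[2], target, fresh)
                + [("label", skip)])
    if kind == "or":
        return lower_if(cond[1], target, fresh) + lower_if(cond[2], target, fresh)
    if kind == "not":
        inner = cond[1]
        inner_kind = inner[0] if isinstance(inner, tuple) else "atom"
        if inner_kind == "not":
            return lower_if(inner[1], target, fresh)
        if inner_kind == "and":   # De Morgan
            return lower_if(("or", ("not", inner[1]), ("not", inner[2])), target, fresh)
        if inner_kind == "or":    # De Morgan
            return lower_if(("and", ("not", inner[1]), ("not", inner[2])), target, fresh)
        return [("if_not", inner, target)]
    return [("if", cond, target)]
```

Applied to the example condition, the result is a test of the tag, then a test of the field, then the skip label. The field reference now sits after the tag test, where it can be hoisted into an assignment between the two ifs.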
vn_order:
Prepare for an optimization (to come later this week) whereby if
a block contains multiple exit points with inconsistent bindings,
we can optimize the front part separately as well as the back part.
vn_debug:
Added a message to help me find the most profitable way to do the
above change.
opt_util, frameopt:
Moved some code for dealing with det procedure prologues from
frameopt to opt_util, since now value_number needs it also.
options:
Make tag_switch apply in more cases.
options:
Add a new option, --optimize-delay-slot, on by default, that is
used in the next change, and another, --specialize, for later use.
Also fix an old oversight.
frameopt:
Now we fill delay slots only if the new option is on. This helps
on e.g. x86s, which don't HAVE delay slots.
optimize:
Call frameopt with the option value.
tag_switch:
Fixed two bugs. First, if a primary tag value did not have cases for
all its secondary tag values, we now emit a goto to the failure label
if the secondary tag does not match any case; we used to just fall
through. Second, the failure code itself used to be generated in
the context of the end of one of the cases; this should now be fixed,
although I want to go over it with Tom to make sure.
The computation of the secondary tag is now done once, instead of
being repeated at every secondary tag test.
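The first fix and the single computation of the secondary tag can be sketched as follows. This is a minimal Python model, not the code in tag_switch; the function name, the `stag` temporary and the instruction tuples are all hypothetical.

```python
def gen_secondary_tag_switch(var, cases, num_secondary_tags, fail_label):
    """cases maps secondary tag values to the labels of their code.
    Returns a list of (hypothetical) LLDS-like instruction tuples."""
    # Compute the secondary tag once, rather than in every test.
    instrs = [("assign", "stag", ("secondary_tag", var))]
    for value in sorted(cases):
        instrs.append(("if_val", ("eq", "stag", value), cases[value]))
    if len(cases) < num_secondary_tags:
        # Some secondary tag values have no case: branch to the failure
        # label explicitly instead of falling through.
        instrs.append(("goto", fail_label))
    return instrs

# A primary tag with three secondary tag values, only two of which
# have cases in the switch.
for instr in gen_secondary_tag_switch("r1", {0: "case0", 2: "case2"}, 3, "fail"):
    print(instr)
```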
options:
Set tag_switch_size to 4 by default, reduced from 8. It was this change
that exposed the two bugs above. After the fix, the compiler is smaller
by about 2 Kb.
switch_gen:
Add some comments.
code_util:
Fixed nonstandard indentation.
llds:
Optionally generate while (1) loops instead of short backward branches.
This is faster in the absence of fast jumps.
options:
Add a new option, --no-emit-c-loops.
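A sketch of what such output might look like, written as a small Python function that produces C text. The function `emit_block` and its string-based representation of statements are assumptions for illustration; the real code in llds may handle this quite differently.

```python
def emit_block(label, body, emit_c_loops=True):
    """Emit C text for a labelled block. body is a list of C statements;
    'goto <label>;' inside a statement is a branch back to the start."""
    back = f"goto {label};"
    has_back_branch = any(back in stmt for stmt in body)
    if not (emit_c_loops and has_back_branch):
        return [f"{label}:"] + ["\t" + stmt for stmt in body]
    lines = [f"{label}:", "\twhile (1) {"]
    for stmt in body:
        # A backward branch to the start of the block becomes 'continue'.
        lines.append("\t\t" + stmt.replace(back, "continue;"))
    # Falling off the end of the block leaves the loop.
    lines += ["\t\tbreak;", "\t}"]
    return lines

body = ["r1 = r1 - 1;", "if (r1 > 0) goto loop;"]
print("\n".join(emit_block("loop", body)))
print("\n".join(emit_block("loop", body, emit_c_loops=False)))
```

Passing `emit_c_loops=False` models the effect of --no-emit-c-loops: the block keeps its backward goto.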
middle_rec:
We now check if the LLDS code after the recursive call is empty.
If yes, we don't generate the downward loop.
code_aux:
Minor cleanup associated with previous change.
frameopt:
Instead of blindly assuming that any code before an if_val will be
able to fill the delay slot, we check whether it computes a value
that is used in the condition. We now also allow a slightly wider
range of user instructions to fill delay slots.
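One way to express the new check, as a minimal Python sketch. The instruction encoding `("assign", target, sources)` and the function name are hypothetical, and the sketch covers only assignments.

```python
def can_fill_delay_slot(instr, cond_reads):
    """instr is ("assign", target, sources); cond_reads is the set of
    locations read by the condition of the following if_val."""
    kind, target, _sources = instr
    # An instruction that computes a value the condition needs must
    # stay ahead of the branch, so it cannot fill the delay slot.
    return kind == "assign" and target not in cond_reads

# The condition reads r1: an assignment to r2 is a candidate,
# an assignment to r1 is not.
print(can_fill_delay_slot(("assign", "r2", {"r3"}), {"r1"}))
print(can_fill_delay_slot(("assign", "r1", {"r3"}), {"r1"}))
```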
opt_util:
Some new preds to support the new functionality in frameopt.
tag_switch:
Compute the tag of the switched-on value into a register at the
start, instead of computing it in each if_val.
options.m:
Reorganize the long usage message: move all the options
affecting link compatibility into a new section, and add
some documentation on the different grades.
instructions, and the last argument from local labels. All these were
placeholders for info put in there by prof.m and used when emitting C
code.
The set of labels that serve as return points are now calculated in llds.m
just before each procedure has its C code generated. This set is passed to
output_instruction along with the label at the start of the procedure.
options, code_gen:
Add an option, --no-simple-neg, to disable the generation of
simplified code for simple negations, since sometimes the more
complex code is better (e.g. for queens) due to branch frequencies.
peephole, jumpopt:
Move the detection of tailcalls from peephole to jumpopt. This
allows us to avoid building some maps in peephole. The code in
jumpopt is also somewhat more general, but this is unlikely
to lead to better code.
opt_util:
Some changes to support the previous modifications. We also
allow framevars in code that looks for stackvars, since the
two kinds of variables can both occur in code that does commits.
optimize:
The main predicate of peephole has a new name; call it by that name.
Also remove Tom's comment asking for my inspection of his change.
value_number:
The main predicate of peephole has a new name; call it by that name.
Also loosen a too-tight sanity check.
options.m
A new option called --halt-at-warn has been added. The idea is for
mmake to halt when a warning is reported by mc, by mc setting the
exit status to one. mc -h phrases this better.
det_analysis.m make_hlds.m mercury_compile.pp prog_io.m typecheck.m
Recognise the new --halt-at-warn option. Some of the error reports
in det_analysis.m now also give an error exit status.
options.m:
By default, disable value numbering.
(From what I understand it should work now, but nevertheless
it should not be enabled by default since it slows down
compilation.)
frameopt:
Look inside blocks introduced by value numbering when looking
for restorations of succip.
value_number, opt_util:
If we are using conservative garbage collection, disable value
numbering for blocks that allocate more than one cell on the heap.
This allows value numbering of most blocks to work in the absence
of -DALL_INTERIOR_POINTERS.
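The condition above amounts to a simple count over the instructions of a block. A minimal Python sketch, assuming that heap allocation appears as an instruction tagged `incr_hp` and that a block is a list of instruction tuples (both are assumptions of this example):

```python
def may_value_number(block, conservative_gc):
    """Return False for blocks that value numbering should leave alone."""
    heap_allocs = sum(1 for instr in block if instr[0] == "incr_hp")
    # With conservative GC, blocks that allocate more than one heap
    # cell are skipped; all other blocks are still optimized.
    return not (conservative_gc and heap_allocs > 1)

block = [("incr_hp", "r1", 2), ("assign", "r2", "r1"), ("incr_hp", "r3", 1)]
print(may_value_number(block, conservative_gc=True))
print(may_value_number(block, conservative_gc=False))
```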
all other source files:
Clean up "blank" lines that nevertheless contain space or tab
characters.
options.m, mercury_compile.pp:
Added a new compilation warning (& option), which issues a warning if
a module has an interface section that doesn't export anything, or
is nonexistent.
common:
Reorganized the way ready-for-reuse structures are represented.
The type of the variable to which the structure is bound is now
stored with the structure information, so we can avoid reusing
a structure for a variable of a different type (which may have
a different data representation). This is necessary for correct
handling of convert_item/2 in prog_io.
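The bookkeeping described above might look roughly like this in Python. The table layout and the names `record_structure` and `find_reusable` are hypothetical; the sketch only shows that a known structure is reused when the variable types match.

```python
def record_structure(known, functor, args, var, var_type):
    """Remember that var (of type var_type) is bound to functor(args)."""
    known.setdefault((functor, tuple(args)), []).append((var, var_type))

def find_reusable(known, functor, args, var_type):
    """Return a variable already bound to functor(args) whose type is
    var_type, or None if there is no such variable."""
    for var, known_type in known.get((functor, tuple(args)), []):
        # A different type may have a different data representation,
        # so the structure is reused only when the types agree.
        if known_type == var_type:
            return var
    return None

known = {}
record_structure(known, "f", ["B", "C"], "A", "t1")
print(find_reusable(known, "f", ["B", "C"], "t1"))
print(find_reusable(known, "f", ["B", "C"], "t2"))
```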
options:
Turn on common structure optimization by default.
mercury_compile.pp, options.m:
Split most of the option post-processing into a separate predicate.
Changed the handling of --num-tag-bits: previously this option was
ignored unless you specified --tags high, but now it works like
it originally did, i.e. even with --tags low. The autoconf value
is only used if no --num-tag-bits option is specified.
Removed the undocumented arbitrary limit of a maximum of 6 tag bits.
(On a 64-bit architecture, you might well want to use 30 tag bits!)
hlds:
Removed the notion of "internal determinism"; commits are now indicated
through "some" goals. Renamed the predicate module_info_shapes to
module_info_get_shapes. Will later change other predicates also to
get consistent naming.
hlds_out:
Removed printing of internal determinisms.
det_analysis:
Changes to accommodate the new way of signalling commits. The comments
on optimizations have been modified to reflect the need for information
about whether goals can raise exceptions. Exported two predicates for
use by follow_code.
live_vars:
Changes to accommodate the new way of signalling commits.
code_gen:
Shift the handling of commits to "some" goals. Some predicates
had three versions, one for each code model; these have been
simplified significantly. The sequence of predicates has also
been rationalised a bit. There is still room for improvement
on both fronts.
disj_gen, ite_gen, middle_rec:
Changed calls to modified predicates in code_gen.
common:
When this pass changes A == f(B, C), D := f(B, C) into A == f(B, C),
D := A, it can change the scopes of A, B and C. The pass did not
take this into account; now it does. The pass is still disabled
until it has been more adequately tested.
mercury_compile:
Moved followcode into the back end. We now thread ModuleInfo through
the backend instead of Shapes, since follow_code modifies other parts
of ModuleInfo as well. Rationalised the stage numbers, WHICH MEANS
-d NUMBERS HAVE CHANGED.
follow_code:
Follow_code is now after determinism analysis, so that we can check
that it does not change the determinism of the branched structure
we are pushing code into. We now push not just builtins but also the
first call after the branched structure into the branched structure,
since this will reduce register shuffling. Made a start on pushing code
into the fronts of branched structures, when some code before the branch
point is useful only in one branch.
options:
Added an option prev_code for the (incomplete) functionality in
follow_code.
vn_flush:
Moved a comment about future functionality to where it now belongs.
cse_detection:
Removed obsolete debugging predicate.