Estimated hours taken: 20
Give duplicate code elimination more teeth in dealing with similar arguments
of different function symbols. For the source code
:- type t1 ---> f(int)
; g(int, int).
:- pred p1(t1::in, int::out) is det.
p1(f(Y), Y).
p1(g(Y, _), Y).
we now generate the C code
Define_entry(mercury__xdup__p1_2_0);
r1 = const_mask_field(r1, (Integer) 0);
proceed();
thus avoiding the cost of testing the function symbol.
runtime/mercury_tags.h:
Add two new macros, mask_field and const_mask_field, that behave
just like field and const_field except that instead of stripping
off a known tag from the pointer, they strip (mask) off an unknown
tag.
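The difference between the two macro families can be sketched in C as follows. This is a minimal illustration, not the actual definitions in runtime/mercury_tags.h: it assumes the tag sits in the low two bits of a word-aligned pointer, and the names sk_word, sk_mkword, sk_field and sk_mask_field are hypothetical.

```c
#include <stdint.h>

/* Assumption: tags occupy the low SK_TAG_BITS bits of an aligned pointer. */
#define SK_TAG_BITS 2
#define SK_TAG_MASK ((uintptr_t) ((1u << SK_TAG_BITS) - 1))

typedef uintptr_t sk_word;

/* Attach a tag to a word-aligned cell pointer. */
static sk_word sk_mkword(unsigned tag, sk_word *cell)
{
    return (sk_word) cell | (sk_word) tag;
}

/* Like field: strip a KNOWN tag (here by subtraction), then index. */
static sk_word sk_field(unsigned tag, sk_word w, int i)
{
    return ((sk_word *) (w - (sk_word) tag))[i];
}

/* Like mask_field: strip an UNKNOWN tag by masking, then index. */
static sk_word sk_mask_field(sk_word w, int i)
{
    return ((sk_word *) (w & ~SK_TAG_MASK))[i];
}
```

In this sketch, sk_mask_field returns the same argument whichever tag the word carries, which is what lets one instruction serve both f(Y) and g(Y, _).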
compiler/llds.m:
Change the first argument of the lval field/3 from tag to maybe(tag).
Make the comments on some types more readable.
compiler/llds_out.m:
If the first arg of the lval field/3 is no, emit a (const_)mask_field
macro; otherwise, emit a (const_)field macro.
compiler/basic_block.m:
New module to convert sequences of instructions to sequences of
basic blocks and vice versa. Used in the new dupelim.m.
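As a rough illustration of splitting an instruction sequence into basic blocks, the C sketch below records where each block starts. It is not taken from basic_block.m: the instruction model (bb_kind) and the rule that a block begins at index 0, at every label, and after every unconditional branch are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified instruction kinds. */
typedef enum { BB_LABEL, BB_GOTO, BB_OTHER } bb_kind;

/*
 * Record in starts[] the index of the first instruction of each basic
 * block and return the number of blocks. starts[] needs room for n entries.
 * Assumption: a block begins at index 0, at every label, and after every
 * unconditional branch.
 */
static size_t bb_find_block_starts(const bb_kind *code, size_t n,
    size_t *starts)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        int after_goto = (i > 0 && code[i - 1] == BB_GOTO);
        if (i == 0 || code[i] == BB_LABEL || after_goto) {
            starts[count++] = i;
        }
    }
    return count;
}
```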
compiler/dupelim.m:
Complete rewrite to give duplicate code elimination more teeth.
Whereas previously we eliminated blocks of code only if they exactly
duplicated other blocks of code, we now look for blocks that can be
"anti-unified". For example, the blocks
r1 = field(mktag(0), r2, 0)
goto L1
and
r1 = field(mktag(1), r2, 0)
<fall through to L1>
anti-unify, with the most specific common generalization being
r1 = mask_field(r2, 0)
goto L1
If several basic blocks anti-unify, we replace one copy with the
anti-unified block and try to eliminate the others. We do not
eliminate blocks that can be fallen into, since eliminating them
would require introducing a goto, which would slow the code down.
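The anti-unification of two field references can be sketched in C as below. This is a simplified model, not the code in dupelim.m: the struct au_field_ref and the function au_anti_unify are hypothetical, and the sketch only generalizes over the tag, treating any other difference as a failure to anti-unify.

```c
#include <stdbool.h>

/* Hypothetical model of the operand in "r1 = field(tag, rN, offset)". */
typedef struct {
    bool tag_known;   /* false models mask_field, i.e. maybe(tag) = no */
    int  tag;         /* meaningful only if tag_known is true */
    int  base_reg;
    int  offset;
} au_field_ref;

/*
 * Compute the most specific common generalization of a and b into *out.
 * Returns false if they differ in anything other than the tag, in which
 * case they do not anti-unify in this sketch.
 */
static bool au_anti_unify(au_field_ref a, au_field_ref b, au_field_ref *out)
{
    if (a.base_reg != b.base_reg || a.offset != b.offset) {
        return false;
    }
    *out = a;
    if (!a.tag_known || !b.tag_known || a.tag != b.tag) {
        out->tag_known = false;   /* generalize to mask_field */
        out->tag = 0;
    }
    return true;
}
```

Applied to field(mktag(0), r2, 0) and field(mktag(1), r2, 0), the sketch yields a reference with no known tag, corresponding to mask_field(r2, 0).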
compiler/peephole.m:
If a conditional branch to a label is followed by that label or
by an unconditional branch to that label, eliminate the branch.
Dupelim produces this kind of code.
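The peephole pattern can be sketched in C as follows. The instruction model (ph_instr) and the function name are hypothetical, and the sketch assumes that evaluating the branch condition has no side effects, so the branch can simply be dropped.

```c
#include <stddef.h>

typedef enum { PH_LABEL, PH_GOTO, PH_IF_GOTO, PH_OTHER } ph_kind;

typedef struct {
    ph_kind kind;
    int     label;   /* label defined or targeted; unused for PH_OTHER */
} ph_instr;

/*
 * Remove every conditional branch whose target is the label defined by
 * the next instruction, or the target of an unconditional branch that
 * immediately follows. Compacts code[] in place and returns the new length.
 */
static size_t ph_drop_redundant_branches(ph_instr *code, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (code[i].kind == PH_IF_GOTO && i + 1 < n) {
            const ph_instr *next = &code[i + 1];
            if ((next->kind == PH_LABEL || next->kind == PH_GOTO)
                && next->label == code[i].label) {
                continue;   /* both outcomes reach the same place */
            }
        }
        code[out++] = code[i];
    }
    return out;
}
```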
compiler/{code_exprn,exprn_aux,lookup_switch,opt_debug,unify_gen}.m:
Minor changes required by the change to field/3.
compiler/{frameopt,jumpopt,labelopt,mercury_compile,optimize,value_number}.m:
s/__main/_main/ in predicate names.
compiler/jumpopt.m:
Add some documentation.
compiler/unify_gen.m:
Fix a module qualified predicate name reference that would not
work in Prolog.
compiler/notes/compiler_design.html:
Document the new file basic_block.m.
Estimated hours taken: 1
Fix some problems Fergus pointed out after reviewing my
stack layouts change.
compiler/continuation_info.m:
Separate library imports from compiler imports.
compiler/handle_options.m:
compiler/options.m:
Add some comments to explain the stack_layouts option.
Comment out the documentation of the stack-layouts option,
as it is a developer only option.
compiler/code_gen.m:
compiler/llds.m:
compiler/llds_common.m:
compiler/llds_out.m:
compiler/mercury_compile.m:
compiler/optimize.m:
Remove llds_proc_id from c_procedure, as pred_proc_id is
available instead.
Estimated hours taken: 50
Generate stack layouts for accurate garbage collection.
compiler/base_type_layout.m:
Change the order of some arguments so that threaded data
structures are more often in the final two arguments (allows
easy use of higher order predicates).
Simplify some code using higher order preds.
Export base_type_layout__construct_pseudo_type_info, as
stack_layout.m needs to be able to generate pseudo_type_infos
too.
Fix problems with cell numbers being re-used -- get the next
cell number from module_info, and update module_info
after processing base_type_layouts.
compiler/code_gen.m:
Add information about each procedure to the continuation info.
Handle new field in c_procedure.
compiler/continuation_info.m:
Redesign most of this module to deal with labels
that are continuation points for multiple calls.
Change the order of some arguments so that threaded data
structures are in the final two arguments.
Clean up and document code.
compiler/dupelim.m:
compiler/exprn_aux.m:
Handle new label_entry data type.
compiler/export.m:
compiler/opt_debug.m:
Handle new label_entry and general data types.
compiler/llds_out.m:
Add an argument to get_proc_label to control whether a
"mercury_" prefix is wanted.
Handle new label_entry and general data types.
compiler/llds.m:
Add a new alternative for data_const - a label_entry.
Add a new alternative for data_name - general, which
allows any sort of data, with names generated elsewhere.
Add the pred_proc_id as a field of c_procedure.
compiler/optimize.m:
compiler/llds_common.m:
Handle new field in c_procedure.
compiler/mercury_compile.m:
Generate layout information after code has been generated,
and output stack layouts.
compiler/notes/compiler_design.html:
Document new stack_layout module.
compiler/stack_layout.m:
New file - generates the LLDS code that defines
global constants to hold the stack_layout structures.
compiler/options.m:
compiler/handle_options.m:
Add --stack-layout option which outputs stack layouts.
Make accurate gc imply stack_layout.
Estimated hours taken: 3
Enable --warn-interface-imports by default. This was turned off while
list and term were defined in mercury_builtin.m, since it caused many
warnings.
Fix all the unused interface imports that have been added since then.
compiler/options.m:
Enable --warn-interface-imports by default.
compiler/module_qual.m:
Fix formatting inconsistencies with module names in warning
messages. (".m" was not appended to module names if there was
only one module).
compiler/*.m:
library/*.m:
tests/invalid/type_loop.m:
tests/warnings/*.m:
Remove unused interface imports, or move them into
implementation (mostly bool, list and std_util).
Estimated hours taken: 0.5
compiler/mercury_compile.m:
compiler/optimize.m:
Insert lots of cuts, with the aim of reducing the memory consumption
of the SICStus Prolog version of the Mercury compiler.
Estimated hours taken: 8
Enable the code to treat `__' as an alternative syntax for module
qualification, after fixing various places in the compiler where
we use `__' in ways that are incompatible with this.
compiler/prog_io.m:
compiler/prog_io_goal.m:
Uncomment the code to handle `__' as module qualification.
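To illustrate what treating `__' as module qualification means, here is a small C sketch that splits a symbol such as io__write_string into a module part and a name part. It is not the parser code in prog_io.m: the function mq_split is hypothetical, and splitting at the first occurrence of `__' is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Split sym at its first "__" into a module name (copied into mod, which
 * has room for mod_size bytes) and an unqualified name (*name points into
 * sym). Returns false if sym has no "__", if either part would be empty,
 * or if the module name does not fit in mod.
 */
static bool mq_split(const char *sym, char *mod, size_t mod_size,
    const char **name)
{
    const char *sep = strstr(sym, "__");
    if (sep == NULL || sep == sym || sep[2] == '\0') {
        return false;
    }
    size_t len = (size_t) (sep - sym);
    if (len >= mod_size) {
        return false;
    }
    memcpy(mod, sym, len);
    mod[len] = '\0';
    *name = sep + 2;
    return true;
}
```

A predicate named dupelim__main would split into module dupelim and name main under this scheme, which is why such names had to be changed.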
compiler/intermod.m:
compiler/hlds_module.m:
compiler/modecheck_unify.m:
Fix bugs in the handling of module qualified higher-order terms.
compiler/*.m:
s/hlds__/hlds_/g
compiler/passes_aux.m:
s/process__/process_/g
compiler/pragma_c_gen.m:
compiler/code_gen.m:
s/code_gen__/pragma_c_gen__/ for the predicates defined in
pragma_c_gen.m (this ought to have been done when the code
was first moved from code_gen.m to pragma_c_gen.m).
compiler/llds.m:
s/llds__proc_id/llds_proc_id/g
The reason for this was to avoid ambiguity between proc_id
in hlds_pred.m and llds__proc_id in llds.m.
compiler/quantification.m:
compiler/make_hlds.m:
compiler/mercury_to_c.m:
s/goal_vars/quantification__goal_vars/g
The reason for this was to avoid ambiguity between goal_vars
in quantification.m and goal_util__goal_vars in goal_util.m.
compiler/dupelim.m:
compiler/optimize.m:
s/dupelim__main/dupelim_main/g
The reason for this change is that a program can only
have one main/2 predicate.
compiler/prog_io_dcg.m:
Remove the old "temporary hack" to strip off and ignore
io__gc_call/1, since the new handling of `__' broke it.
It was only useful for optimizing NU-Prolog performance,
which we don't care about anymore.
compiler/mercury_compile.m:
compiler/modules.m:
compiler/intermod.m:
compiler/prog_io.m:
Remove occurrences of io__gc_call.
compiler/llds_out.m:
compiler/base_type_info.m:
Ensure that we properly handle the special hacks in mercury_builtin
where predicates from other modules (e.g. term__context_init)
are defined in mercury_builtin because they are needed for
type_to_term and term_to_type. llds_out.m: don't put
`mercury_builtin' in the mangled names for those symbols.
base_type_info.m: handle types whose status is "imported"
in their own module.
Estimated hours taken: 25
A rewrite of frameopt, with supporting changes in other modules.
frameopt:
A complete rewrite, with three objectives.
The first is to fix a basic design flaw that was in the module from
the beginning, which is that it looked at whether a block would have
a stack frame if the frame setup was delayed as long as possible,
and took this as gospel. This sometimes led to code that throws away
the frame to enter a block that does not need a frame and then
constructs it again to enter another block which does need a frame.
It also led to some twisted code when we jumped from a block without
a frame to a block with one, since we'd have to set up a stack frame
on arrival at the target block; this sometimes required branches
around this setup code at the start of the target block to properly
support fallthroughs.
We now work out in advance which blocks must have a frame, and
propagate the requirement for a frame both forwards and backwards
until a fixpoint is reached, and only then transform the code.
The propagation phase ensures that we never suffer from either
of the problems described above.
The second objective is to integrate another optimization concerned
with stack frames: not delaying the creation, but reusing a frame
set up for one call to also act as the frame of a tail recursive call.
We used to do this (badly) in peephole; we now do it (well) here.
The third objective is to separate out the filling of delay slots,
so frameopt can be invoked before value numbering. (Filling delay
slots creates code that refers to the same location by two distinct
names, detstackvar(0) and detstackvar(N) where N>0, which breaks the
assumption behind value numbering.) Invoking frameopt before value
numbering should make value numbering more effective whenever frameopt
decides to keep the stack frame.
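The idea behind the first objective, propagating frame requirements forwards and backwards to a fixpoint before transforming any code, can be sketched in C as below. This is not the algorithm in frameopt: the control flow representation and the rule used here (a block keeps the frame if it is both reachable from and can reach a block that must have one) are simplifying assumptions chosen for illustration, and all names are hypothetical.

```c
#include <stdbool.h>

#define FO_MAX_BLOCKS 16

/*
 * edge[i][j] is true if control can pass from block i to block j.
 * needs[] holds the initial requirements on entry and the result on exit.
 * Assumed rule: a block gets a frame if it lies on a path between two
 * blocks that must have one.
 */
static void fo_propagate(int n, bool edge[][FO_MAX_BLOCKS], bool needs[])
{
    bool fwd[FO_MAX_BLOCKS];   /* reachable from a block needing a frame */
    bool bwd[FO_MAX_BLOCKS];   /* can reach a block needing a frame */
    for (int b = 0; b < n; b++) {
        fwd[b] = bwd[b] = needs[b];
    }
    bool changed = true;
    while (changed) {          /* iterate until nothing changes */
        changed = false;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (!edge[i][j]) continue;
                if (fwd[i] && !fwd[j]) { fwd[j] = true; changed = true; }
                if (bwd[j] && !bwd[i]) { bwd[i] = true; changed = true; }
            }
        }
    }
    for (int b = 0; b < n; b++) {
        needs[b] = fwd[b] && bwd[b];
    }
}
```

For a straight-line chain of five blocks in which only the first and fourth must have a frame, the sketch also marks the second and third, so the frame is not thrown away and rebuilt between them.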
delay_slot:
A new module to perform the optimization of filling branch delay slots.
opt_util:
Return the initial label instruction from opt_util__get_prologue,
and delete some predicates that aren't and won't be needed.
peephole:
Don't pass around the Teardown and Setup maps, since the optimization
they were needed for (keeping stack frames) is now done by frameopt.
optimize:
Use the new interface of frameopt and peephole.
Invoke frameopt before the value numbering passes.
We don't need a dedicated peephole pass after frameopt anymore.
What we need is a labelopt pass to get rid of the extra labels frameopt
introduces, and possibly a jumpopt pass to short-circuit any jumps
that replace tailcalls.
Invoke delay_slot optimization and post_value_number at the very end.
We don't need to invoke any frameopt post-pass anymore.
Fix a couple of places where we were not dumping the instruction
properly when --debug-opt was given.
value_number:
Use the new interface of peephole and opt_util__get_prologue.
jumpopt:
Under some circumstances we were generating the instruction "r1 = r1";
we don't do this anymore.
llds_out:
Add a missing newline at the end of garbage collection annotations.
Estimated hours taken: 0.5
optimize.m:
Fix a bug: if --optimize-vn-repeat was non-zero but
--optimize-value-number was not set, then it was invoking
the non-value-numbering optimizations the wrong number
of times.
Estimated hours taken: 12
The main changes are
1 associating a name with the arguments of constructors
2 removing the follow_vars field from calls, higher-order calls
and complicated unifications, since they are not used
3 merging the follow_vars and store_alloc passes, since they logically
belong together
4 adding a new module, lco, for detecting opportunities for last
call optimization modulo constructor application; it won't
actually apply the optimization until the mode system becomes
expressive enough to handle it (this module detects 529 opportunities
in the compiler and library)
5 making "-O3 --optimize-value-number" do the right thing; previously,
it did not apply value numbering because the vnrepeat option
defaulted to zero
6 no longer referring to .err2 files; we use .err instead.
prog_data:
The list associated with each value of type "constructor" now
contains not only the types of the arguments but their names as well.
equiv_type, hlds_data, hlds_out, make_hlds, mercury_to_{goedel,mercury},
mode_util, module_qual, shapes, type_util, unify_proc:
Modify the traversal of type definitions to account for the names
in the lists inside values of type "constructor".
prog_io:
Parse argument names. An unrelated change is that we now
check whether :- pred declarations give modes to some of their
arguments but not to all, in which case we return an error.
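The new check on :- pred declarations amounts to an all-or-none test over the arguments. A minimal C sketch, with a hypothetical function name and a boolean array standing in for the parsed argument list:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * has_mode[i] says whether argument i of a :- pred declaration carries a
 * mode. Returns true if every argument has a mode or none does; a false
 * result corresponds to the case that is now reported as an error.
 */
static bool pm_modes_consistent(const bool *has_mode, size_t arity)
{
    size_t with_mode = 0;
    for (size_t i = 0; i < arity; i++) {
        if (has_mode[i]) {
            with_mode++;
        }
    }
    return with_mode == 0 || with_mode == arity;
}
```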
hlds_goal:
Remove the follow_vars field from calls, higher-order calls
and complicated unifications.
*.m:
Handle the new arities of calls, higher order calls and complicated
unifications.
mercury_compile:
Don't call follow_vars directly anymore, but do call lco if its option
is set. Also flush the main output before a call to maybe_report_stats
to prevent ugly output.
store_alloc:
Call follow_vars directly.
follow_vars:
Expose the initialization and traversal predicates for store_alloc.
lco:
Find opportunities for last call optimization modulo constructor
application.
passes_aux:
Add a HLDS traversal type for lco.
optimize:
Consider the vnrepeat count to be zero unless value numbering is on.
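The rule for the repeat count is small enough to state as a one-line C helper. The function name is hypothetical; it only illustrates the intended relationship between the two options.

```c
#include <stdbool.h>

/*
 * Number of value numbering repetitions to run: the vnrepeat option
 * counts only when value numbering itself is enabled.
 */
static int vn_effective_repeat(bool value_number_enabled, int vn_repeat)
{
    return value_number_enabled ? vn_repeat : 0;
}
```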
options:
Set the default value of vnrepeat to 1.
modules:
Don't refer to .err2 files.
Estimated hours taken: 1
Since NU-Prolog hasn't been capable of executing the compiler for a long time
now, I have removed the .pp files and replaced them with .m files.
code_gen, mercury_compile, optimize:
Remove NU-Prolog specific code.
Mmake:
Don't refer to the .pp files.
dnf:
Add the capability of transforming all procedures regardless of
markers. This will be useful when generating idiomatic Prolog code.
mercury_to_goedel, polymorphism:
Fix comments.