Estimated hours taken: 40
Implement nondet pragma C codes.
runtime/mercury_stacks.h:
Define a new macro, mkpragmaframe, for use in the implementation
of nondet pragma C codes. This new macro includes space for a
struct with a given struct tag in the nondet stack frame being created.
compiler/{prog_data.m,hlds_goal.m}:
Revise the representation of pragma C codes, both as the item and
in the HLDS.
compiler/prog_io_pragma.m:
Parse nondet pragma C declarations.
Fix the indentation in some places.
compiler/llds.m:
Include an extra argument in mkframe instructions. This extra argument
gives the details of the C structure (if any) to be included in the
nondet stack frame to be created.
Generalize the LLDS representation of pragma C codes. Instead of a
fixed sequence of <assign from inputs, user c code, assign to outputs>,
let the sequence contain these elements, as well as arbitrary
compiler-generated C code, in any order and possibly with repetitions.
This flexibility is needed for nondet pragma C codes.
Add a field to pragma C codes to say whether they can call Mercury.
Some optimizations can do a better job if they know that a pragma C
code cannot call Mercury.
Add another field to pragma C codes to give the name of the label
they refer to (if any). This is needed to prevent labelopt from
incorrectly optimizing away the label definition.
Add a new alternative to the type pragma_c_decl, to describe the
declaration of the local variable that points to the save struct.
compiler/llds_out.m:
Output mkframe instructions that specify a struct as invoking the new
mkpragmaframe macro, and make sure that the struct is declared just
before the procedure that uses it.
Other minor changes to keep up with the changes to the representation
of pragma C code in the LLDS, and to make the output look a bit nicer.
compiler/pragma_c_gen.m:
Add code to generate code for nondet pragma C codes. Revise the utility
predicates and their data structures a bit to make this possible.
compiler/code_gen.m:
Add code for the necessary special handling of prologs and epilogs
of procedures defined by nondet pragma C codes. The prologs need
to be modified to include a programmer-defined C structure in the
nondet stack frame and to communicate the location of this structure
to the pragma C code, whereas the functionality of the epilog is
taken care of by the pragma C code itself.
compiler/make_hlds.m:
When creating a proc_info for a procedure defined by a pragma C code,
we used to insert unifications between the headvars and the vars of
the pragma C code into the body goal. We now perform substitutions
instead. This removes a factor that would complicate the generation
of code for nondet pragma C codes.
Pass a moduleinfo down the procedures that warn about singletons
(and other basic scope errors). When checking whether to warn about
an argument of a pragma C code not being mentioned in the C code
fragment, we need to know whether the argument is input or output,
since input variables should appear in some code fragments in a
nondet pragma C code and must not appear in others. The
mode_is_{in,out}put checks need the moduleinfo.
(We do not need to check for any variables being mentioned where
they shouldn't be. The C compiler will fail in the presence of any
errors of that type, and since those variables could be referred
to via macros whose definitions we do not see, we couldn't implement
a reliable test anyway.)
compiler/opt_util.m:
Recognize that some sorts of pragma_c codes cannot affect the data
structures that control backtracking. This allows peepholing to
do a better job on code sequences produced for nondet pragma C codes.
Recognize that the C code strings inside some pragma_c codes refer to
other labels in the procedure. This prevents labelopt from incorrectly
optimizing away these labels.
compiler/dupelim.m:
If a label is referred to from within a C code string, then do not
attempt to optimize it away.
compiler/det_analysis.m:
Remove a now incorrect part of an error message.
compiler/*.m:
Minor changes to conform to changes to the HLDS and LLDS data
structures.
Estimated hours taken: 500 or so
This change implements typeclasses. Included are the necessary changes to
the compiler, runtime and library.
compiler/typecheck.m:
Typecheck the constraints on a pred by adding constraints for each
call to a pred/func with constraints, and eliminating constraints
by applying context reduction.
While reducing the constraints, keep track of the proofs so that
polymorphism can produce the typeclass_infos for eliminated
constraints.
compiler/polymorphism.m:
Perform the source-to-source transformation which turns code with
typeclass constraints into code without constraints, but with extra
"typeclass_info", or "dictionary" parameters.
Also, rather than always having a type_info directly for each type
variable, sometimes the type_info is hidden inside a typeclass_info.
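As a rough illustration of the dictionary-passing idea described for polymorphism.m (not the compiler's actual output), the Python sketch below shows a constrained function rewritten to take an explicit dictionary argument. The names ShowDict, show_int_dict, show_str_dict and show_twice are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ShowDict(Generic[T]):
    """Hypothetical stand-in for a typeclass_info: one slot per class method."""
    show: Callable[[T], str]


# One dictionary per instance declaration (hypothetical instances).
show_int_dict = ShowDict(show=lambda n: "int:" + str(n))
show_str_dict = ShowDict(show=lambda s: "str:" + s)


def show_twice(d: ShowDict[T], x: T) -> str:
    """A 'constrained' function after the transformation: the constraint
    has become an ordinary extra parameter, and each method call is a
    lookup in that parameter."""
    return d.show(x) + " " + d.show(x)
```

The caller picks the dictionary that matches the argument type, which is the job the proofs recorded by the typechecker are meant to support.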
compiler/bytecode*.m:
Insert some code to abort if bytecode generation is used when
typeclasses are used.
compiler/call_gen.m:
Generate code for a class_method_call, which forms the body of a class
method (by selecting the appropriate proc from the typeclass_info).
compiler/dead_proc_elim.m:
Don't eliminate class methods if they are potentially used outside
the module.
compiler/hlds_data.m:
Define data types to store:
- the typeclass definitions
- the instances of a class
- "constraint_proof", i.e. the proofs of redundancy of a
constraint. This info is used by polymorphism to construct the
typeclass_infos for a constraint.
- the "base_typeclass_info_constant", which is analogous to
the base_type_info_constant
compiler/hlds_data.m:
Define the class_method_call goal. This goal is inserted into the
body of class method procs, and is responsible for selecting the
appropriate part of the typeclass_info to call.
compiler/hlds_data.m:
Add the class table and instance table to the module_info.
compiler/hlds_out.m:
Output info about base_typeclass_infos and class_method_calls.
compiler/hlds_pred.m:
Change the representation of the locations of type_infos from "var"
to type_info_locn, which is either a var, or part of a typeclass_info,
since now the typeclass_infos contain the type_infos for the type that
they constrain.
Add constraints to the pred_info.
Add constraint_proofs to the pred_info (so that typeclass.m can
annotate the pred_info with the reasons that constraints were
eliminated, so that polymorphism.m can in turn generate the
typeclass_infos for the constraints).
Add the "class_method" marker.
compiler/lambda.m:
A feeble attempt at adding class contexts to lambda expressions,
untested and almost certainly not working.
compiler/llds_out.m:
Output the code addresses for do_*det_class_method, and output
appropriately mangled symbol names for base_typeclass_infos.
compiler/make_hlds.m:
Add constraints to the types on pred and func decls, and add
class and instance declarations to the class_table and instance_table
respectively.
compiler/mercury_compile.m:
Add the check_typeclass pass.
compiler/mercury_to_mercury.m:
Output constraints of pred and funcs, and output typeclass and instance
declarations.
compiler/module_qual.m:
Module qualify typeclass names in pred class contexts, and qualify the
typeclass and instance decls themselves.
compiler/modules.m:
Output typeclass declarations in the short interface too.
compiler/prog_data.m:
Add the "typeclass" and "instance" items. Define the types to store
information about the declarations, including class contexts on pred
and func decls.
compiler/prog_io.m:
Parse constraints on pred and func declarations.
compiler/prog_out.m:
Output class contexts on pred and func decls.
compiler/type_util.m:
Add preds to apply a substitution to a class_constraint, and to
a list of class constraints. Add type_list_matches_exactly/2. Also
add typeclass_info and base_typeclass_info as types which should not
be optimised as no_tag types (seeing that we cheat a bit about their
representation).
compiler/notes/compiler_design.html:
Add notes on module qualification of class contexts. Needs expansion
to include more stuff on typeclasses.
compiler/*.m:
Various minor changes.
New Files:
compiler/base_typeclass_info.m:
Produce one base_typeclass_info for each instance declaration.
compiler/prog_io_typeclass.m:
Parse typeclass and instance declarations.
compiler/check_typeclass.m:
Check the conformance of an instance declaration to the typeclass
declaration, including building up a proof of how superclass
constraints are satisfied so that polymorphism.m is able to construct
the typeclass_info, including the superclass typeclass_infos.
library/mercury_builtin.m:
Implement the base_typeclass_info and typeclass_info types, as
well as the predicates type_info_from_typeclass_info/3 to extract
a type_info from a typeclass_info, and superclass_from_typeclass_info/3
for extracting superclasses.
library/ops.m:
Add "typeclass" and "instance" as operators.
library/string.m:
Add a (in, uo) mode for string__length/3.
runtime/mercury_ho_call.c:
Implement do_call_*det_class_method, which are the pieces of code
responsible for extracting the correct code address from the
typeclass_info, setting up the arguments correctly, then executing
the code.
runtime/mercury_type_info.h:
Macros for accessing the typeclass_info structure.
Estimated hours taken: 40 (+ unknown time by Zoltan)
Add support for memory profiling.
(A significant part of this change is actually Zoltan's work. Zoltan
did the changes to the compiler and a first go at the changes to the
runtime and library. I rewrote much of Zoltan's changes to the runtime
and library, added support for the new options/grades, added code to
interface with mprof, did the changes to the profiler, and wrote the
documentation.)
[TODO: add test cases.]
NEWS:
Mention support for memory profiling.
runtime/mercury_heap_profile.h:
runtime/mercury_heap_profile.c:
New files. These contain code to record heap profiling information.
runtime/mercury_heap.h:
Add new macros incr_hp_msg(), tag_incr_hp_msg(),
incr_hp_atomic_msg(), and tag_incr_hp_atomic_msg().
These are like the non-`msg' versions, except that if
PROFILE_MEMORY is defined, they also call MR_record_allocation()
from mercury_heap_profile.h to record heap profiling information.
Also, fix up the indentation in lots of places.
runtime/mercury_prof.h:
runtime/mercury_prof.c:
Added code to dump out memory profiling information to files
`Prof.MemoryWords' and `Prof.MemoryCells' (for use by mprof).
Change the format of the `Prof.Counts' file so that the
first line says what it is counting, the units, and a scale
factor. Prof.MemoryWords and Prof.MemoryCells can thus have
exactly the same format as Prof.Counts.
Also cleaned up the interface to mercury_prof.c a bit, and did
various other minor cleanups -- indentation changes, changes to
use MR_ prefixes, additional comments, etc.
runtime/mercury_prof_mem.h:
runtime/mercury_prof_mem.c:
Rename prof_malloc() as MR_prof_malloc().
Rename prof_make() as MR_PROF_NEW() and add MR_PROF_NEW_ARRAY().
runtime/mercury_wrapper.h:
Minor modifications to reflect the new interface to mercury_prof.c.
runtime/mercury_wrapper.c:
runtime/mercury_label.c:
Rename the old `-p' (primary cache size) option as `-C'.
Add a new `-p' option to disable profiling.
runtime/Mmakefile:
Add mercury_heap_profile.[ch].
Put the list of files in alphabetical order.
Delete some obsolete stuff for supporting `.mod' files.
Mention that libmer_dll.h and libmer_globals.h are
produced by Makefile.DLLs.
runtime/mercury_imp.h:
Mention that libmer_dll.h is produced by Makefile.DLLs.
runtime/mercury_dummy.c:
Change a comment to refer to libmer_dll.h rather than
libmer_globals.h.
compiler/llds.m:
Add a new field to `create' and `incr_hp' instructions
holding the name of the type, for heap profiling.
compiler/unify_gen.m:
Initialize the new field of `create' instructions with
the appropriate type name.
compiler/llds_out.m:
Output incr_hp_msg() / tag_incr_hp_msg() instead of
incr_hp() / tag_incr_hp().
compiler/*.m:
Minor changes to most files in the compiler back-end to
accommodate the new field in `incr_hp' and `create' instructions.
library/io.m:
Add `io__report_full_memory_stats'.
library/benchmarking.m:
Add `report_full_memory_stats'. This uses the information saved
by runtime/mercury_heap_profile.{c,h} to print out a report
of memory usage by procedures and by types.
Also modify `report_stats' to print out some of that information.
compiler/mercury_compile.m:
If `--statistics' is enabled, call io__report_full_memory_stats
at the end of main/2. This will print out full memory statistics,
if the compiler was compiled with memory profiling enabled.
compiler/options.m:
compiler/handle_options.m:
runtime/mercury_grade.h:
scripts/ml.in:
scripts/mgnuc.in:
scripts/init_grade_options.sh-subr:
scripts/parse_grade_options.sh-subr:
Add new option `--memory-profiling' and new grade `.memprof'.
Add `--time-profiling' as a new synonym for `--profiling'.
Also add `--profile-memory' for more fine-grained control:
`--memory-profiling' implies both `--profile-memory' and
`--profile-calls'.
scripts/mprof_merge_runs:
Update to handle the new format of Prof.Counts and to
also merge Prof.MemoryWords and Prof.MemoryCells.
profiler/options.m:
profiler/mercury_profile.m:
Add new options `--profile memory-words' (`-m'),
`--profile memory-cells' (`-M') and `--profile time' (`-t').
These options make the profiler select a different count file,
Prof.MemoryWords or Prof.MemoryCells instead of Prof.Counts.
specific to time profiling.
profiler/read.m:
profiler/process_file.m:
profiler/prof_info.m:
profiler/generate_output.m:
Update to handle the new format of the counts file.
When reading the counts file, look at the first line of
the file to determine what is being profiled.
profiler/globals.m:
Add a new global variable `what_to_profile' that records
what is being profiled.
profiler/output.m:
Change the headings to reflect what is being profiled.
doc/user_guide.texi:
Document memory profiling.
Document new options.
doc/user_guide.texi:
compiler/options.m:
Comment out the documentation for `.proftime'/`--profile-time',
since doing time and call profiling separately doesn't work,
because the code addresses change when you recompile with a
different grade. Ditto for `.profmem'/`--profile-memory'.
Also comment out the documentation for
`.profcalls'/`--profile-calls', since it is redundant --
`.memprof' produces the same information and more.
configure.in:
Build a `.memprof' grade. (Hmm, should we do this only
if `--enable-all-grades' is specified?)
Don't ever build a `.profcalls' grade.
Estimated hours taken: 9
Fix code generation for commits and nondet if-then-elses so that
it computes MR_ticket_counter correctly.
compiler/ite_gen.m:
compiler/code_info.m:
Change the way we do a soft cut when generating code for nondet
if-then-elses with nondet conditions so that the ticket counter
is restored correctly on backtracking.
compiler/llds.m:
Add new instructions `mark_trail_stack(lval)' and
`discard_tickets_to(rval)' to save/restore the ticket counter.
compiler/code_info.m:
Save the ticket counter before doing a commit and
restore it afterwards.
compiler/*.m:
Various minor changes to handle the new LLDS instructions.
runtime/mercury_trail.h:
Add new macros to implement the new LLDS instructions.
compiler/livemap.m:
Change the code in build_livemap_instr for mark_hp and
store_ticket so that it deletes the target lval from the
set of live variables, and simplify the code there for reset_ticket.
Estimated hours taken: 3
Change the generated code for trailing to match the new trailing interface.
compiler/code_gen.m:
When generating code for negations, ensure that we generate a
discard_ticket instruction to discard the current ticket before
failing.
compiler/llds.m:
compiler/llds_out.m:
compiler/code_info.m:
Change the `restore_ticket(Rval)' instruction to
`reset_ticket(Rval, Reason)', where Reason is one of
undo, commit, exception, or gc, as per runtime/mercury_trail.h.
A reset with Reason = undo gives the old "restore" behaviour.
compiler/frameopt.m:
compiler/livemap.m:
compiler/llds_common.m:
compiler/middle_rec.m:
compiler/opt_*.m:
compiler/peephole.m:
compiler/value_number.m:
compiler/vn_*.m:
Trivial changes to handle reset_ticket/2 instead of restore_ticket/1.
compiler/code_gen.m:
compiler/ite_gen.m:
compiler/disj_gen.m:
Change the places that called code_gen__maybe_discard_ticket
to instead call code_gen__maybe_reset_and_discard_ticket(...commit...).
Estimated hours taken: 4
Fix a bug where tag_switch.m was generating references to non-existent
labels for det switches that don't cover the full range of the type.
llds.m:
Add new alternative `do_not_reached' to the code_addr type.
exprn_aux.m:
dupelim.m:
livemap.m:
llds_out.m:
opt_util.m:
opt_debug.m:
Add new code to handle `do_not_reached'.
tag_switch.m:
When generating tag switch jump tables for det switches that do
not cover the whole type, which can happen if the initial inst of
the switch variable is a bound(...) inst that represents a
subtype, make sure that we don't generate references to
undefined labels for cases that occur in the switch var's type
but not in the switch var's initial inst. Instead, make such
references refer to a label that jumps to `do_not_reached'.
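A minimal sketch of the jump table fix, in Python and under simplifying assumptions: every tag of the type gets a table slot, and slots for tags excluded by the initial inst point at a catch-all target instead of an undefined label. The function name build_jump_table and the label names are hypothetical. The sketch stores `do_not_reached' directly, whereas the change described above goes through a label that jumps to it.

```python
DO_NOT_REACHED = "do_not_reached"  # name taken from the log entry above


def build_jump_table(type_tags, case_labels):
    """Return one jump target per tag of the type, in tag order.

    type_tags:   every tag value the switch variable's type can have
    case_labels: tag -> label, only for the tags allowed by the initial inst
    """
    # Tags with no case get the catch-all target rather than a
    # reference to a label that was never generated.
    return [case_labels.get(tag, DO_NOT_REACHED) for tag in type_tags]
```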
Estimated hours taken: 7
Make the types pred_id and proc_id (almost) abstract.
compiler/code_util.m:
Changed the type of several predicates:
code_util__make_uni_label/4 (arg 3 was int, now proc_id)
code_util__inline_builtin/4 (arg 3 was proc_id, now int)
Added predicate code_util__translate_builtin_2/6.
compiler/hlds_module.m:
Moved invalid_pred_id/1 to hlds_pred.m
compiler/hlds_pred.m:
Types pred_id/0 and proc_id/0 are now abstract.
Added predicates:
hlds_pred__initial_pred_id/1, hlds_pred__initial_proc_id/1,
hlds_pred__next_pred_id/2, hlds_pred__next_proc_id/2,
pred_id_to_int/2, proc_id_to_int/2,
hlds_pred__in_in_unification_proc_id/1
Moved predicate invalid_pred_id/1 (from hlds_module.m).
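To show what an abstract id buys, here is a small Python sketch of an opaque identifier with the same kind of interface (initial, next, to-int). It is only an analogy: the class PredId and the starting value 0 are assumptions, and the function names are borrowed from the predicates listed above without their module prefixes.

```python
class PredId:
    """Opaque predicate identifier; callers should not look inside."""
    __slots__ = ("_n",)

    def __init__(self, n):
        self._n = n

    def __eq__(self, other):
        return isinstance(other, PredId) and self._n == other._n

    def __hash__(self):
        return hash(("PredId", self._n))


def initial_pred_id():
    # Assumption: numbering starts at 0; the log does not say.
    return PredId(0)


def next_pred_id(pred_id):
    return PredId(pred_id._n + 1)


def pred_id_to_int(pred_id):
    # The one sanctioned way to get at the underlying integer.
    return pred_id._n
```

Code that needs an integer (for example to build a label) has to call the conversion explicitly, which is the kind of cast the compiler/*.m entry refers to.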
compiler/*.m:
Miscellaneous minor changes to cast pred/proc_ids to ints
where appropriate.
Estimated hours taken: 3
Cleaned up the handling of labels for specialized versions of predicates
from other modules.
compiler/llds.m:
Changed the representation of proc_label slightly.
Each proc_label now contains the name of the module producing the
code for a predicate as well as the module containing the declaration
for the predicate.
compiler/code_util.m:
compiler/llds_out.m:
Fixed a bug in my last change that resulted in duplicate label
names for specialized versions of predicates.
The name of the module producing the code for the predicate
is added as an extra qualifier in the label for specialised
versions of predicates from other modules.
compiler/base_type_info.m:
compiler/opt_util.m:
compiler/opt_debug.m:
compiler/shapes.m:
Fixed uses of proc_label.
Estimated hours taken: 7
Add support for taking the addresses of words on the heap as well as on
either stack. This will be used later to support tail recursion modulo
constructor application as well as parallelism.
The support provided is a first draft. Since nothing in the compiler
currently generates code that uses the new facilities, they have not been
tested yet beyond ensuring that they don't interfere with the old functionality
of the compiler.
llds:
Add a new type, mem_ref, that denotes a reference to a stackvar,
a framevar, or to a field of a cell on the heap.
Add a new function symbol to the type rval: mem_addr(mem_ref),
which represents the address of the word denoted by the mem_ref.
Add a new function symbol to the type lval: mem_ref(rval).
Given that Rval is an address, mem_ref(Rval) denotes the word
at that address. The value of Rval should have originally come from
a mem_addr(_) type rval, but that value could have been stored in
registers, stack slots etc since then.
code_exprn, code_info, dupelim, exprn_aux, garbage_out, livemap, llds_common,
llds_out, middle_rec, opt_debug, opt_util, vn_cost, vn_filter:
Added code to handle the new mem_ref type and the new alternatives
in lval and rval.
exprn_aux:
Make exprn_aux__substitute_lval_in_lval more thorough.
vn_type:
Add vn shadows of the new things in llds.
vn_flush, vn_order, vn_util:
Handle the new things in llds and/or their vn shadows.
Estimated hours taken: 1.5
llds:
Add an extra argument to pragma_c to hold the context of the pragma
definition.
llds_out:
Add code to print the context as a #line directive. This code is
commented out, since it leads to mysterious crashes on some tests.
pragma_c_gen:
Fill in the context field in pragma_cs.
dupelim, frameopt, livemap, llds_common, middle_rec, opt_debug, opt_util,
value_number, vn_block, vn_filter, vn_verify:
Trivial changes to handle the extra argument in pragma_cs.
Estimated hours taken: 25
A rewrite of frameopt, with supporting changes in other modules.
frameopt:
A complete rewrite, with three objectives.
The first is to fix a basic design flaw that was in the module from
the beginning, which is that it looked at whether a block would have
a stack frame if the frame setup was delayed as long as possible,
and took this as gospel. This sometimes led to code that throws away
the frame to enter a block that does not need a frame and then
constructing it again to enter another block which does need a frame.
It also led to some twisted code when we jumped from a block without
a frame to a block with one, since we'd have to set up a stack frame
on arrival at the target block; this sometimes required branches
around this setup code at the start of the target block to properly
support fallthroughs.
We now work out in advance which blocks must have a frame, and
propagate the requirement for a frame both forwards and backwards
until a fixpoint is reached, and only then transform the code.
The propagation phase ensures that we never suffer from either
of the problems described above.
The second objective is to integrate another optimization concerned
with stack frames: not delaying the creation, but reusing a frame
set up for one call to also act as the frame of a tail recursive call.
We used to do this (badly) in peephole; we now do it (well) here.
The third objective is to separate out the filling of delay slots,
so frameopt can be invoked before value numbering. (Filling delay
slots creates code that refers to the same location by two distinct
names, detstackvar(0) and detstackvar(N) where N>0, which breaks the
assumption behind value numbering.) Invoking frameopt before value
numbering should make value numbering more effective whenever frameopt
decides to keep the stack frame.
delay_slot:
A new module to perform the optimization of filling branch delay slots.
opt_util:
Return the initial label instruction from opt_util__get_prologue,
and delete some predicates that aren't and won't be needed.
peephole:
Don't pass around the Teardown and Setup maps, since the optimization
they were needed for (keeping stack frames) is now done by frameopt.
optimize:
Use the new interface of frameopt and peephole.
Invoke frameopt before the value numbering passes.
We don't need a dedicated peephole pass after frameopt anymore.
What we need is a labelopt pass to get rid of the extra labels frameopt
introduces, and possibly a jumpopt pass to short-circuit any jumps
that replace tailcalls.
Invoke delay_slot optimization and post_value_number at the very end.
We don't need to invoke any frameopt post-pass anymore.
Fix a couple of places where we were not dumping the instruction
properly when --debug-opt was given.
value_number:
Use the new interface of peephole and opt_util__get_prologue.
jumpopt:
Under some circumstances we were generating the instruction "r1 = r1";
we don't do this anymore.
llds_out:
Add a missing newline at the end of garbage collection annotations.
Estimated hours taken: 0.8
Fix a long-standing bug that broke the optimization of semidet tailcalls
for procedures that have some outputs.
opt_util:
When matching two continuations to see if they form a semidet tailcall,
ignore differences between the liveness annotations of the success
and failure continuations.
We don't need these annotations anyway (what counts is the annotation
just before the call), and they will not match unless the procedure
has no outputs.
jumpopt:
Don't try to delete the liveness annotation from the code returned
by opt_util, since opt_util has already deleted it.
Estimated hours taken: 3
code_gen, pragma_c_code:
Move the code that generates code for pragma_c_codes to a new module.
llds:
Change the representation of reg and temp lvals, in order to create
the concept of a "register type" and to reduce memory requirements.
Also add a comment indicating a possible future extension dealing with
model_non pragma_c_codes.
code_exprn, code_info:
Add the ability to request registers of a given type, or a specific
register, when acquiring registers.
bytecode, bytecode_gen, call_gen, dupelim, exprn_aux, follow_vars, frameopt,
garbage_out, jumpopt, llds_out, middle_rec, opt_debug, opt_util, store_alloc,
string_switch, tag_switch, unify_gen, vn_block, vn_cost, vn_filter, vn_flush,
vn_order, vn_temploc, vn_type, vn_util, vn_verify:
Small changes to accommodate the new register representation.
hlds_goal:
Add a comment indicating a possible future extension dealing with
model_non pragma_c_codes.
inlining:
Add a comment indicating how to deal with a possible future extension
dealing with model_non pragma_c_codes.
Estimated hours taken: 2
value_number:
Add two safety checks whose absence was a bug. One of these replaces
an earlier, inadequate safety check in vn_block.
vn_block:
Remove that inadequate safety check, since now a more comprehensive
one is applied in value_number.
opt_util:
opt_util__instr_labels now returns even those labels inside
code_addrs. This was needed for value_number.
labelopt:
Simplify the code based on the new capability of
opt_util__instr_labels.
dense_switch:
Break a too long line.
Estimated hours taken: 1
jumpopt:
Optimize the sequence
if (cond) L1
r1 = TRUE
stuff
proceed
...
L1: r1 = FALSE
same stuff
proceed
into
r1 = not cond
stuff
proceed
r1 = TRUE
stuff
proceed
...
L1: r1 = FALSE
same stuff
proceed
Dead code elimination will get rid of the code after the first proceed,
and probably of the code at L1 as well.
The optimization also applies if the L1 branch sets r1 to TRUE and
the fallthrough sets it to FALSE; in that case we don't negate
the condition.
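The rewrite above can be sketched on a toy instruction list. This is an illustration in Python, not the jumpopt code: the tuple encoding of instructions and the names block_after and optimize are assumptions made for the example.

```python
def block_after(instrs, start):
    """Return (r1_value, stuff) if instrs[start:] begins with
    'r1 = <bool>; stuff...; proceed', else None."""
    if start >= len(instrs) or instrs[start][0] != "set_r1":
        return None
    stuff = []
    for instr in instrs[start + 1:]:
        if instr == ("proceed",):
            return instrs[start][1], stuff
        if instr[0] in ("label", "if"):
            return None  # control flow inside the block: give up
        stuff.append(instr)
    return None


def optimize(instrs):
    """Apply the rewrite to the first matching conditional branch."""
    for i, instr in enumerate(instrs):
        if instr[0] != "if":
            continue
        _, cond, target = instr
        if ("label", target) not in instrs:
            continue
        fall = block_after(instrs, i + 1)
        taken = block_after(instrs, instrs.index(("label", target)) + 1)
        if fall is None or taken is None:
            continue
        (fall_val, fall_stuff), (taken_val, taken_stuff) = fall, taken
        if fall_stuff != taken_stuff or fall_val == taken_val:
            continue
        # The fallthrough runs when cond is false. If it sets r1 to TRUE
        # then r1 = not cond; otherwise r1 = cond, without negation.
        value = ("not", cond) if fall_val else cond
        new = [("set_r1_expr", value)] + fall_stuff + [("proceed",)]
        # The old code stays in place; it is now dead and left for
        # later passes to remove.
        return instrs[:i] + new + instrs[i + 1:]
    return instrs
```

The sketch only replaces the branch instruction, which matches the description: everything after the first proceed is left for dead code elimination.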
opt_util:
Export a predicate for jumpopt.
Estimated hours taken: 12
value_number, opt_util:
Fix a bug triggered by the conjunction of (1) value numbering being
repeated in -O5 (2) middle recursion optimization and (3) the
current code of modules.m. The problem was that although value
numbering was producing correct code, the livevals annotations
in the generated code were left unchanged although they were
no longer correct.
The fix is to add a new predicate to opt_util that updates the
annotations, and to call it from value numbering.
vn_util:
Fix the bug reported by Tom in compiling scene.m: simplify
several kinds of patterns involving floats. These assume that
floats obey the laws of reals. Later we will have to add a mechanism
to prevent such simplification and reordering. (We already assume
that integers obey the laws of whole numbers.)
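A small Python check of why the "laws of reals" assumption is only an approximation for floats: reassociating a sum, which is harmless for real numbers, can change the result. The helper name reassociated_sums is hypothetical, and the example assumes IEEE 754 double precision floats.

```python
def reassociated_sums(a, b, c):
    """Return (a + b) + c and a + (b + c), which are equal for real numbers."""
    return (a + b) + c, a + (b + c)


# With binary floating point the two groupings round differently here.
left, right = reassociated_sums(0.1, 0.2, 0.3)
sums_differ = left != right
```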
opt_debug, vn_debug:
Make the format of debugging output more suitable.
Estimated hours taken: 2.5
Mangle function names differently, so that we don't get
symbol name clashes between a function of arity N and
a predicate of arity N+1. (This fixes a bug which resulted
in error messages from the assembler.)
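The clash can be reproduced with a toy mangling scheme. The Python sketch below is hypothetical and does not reproduce the real symbol format: it assumes the old scheme used the name plus the total argument count (a function's result counting as an argument), and shows that adding a pred-or-func tag separates the two symbols.

```python
def mangle_without_kind(name, arity, is_func):
    """Hypothetical old scheme: name plus total number of arguments."""
    # A function of arity N takes N inputs plus one result.
    total_args = arity + 1 if is_func else arity
    return f"{name}_{total_args}"


def mangle_with_kind(name, arity, is_func):
    """Hypothetical new scheme: the pred-or-func tag is part of the symbol."""
    kind = "func" if is_func else "pred"
    return f"{kind}__{name}_{arity}"
```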
compiler/llds.m:
Add a new pred_or_func field to the proc/4 functor in
the proc_label type.
compiler/code_util.m:
Initialize this new field.
compiler/llds.m:
Use this new field when printing out labels.
compiler/{opt_debug.m,opt_util.m}:
Ignore this new field.
Estimated hours taken: 15
hlds_data:
Rename address_const to code_addr_const, and add base_type_info_const
as a new alternative in cons_id, and make corresponding changes
to cons_tag.
Make hlds_type__defn an abstract type.
llds:
Rename address_const to code_addr_const, and add data_addr_const
as a new alternative in rval_const.
Change type "label" to have four alternatives, not three:
local/2 (for internal labels), c_local (local to a C module),
local/1 (local to a Mercury module but not necessarily to a C module),
and exported.
llds_out:
Keep track of the things declared previously, and don't declare them
again unnecessarily. Associate indentation with the following item
rather than the previous item (the influence of 244); this results
in braces being put in different places than previously, but should be
easier to maintain. Handle the new forms of addresses and labels.
Refer to c_local labels as STATIC when not using --split-c-files.
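The bookkeeping for "don't declare them again" amounts to remembering a set of names already declared. A minimal Python sketch follows; the function name emit_declarations and the form of the emitted declaration line are assumptions for illustration, not what llds_out prints.

```python
def emit_declarations(names_in_use_order):
    """Return one declaration line per distinct name, in first-use order."""
    declared = set()
    lines = []
    for name in names_in_use_order:
        if name not in declared:
            declared.add(name)
            # Hypothetical declaration syntax, for the example only.
            lines.append(f"extern void {name}(void);")
    return lines
```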
code_info:
Use a presently junk field to store a cell counter, which is used
to allocate distinguishing numbers to create'd cells. Previously
we used the label counter, which meant that label numbers changed
when we optimized away some creates. Handle the new forms of
addresses and labels.
exprn_aux:
Handle the new forms of addresses and labels. We are now more
precise in figuring out what label address forms will be considered
constants by the C compilers.
others:
Changes to handle the new forms of addresses and labels, and/or to
access hlds_type__defn as an abstract type.
Estimated hours taken: 2
llds.m:
Add a boolean argument to the create rval, which should be set to true
if the cell created must have a unique reference.
vn_type.m:
Add a corresponding argument to vn_create.
others:
Fix references to creates and vn_creates.
Estimated hours taken: 4
passes_aux:
Flesh out the code already here for traversing module_infos,
making it suitable to handle all the passes of the back end.
mercury_compile:
Use the traversal code in passes_aux to invoke the back end passes
over each procedure in turn. Print a one-line message for each
predicate if -v is given (this fixes a long-standing bug).
excess.m, follow_code.m, follow_vars.m, live_vars.m, liveness.m, store_alloc.m:
Remove the code to traverse module_infos, since it is now unnecessary.
export.m:
Remove an unused argument from export__produce_header_file_2.
others:
Move imports from interfaces to implementations, or in some cases
remove them altogether.
Estimated hours taken: 2
Add support for det stack trace dumps in debugging grades, so programmers
can find out which predicates are causing det stack overflows.
llds.m:
Add an extra argument to the llds instruction incr_sp.
This argument says what predicate the stack frame belongs to.
vn_type.m:
Add a corresponding argument to the control node vn_incr_sp.
the other files:
Handle the extra argument of incr_sp or vn_incr_sp.
Estimated hours taken: 45
Module qualification of types, insts and modes.
Added a new interface file - <module>.int3. This contains the
short interface qualified as much as possible given the information
in the current module.
When producing the .int and .int2 files for a module, the compiler uses
the information in the .int3 files of modules imported in the interface
to fully module qualify all items. The .int2 file is just a fully
qualified version of the .int3 file. The .int3 file cannot be overwritten
by the fully qualified version in the .int2 file because then mmake would
not be able to tell when the interface files that depend on that .int3
file really need updating.
The --warn-interface-imports option can be used to check whether
a module imported in the interface really needs to be imported in
the interface.
compiler/module_qual.m
Module qualify all types, insts and modes. Also checks for modules
imported in the interface of a module that do not need to be.
compiler/modules.m
The .int file for a module now depends on the .int3 files of imported
modules. Added code to generate the make rule for the .int file in the
.d file. There is now a file .date2 which records the last time the
.int2 file was updated.
The .int3 files are made using the --make-short-interface option
introduced a few weeks ago.
compiler/options.m
Added option --warn-interface-imports to enable warning about interface
imports which need not be in the interface. This is off by default
because a lot of modules in the library import list.m when they only
need the type list, which is defined in mercury_builtin.m.
Removed option --builtin-module, since the mercury_builtin name is wired
into the compiler in a large number of places.
compiler/prog_util.m
Added predicates construct_qualified_term/3 and construct_qualified_term/4,
which take a sym_name, a list of argument terms and (for the /4 version)
a context, and give a :/2 term.
compiler/type_util.m
Modified type_to_type_id to handle qualified types. Also added predicates
construct_type/3 and construct_type/4 which take a sym_name and a list of
types and return a type by calling prog_util:construct_qualified_term.
compiler/modes.m
On the first iteration of mode analysis, module qualify the modes of
lambda expressions.
compiler/mode_info.m
Added field to mode_info used to decide whether or not to module qualify
lambda expressions.
compiler/mode_errors.m
Added dummy mode error for when module qualification fails so that mode
analysis will stop.
Added code to strip mercury_builtin qualifiers from error messages to
improve readability.
compiler/typecheck.m
Strip builtin qualifiers from error messages.
compiler/llds.m
compiler/llds_out.m
compiler/opt_util.m
compiler/opt_debug.m
Change the format of labels produced for the predicates to use the
qualified version of the type name.
compiler/mercury_compile.pp
Call module_qual__module_qualify_items and make_short_interface.
Remove references to undef_modes.m and undef_types.m
compiler/undef_modes.m
compiler/undef_types.m
Removed, since their functionality is now in module_qual.m.
compiler/prog_io.m
Changed to qualify the subjects of type, mode and inst declarations.
compiler/*.m
Changes to stop various parts of the compiler from throwing away
module qualifiers.
Qualified various mercury_builtin builtins, e.g. in, out, term etc.
where they are wired in to the compiler.
compiler/hlds_data.m
The mode_table and user_inst_table are now abstract types each
storing the {mode | inst}_id to hlds__{mode | inst}_defn maps
and a list of mode_ids or inst_ids. This was done to improve the
efficiency of module qualifying the modes of lambda expressions
during mode analysis.
module_info_optimize/2 now sorts the lists of ids.
The hlds_module interface to the mode and inst tables has not changed.
compiler/hlds_module.m
Added yet another predicate to search the predicate table.
predicate_table_search_pf_sym_arity searches for predicates or
functions matching the given sym_name, arity and pred_or_func.
compiler/higher_order.m
Changed calls to solutions/2 to list__filter/3. Eliminated unnecessary
requantification of goals.
compiler/unused_args.m
Improved abstraction slightly.
Estimated hours taken: 30+
arg_info:
Fix allocation to work properly for --args compact.
bytecode*:
Handle complex deconstruction unifications. Not really tested because
I can't find a test case.
bytecode_gen, call_gen, code_util:
Use the new method to handle builtin predicates/functions. We now
handle reverse mode arithmetic and unary plus/minus as builtins.
code_gen, code_init, follow_vars, hlds_pred:
Put back the initial follow_vars field of the proc_info, since this
may allow the code generator to emit better code at the starts of
predicates.
inlining:
Don't inline recursive predicates.
goals_util:
Add a predicate to find out if a goal calls a particular predicate.
Used in inlining to find out if a predicate is recursive.
unused_args:
Remove code that used to set the mode of unused args to free->free.
Since this changes the arg from top_in to top_unused *without* code
in other modules being aware of the change, this screws up --args
compact.
llds, llds_out, garbage_out:
Prepare for the move to the new type_info structure by adding a new
"module" type for defining structures holding type_infos. Not
currently generated or output.
llds, opt_debug, opt_util, vn_type, vn_cost, vn_temploc:
Change the argument of temp to be a reg, not an int, allowing
floating point temporaries.
vn_type:
Add information about the number of floating point registers and
temporaries to the parameter structure (these are currently unused).
llds, dupelim, frameopt, livemap, middle_rec, value_number, vn_filter,
vn_verify:
Add an extra field to blocks giving the number of float temporaries.
options:
Add parameters to configure the number of floating point registers
and temporaries.
mercury_compile:
Add an extra excess assign phase at the start of the middle pass.
This should reduce the size of the code manipulated by the other
phases, and gives more accurate size information to inlining.
(The excess assign phase before code generation is I think still
needed since optimizations can introduce such assignments.)
value_number:
Optimize code sequences before and after assignments to curfr
separately, since such assignments change the meaning of framevars.
This fixes the bug that caused singleton variable warnings to contain
garbage.
vn_block, vn_flush, vn_order, vn_util:
Add special handling of assignments to curfr. This is probably
unnecessary after my change to value_number, and will be removed
again shortly :-(
vn_flush:
Improve the code generated by value numbering (1) by computing values
into the place that needs them in some special circumstances, and
(2) by fixing a bug that did not consider special registers to be
as fast as r1 etc.
vn_util:
Improve the code generated by value numbering by removing duplicates
from the list of uses of a value before trying to find out if there
is more than one use.
simplify:
Avoid overzealous optimization of main --> { ..., error(...) }.
handle_options:
Fix an error message.
code_aux:
Break an excessively long line.
Estimated hours taken: 1.5
Split llds into two parts. llds.m defines the data types, while llds_out.m
has the predicates for printing the code.
Removed the call_closure instruction. Instead, we use calls to the
system-defined addresses do_call_{det,semidet,nondet}_closure. This is
how call_closure was implemented already. The advantage of the new
implementation is that it allows jump optimization of what used to be
call_closures, without new code in jumpopt.
Estimated hours taken: 20
vn_block:
Fix a typo which reflected a fundamental design error. When finding
cheaper copies of live lvals, for use in creating specialized copies
(parallels) of blocks jumped to from the current location, we used
to use the map reflecting the contents of lvals at the start of the
block, not at the point of the jump.
--pred-value-number, which uses the information computed by the
buggy predicate, actually bootstrapped some time ago despite
this fundamental bug!
value_number:
Fix a bug in the creation of parallel code sequences for computed
gotos. Add some more opportunities for printing diagnostics.
Move code concerning final verification to vn_verify.
vn_verify:
Move the remaining code concerned with final verification from
value_number to vn_verify.
peephole:
Add a new pattern, which transforms the sequence
incr_sp N; goto L2; L1; incr_sp N; L2
into just
L1; incr_sp N; L2
The pattern is of course more broadly applicable, but I have seen
it only when it involves a single incr_sp between the two labels.
(The longer pattern can be introduced by frameopt.)
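The pattern above can be sketched as a toy model in Python. This is not the Mercury source of peephole; the tuple encoding of instructions and the name `drop_redundant_incr_sp` are assumptions made only for this illustration.

```python
def drop_redundant_incr_sp(instrs):
    """Rewrite  incr_sp N; goto L2; L1; incr_sp N; L2  into  L1; incr_sp N; L2.

    Instructions are hypothetical tuples such as ("incr_sp", 3),
    ("goto", "L2") and ("label", "L1").
    """
    out = []
    i = 0
    while i < len(instrs):
        w = instrs[i:i + 5]
        if (len(w) == 5
                and w[0][0] == "incr_sp"
                and w[1][0] == "goto"
                and w[2][0] == "label"
                and w[3] == w[0]                    # same incr_sp N after L1
                and w[4] == ("label", w[1][1])):    # the goto targets L2
            # Falling through L1 now performs the single incr_sp.
            out.extend(w[2:])
            i += 5
        else:
            out.append(instrs[i])
            i += 1
    return out
```

Both versions reach L2 with the stack pointer incremented by N exactly once, and jumps to L1 from elsewhere still behave as before.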
opt_util:
Look inside blocks when checking whether an instruction can fall
through. This improves the performance of labelopt.
vn_table:
Make the type vn_table abstract; add, export and use access functions.
vn_util:
Remove a noop predicate, since now it won't ever be made to do
anything.
vn_cost:
Refine debugging output.
vn_debug:
Add some more debugging routines.
opt_debug:
Add some more debugging routines.
det_analysis:
Remove an unused argument.
labelopt:
Formatting change.
A Constraint Solver Interface For Mercury
<thunderous applause>
Estimated hours taken: 1 summer studentship
This is the implementation of a fairly general constraint solver interface. If
using a library grade *.cnstr, we emit C instructions to keep track of the
solver's implicit state. This is done by storing and restoring 'tickets' -
abstract handles on the solver's state.
We emit a store_ticket() macro:
-when entering the first disjunct of a disjunction
-when entering the condition of an if-then-else
We emit a restore_ticket() macro:
-when entering a disjunct other than the first of a disjunction
-when entering the else part of an if-then-else
We emit a discard_ticket() macro:
-after the restore_ticket() in the final disjunct of a disjunction
-at the start of the 'then' part of an if-then-else
The rules for emitting the macros are slightly more complicated than those shown
above for if-then-elses (determinism of the parts must be taken into account).
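The simplified placement rules listed above can be sketched as a toy model in Python. It ignores determinism, assumes a disjunction has at least two disjuncts, and uses hypothetical function names; it is not the code in disj_gen.m or ite_gen.m.

```python
def disjunction_ticket_ops(disjuncts):
    """Interleave ticket macros with the (opaque) code of each disjunct."""
    out = []
    last = len(disjuncts) - 1
    for i, code in enumerate(disjuncts):
        if i == 0:
            # Entering the first disjunct: save the solver state.
            out.append("store_ticket()")
        else:
            # Entering a later disjunct: roll the solver state back.
            out.append("restore_ticket()")
            if i == last:
                # Final disjunct: the ticket is no longer needed.
                out.append("discard_ticket()")
        out.append(code)
    return out


def ite_ticket_ops(cond, then_part, else_part):
    """Ticket macros emitted on entry to each part of an if-then-else."""
    return [
        ("condition", ["store_ticket()", cond]),
        ("then", ["discard_ticket()", then_part]),
        ("else", ["restore_ticket()", else_part]),
    ]
```

For example, `disjunction_ticket_ops(["A", "B", "C"])` places one store before A, a restore before B, and a restore followed by a discard before C.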
compiler/code_info.m:
Get an llds store_ticket/restore_ticket etc. instruction
compiler/disj_gen.m:
Emit ticket macros in the appropriate places in a disjunction.
compiler/dupelim.m:
Handle the new llds instruction.
compiler/frameopt.m:
Handle the new llds instruction.
compiler/handle_options.m:
If the grade is *.cnstr, set the constraints option on.
compiler/ite_gen.m:
Emit ticket macros in the appropriate places in an if-then-else.
compiler/livemap.m:
Handle the new llds instruction.
compiler/llds.m:
Output the ticket macros.
compiler/make_hlds.m:
An irrelevant tidy-up.
compiler/mercury_compile.pp:
If the grade is *.cnstr, pass -DCONSTRAINTS to mgnuc
compiler/middle_rec.m:
Handle the new llds instruction.
compiler/opt_*.m:
Handle the new llds instruction.
compiler/options.m:
Introduce a new boolean option 'constraints'.
compiler/shapes.m:
Output a new shape - 'ticket'.
compiler/unify_proc.m:
Handle the new llds instruction.
compiler/v*.m:
Handle the new llds instruction.
Estimated hours taken: 1.5
Undo dylan's changes in the names of some library entities,
by applying the following sed script
s/term_atom/term__atom/g
s/term_string/term__string/g
s/term_integer/term__integer/g
s/term_float/term__float/g
s/term_context/term__context/g
s/term_functor/term__functor/g
s/term_variable/term__variable/g
s/_term__/_term_/g
s/std_util__bool_/bool__/g
to all the `.m' and `.pp' files in the compiler and library directories.
The reason for undoing these changes was to minimize incompatibilities
with 0.4 (and besides, the changes were not a really good idea in the first
place).
I also moved `bool' to a separate module.
The main reason for that change is to ensure that the `__' prefix is
only used when it genuinely represents a module qualifier.
(That's what dylan's changes were trying to achieve, but `term__'
does genuinely represent a module qualifier.)
compiler/*.m:
Apply sed script above;
where appropriate, add `bool' to the list of imported modules.
Estimated hours taken: 2
Change names with badly placed double underscores (i.e. where the part of
a name before a double underscore is not the same as the module name.)
Reflect changes in the library interface.
compiler/*:
Use the newer, more correct form of the term and bool names.
Predicates "bool__" are now "std_util__bool" and labels of
the term ADT are now "term_" instead of "term__".
compiler/vn*.m:
change all names "vn__*" to a correct module prefix. All the
names remain qualified.
compiler/hlds.m:
s/\<is_builtin__/hlds__is_builtin_/g
s/\<dependency_info__/hlds__dependency_info_/g
compiler/unify_proc.m:
s/\<unify_proc_info__/unify_proc__info_/g
compiler/transform.m:
s/\<reschedule__conj/transform__reschedule_conj/g
det_analysis, det_report:
Split the old det_analysis module, which was getting too big,
by moving the error diagnosis predicates to a new module.
value_number:
Convert each if statement that contains one of the boolean operators
{and, or, not} at the top level to eliminate the operator, introducing
additional if statements if necessary. The reason that this is a good
idea is that
if_val(tag(r1) == 1 && field(1, r1, N) = X)
gets transformed into two ifs, and the field reference can be extracted
as a common subexpression in an assignment between the two ifs, after
the primary tag has been tested. This is necessary to avoid an
unaligned memory reference. Before this change, we simply did not
optimize code sequences containing such ifs.
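One way such a conversion could work is sketched below as a toy model in Python. The tuple encoding of conditions, the `eq`/`ne` atoms, the fresh `skip` labels and the function names are all assumptions for illustration; the actual transformation in value_number may differ.

```python
import itertools

NEGATED = {"eq": "ne", "ne": "eq"}


def negate(cond):
    """Negate a condition, pushing the negation down to the atomic tests."""
    op = cond[0]
    if op == "not":
        return cond[1]
    if op == "and":
        return ("or", negate(cond[1]), negate(cond[2]))
    if op == "or":
        return ("and", negate(cond[1]), negate(cond[2]))
    return (NEGATED[op], cond[1], cond[2])


def split_if(cond, target, fresh):
    """Expand `if cond goto target` into if_vals with no and/or/not on top."""
    op = cond[0]
    if op == "not":
        return split_if(negate(cond[1]), target, fresh)
    if op == "or":
        # Either operand being true reaches the target.
        return split_if(cond[1], target, fresh) + split_if(cond[2], target, fresh)
    if op == "and":
        # If the first operand is false, skip the test of the second one.
        skip = next(fresh)
        return (split_if(negate(cond[1]), skip, fresh)
                + split_if(cond[2], target, fresh)
                + [("label", skip)])
    return [("if_val", cond, target)]


fresh_labels = (f"skip{i}" for i in itertools.count())
cond = ("and", ("eq", "tag(r1)", "1"), ("eq", "field(1, r1, N)", "X"))
result = split_if(cond, "L", fresh_labels)
```

Here `result` holds two `if_val` instructions followed by the `skip0` label, so the field reference is only evaluated after the tag test has passed.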
vn_order:
Prepare for an optimization (to come later this week) whereby if
a block contains multiple exit points with inconsistent bindings,
we can optimize the front part separately as well as the back part.
vn_debug:
Added a message to help me find the most profitable way to do the
above change.
opt_util, frameopt:
Moved some code for dealing with det procedure prologues from
frameopt to opt_util, since value_number now needs it as well.
options:
Make tag_switch apply in more cases.
det_analysis:
Added some code to fix up disjunctions that have at most one solution.
We now transform a disjunction to an if-then-else only if the
disjunction is locally nondet. If the disjunction cannot fail, we
replace it with a disjunct that cannot fail and issue a warning;
we issue a warning in several other cases as well.
mercury_to_mercury:
Fix two duplicate fact bugs pointed out by the new det_analysis.
peephole:
Add a new optimization: a stack frame teardown followed by a
conditional branch to a label that builds a stack frame is now
replaced by code that starts with the conditional branch to the
code after the stack frame setup, and has the stack frame teardown
only in the fall through code. This optimization is applied only
after frameopt.
opt_util:
Export a previously internal predicate for use by peephole.
optimize, value_number:
Conform to the new interface of peephole.
frameopt:
Add some debugging code (now commented out) that helped in making
the above optimization.
prog_io:
Cosmetic changes.
The changes made allow declarations of the form:
:- pragma(c_code, predname(Varname1::mode1, Varname2::mode2, ...),
"Some C code to execute instead of a mercury clause;").
There are still a couple of minor problems to be fixed in the near future:
If there is a regular clause given as well as a pragma(c_code, ...) dec, it
is not handled well, and variables named '_' are not handled well.
prog_io.m:
parse the pragma(c_code, ...) dec.
hlds.m:
define a new hlds__goal_expr 'pragma_c_code'.
make_hlds.m:
insert the pragma(c_code, ...) dec. as a pragma_c_code into the hlds.
det_analysis.m:
infer that pragma_c_code goals are det.
modes.m:
convince the mode checker that the correct pragma variables are bound
etc.
quantification.m:
quantify the variables in the pragma(c_code, ...) dec.
code_gen.pp:
convert pragma_c_code into pragma_c (in the llds).
llds.m:
define a new instr, pragma_c. Output the pragma_c
hlds_out.m:
mercury_to_mercury.m:
mercury_to_goedel.m:
spit out pragma(c_code, ...) decs properly
*.m: handle the new pragma_c_code in the hlds or the new pragma_c in the llds
llds:
Optionally generate while (1) loops instead of short backward branches.
This is faster in the absence of fast jumps.
options:
Add a new option, --no-emit-c-loops.
middle_rec:
We now check if the LLDS code after the recursive call is empty.
If yes, we don't generate the downward loop.
code_aux:
Minor cleanup associated with previous change.
frameopt:
Instead of blindly assuming that any code before an if_val will be
able to fill the delay slot, we check whether it computes a value
that is used in the condition. We now also allow a slightly wider
range of user instructions to fill delay slots.
opt_util:
Some new preds to support the new functionality in frameopt.
tag_switch:
Compute the tag of the switched-on value into a register at the
start, instead of computing it in each if_val.
jumpopt:
Added last call optimization for nondet predicates.
llds:
Added a new lval type to represent the succip slot of nondet
stack frames.
other files:
Changes required by the change to llds (there is a minor unrelated
change in vn_cost as well).
Tyson: please check my changes to code_info__get_shape_num and
garbage_out__write_liveval.
instructions, and the last argument from local labels. All these were
placeholders for info put in there by prof.m and used when emitting C
code.
The set of labels that serve as return points are now calculated in llds.m
just before each procedure has its C code generated. This set is passed to
output_instruction along with the label at the start of the procedure.
options, code_gen:
Add an option, --no-simple-neg, to disable the generation of
simplified code for simple negations, since sometimes the more
complex code is better (e.g. for queens) due to branch frequencies.
peephole, jumpopt:
Move the detection of tailcalls from peephole to jumpopt. This
allows us to avoid building some maps in peephole. The code in
jumpopt is also somewhat more general, but this is unlikely
to lead to better code.
opt_util:
Some changes to support the previous modifications. We also
allow framevars in code that looks for stackvars, since the
two kinds of variables can both occur in code that does commits.
optimize:
The main predicate of peephole has a new name, call it by that name.
Also remove Tom's comment asking for my inspection of his change.
value_number:
The main predicate of peephole has a new name, call it by that name.
Also loosen a too-tight sanity check.
frameopt:
Look inside blocks introduced by value numbering when looking
for restorations of succip.
value_number, opt_util:
If we are using conservative garbage collection, disable value
numbering for blocks that allocate more than one cell on the heap.
This allows value numbering of most blocks to work in the absence
of -DALL_INTERIOR_POINTERS.
all other source files:
Clean up "blank" lines that nevertheless contain space or tab
characters.
opt_util:
In between incr_sp and decr_sp instructions that are being eliminated,
allow assignments to stackvars as long as the stackvars are not used
as input. These assignments are also eliminated.
This set of changes includes most of the work necessary for
mode and determinism checking of higher-order predicates.
prog_io.m:
Change the syntax for lambda expressions: they need
to have a determinism declaration. Lambda
expressions must now look like this:
lambda([X::in, Y::out] is det, ...goal...).
                       ^^^^^^
Note that both the modes and the determinism are mandatory,
not optional.
hlds.m:
Insert a determinism field in the lambda_goal structure.
hlds_out.m, inlining.m, make_hlds.m, modes.m, polymorphism.m, quantification.m,
switch_detection.m, typecheck.m:
Modified to use lambda_goal/4 rather than lambda_goal/3.
prog_io.m:
Add a new field to the `ground' inst, of type `maybe(pred_inst_info)'.
We use this to store the modes and determinism of higher-order
predicate terms.
code_info.m, inst_match.m, mercury_to_mercury.m, mode_util.m, modes.m,
polymorphism.m, shapes.m, undef_modes.m:
Modified to handle higher-order pred modes:
use ground/2 rather than ground/1.
(Note that modes.m still requires a bit more work on this.)
llds.m:
Add a new field to the call_closure/3 instruction to hold the
caller address for use with profiling, since the C macros
require a caller address.
dupelim.m, frameopt.m, garbage_out.m, livemap.m, middle_rec.m, opt_debug.m,
opt_util.m, value_number.m, vn_*.m:
Modified to use call_closure/4 rather than call_closure/3.
mercury_to_mercury.m:
Export mercury_output_det for use by hlds_out.m.
frameopt:
Make the teardown map bidirectional, and export it.
peephole:
Add a new pattern to handle cases generated by fulljump optimization.
This pattern uses the teardownmap, but it is disabled for the moment.
optimize:
Pass the teardown map where it is needed, and make sure we do a
peephole pass immediately after frameopt to use the teardownmap
while it is still valid.
jumpopt:
Rename a variable.
labelopt:
A block being eliminated may have the last remaining reference
to the label starting another block. Therefore on the last invocation
of labelopt, we iterate to a fixpoint before returning.
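Why a single pass is not enough can be seen in a toy model in Python. The block table and the function name are hypothetical, and the sketch assumes every block ends in a jump, so no block is reached by fall-through; it is not the labelopt source.

```python
def live_labels(blocks, entry):
    """Drop blocks whose starting label is unreferenced, until a fixpoint.

    `blocks` maps each block's label to the list of labels it refers to.
    """
    live = dict(blocks)
    changed = True
    while changed:
        changed = False
        referenced = {entry}
        for targets in live.values():
            referenced.update(targets)
        for label in list(live):
            if label not in referenced:
                # Removing this block may remove the last reference to
                # another label, so another iteration is needed.
                del live[label]
                changed = True
    return set(live)


blocks = {"entry": ["a"], "a": [], "dead1": ["dead2"], "dead2": []}
remaining = live_labels(blocks, "entry")
```

In this example `dead2` only becomes unreferenced after `dead1` has been eliminated, so it is removed on the second iteration.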
opt_debug:
Add a predicate to help debug bidirectional teardown maps.
opt_util:
Liberalized some optimizations, but the changes are disabled for the
moment.
value_number:
Pass an empty teardown map to peephole. Loosen the sanity check on
tags a bit.
vn_order:
If the new value of a location depends on its old value,
avoid creating a circularity in the preferred order relation.
Such circularity may be broken arbitrarily, even though we have
a clear preference.
vn_flush:
Modify the criteria for saving the old value stored in a location
about to be overwritten, in an effort to eliminate useless copies
of old values.
vn_util:
Tighten the requirement for classifying a use as a "real" use,
also in the effort to eliminate useless saves of old values.
Recognize some more expression patterns as yielding known
results. Move some functionality from vn_flush to vn_util,
since it is needed by the other modification.
-------------------------------------------------------
Implement unique modes. We do not handle local aliasing yet, so this
is still not very useful, except for io__state. Destructive update is
not yet implemented. Also note that this really only implements
"mostly unique" variables that may be non-unique on backtracking - we
don't check that you don't backtrack over I/O, for example.
prog_io.m, mode_util.m, modes.m, inst_match.m:
Major changes to handle unique modes.
mercury_to_mercury.m, polymorphism.m, prog_out.m, undef_modes.m:
Use `ground(Uniqueness)' rather than just `ground'.
compiler/*.m:
Fix compile errors now that unique modes are enforced: add a
few calls to copy/2, and comment out lots of unique mode
declarations that caused problems.
typecheck.m, mode_info.m:
Hack around the use of unique modes, which doesn't work
because we don't allow local aliasing yet: make the insts
`uniq_type_info' and `uniq_mode_info' not unique at all,
and add a call to copy/2 when extracting the io_state from
type_info or mode_info.
-------------------------------------------------------
Plus a couple of unrelated changes:
hlds.m:
Change the modes for the special predicates from `ground -> ground'
to `in', so that any error messages that show those modes
come out looking nicer.
Add a new shared_inst_table for shared versions of user-defined
insts.
mercury_to_goedel.m:
Use string__is_alnum_or_underscore.
typecheck:
Improved the format of the message about calls with wrong arity.
jumpopt, optimize:
A goto whose target is the predicate entry label is replaced by
the pointed-to code only if a flag is set; optimize sets the flag
only after value numbering and frameopt. This means the pointed-to
code is in better shape when it is "inlined".
peephole, opt_util:
When optimizing incr_sp/decr_sp pairs, allow a restoration of succip
between them to be optimized away. This works because the only way
this can happen is if the store of succip in its slot was promoted
before the incr_sp, and no calls may have been made in the meantime,
so the original copy is still in succip.
frameopt, optimize:
Postponed the check for whether succip is ever restored, since
peephole may affect the decision.
follow_code:
We now push code from the outside context into this context before
pushing code from this context into nested contexts, since this may
give us more code to push. I also removed redundant references to
ModuleInfo.
prog_io:
Small formatting change.
options, optimize:
Add a new option, --optimize-fulljumps, which defaults on.
jumpopt, opt_util:
If --optimize-fulljumps is set, replace unconditional gotos with
the instruction sequence they point to. This not only avoids a jump
at runtime, but also increases basic block length and makes value
numbering more effective.
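The idea can be illustrated with a toy single-pass model in Python. The instruction tuples, the choice of `goto` and `proceed` as the only instructions that cannot fall through, and the function name are assumptions for this sketch; the real jumpopt applies further conditions.

```python
def inline_gotos(instrs):
    """Replace each `goto L` by a copy of the code that follows label L."""
    label_pos = {ins[1]: i for i, ins in enumerate(instrs) if ins[0] == "label"}

    def block_after(label):
        # Copy instructions up to and including the first one that
        # cannot fall through. Label definitions are not duplicated.
        block = []
        for ins in instrs[label_pos[label] + 1:]:
            if ins[0] == "label":
                continue
            block.append(ins)
            if ins[0] in ("goto", "proceed"):
                return block
        return None  # the code runs off the end; leave the goto alone

    out = []
    for ins in instrs:
        if ins[0] == "goto" and ins[1] in label_pos:
            block = block_after(ins[1])
            if block is not None:
                out.extend(block)
                continue
        out.append(ins)
    return out


code = [
    ("label", "entry"), ("assign", "r1", "r2"), ("goto", "done"),
    ("label", "other"), ("assign", "r3", "0"),
    ("label", "done"), ("decr_sp", 1), ("proceed",),
]
optimized = inline_gotos(code)
```

After the pass, the block starting at `entry` runs straight through to `proceed` without a jump, which is what lengthens the basic block.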
peephole:
Fulljump optimization can replace a recursive tailcall with the
initial part of the code of the procedure. Therefore peephole now
looks for a decr_sp followed by an incr_sp, and removes such pairs
from the instruction sequence.
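A minimal sketch of removing such pairs, written as a toy model in Python. It assumes instructions are tuples and that only pairs with the same frame size are cancelled; the function name is hypothetical and this is not the peephole source.

```python
def cancel_frame_pairs(instrs):
    """Remove each decr_sp N that is immediately followed by incr_sp N."""
    out = []
    for ins in instrs:
        if out and ins[0] == "incr_sp" and out[-1] == ("decr_sp", ins[1]):
            # Tearing down a frame and rebuilding one of the same size
            # leaves the stack pointer unchanged, so drop both.
            out.pop()
        else:
            out.append(ins)
    return out
```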
frameopt:
Do not consider a decr_sp followed by an incr_sp to be a fatal error
(just in case peephole is switched off).
vn_block:
Fix a bug tickled by fulljump optimization: maxfr, curfr and succip
were not required to be made up to date before an if_val exited
the extended basic block.
vn_util:
Simplify some more patterns of vnrvals. The extra patterns are
involved in testing conditions that are known to be true or false.
These patterns can arise when fulljump optimization replaces a
recursive tailcall.
code*.m & *gen.m:
Implement an improved method of handling negated contexts.
The new method avoids saving things onto the stack before
an if-then-else or negation if it can.
Also, fix the implementation of nondet if-then-else so that
it does the soft cut properly.
mercury_compile:
Sort the list of interface files before printing them to a .d file.
opt_util, peephole:
Fix a bug tickled by value numbering. Some sequences of code were
recognized as having no access to nondet stack control slots even
in the presence of such accesses, which led to the incorrect
introduction of succeed_discards.
value_number:
Loosen the value correspondence sanity check, which was failing
needlessly, and tighten the tag sanity check, which was passing
incorrect code.
Do not try value numbering on blocks containing structures such as
"if (tag(x) == X && field(X, x, X) == X) goto X", since these will
definitely lead to tag sanity check violations.
vn_flush:
If a shared node has no uses left when flushed, leave it be.
When generating a mkframe, reflect its update of the top redoip slot
in the data structures.
vn_order:
Some hacks to get the relmaps partway to where I want them. This
code needs cleaning up.
vn_debug:
New debugging routines to support my changes to vn_order.
vn_type:
Deleted the vn_modframe vn_instr, since its role has been taken over
by assignments to redoip(maxfr).
opt_debug:
Reflect the change to vn_type, print address constants in vn_rvals,
and fix a typo.
vn_block, vn_util:
Reflect the change to vn_type.
frameopt, opt_util:
Attempt to fill delay slots with the instruction after an if_val
in preference to the saving of the succip.
optimize:
Fix a typo in earlier change.
value_number:
Check that the last node in the order is a control node.
vn_order:
If two registers or stackvars can be generated in any order,
prefer to generate them in numerical sequence for neatness.
vn_debug:
Add routine for printing the initial and final ordering of
unrelated nodes.