Estimated hours taken: 400
Branches: main
This diff implements stack slot optimization for the LLDS back end based on
the idea that after a unification such as A = f(B, C, D), saving the
variable A on the stack indirectly also saves the values of B, C and D.
Figuring out what subset of {B,C,D} to access via A and what subset to access
via their own stack slots is a tricky optimization problem. The algorithm we
use to solve it is described in the paper "Using the heap to eliminate stack
accesses" by Zoltan Somogyi and Peter Stuckey, available in ~zs/rep/stackslot.
That paper also describes (and has examples of) the source-to-source
transformation that implements the optimization.
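As a rough illustration of the underlying idea only (this is not the algorithm from the paper), the C sketch below models the heap cell built by A = f(B, C, D) and two stack frame layouts. All type and function names here are invented for the illustration; the real transformation works on HLDS goals, not on C structs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heap cell built by the unification A = f(B, C, D). */
typedef struct {
    long b;
    long c;
    long d;
} Cell;

/* Frame that saves A and also gives B, C and D their own stack slots. */
typedef struct {
    const Cell *a;
    long b;
    long c;
    long d;
} FrameAllSlots;

/* Frame that saves only A; B, C and D are reloaded through the cell. */
typedef struct {
    const Cell *a;
} FrameCellOnly;

/* Before a call: fill the four-slot frame from the cell. */
static FrameAllSlots save_all(const Cell *a)
{
    assert(a != NULL);
    FrameAllSlots f = { a, a->b, a->c, a->d };
    return f;
}

/* After the call: read the values from their own slots. */
static long sum_from_slots(const FrameAllSlots *f)
{
    return f->b + f->c + f->d;
}

/* After the call: read the values through the saved cell pointer. */
static long sum_via_cell(const FrameCellOnly *f)
{
    return f->a->b + f->a->c + f->a->d;
}
```

Both frames yield the same values after the call, but the second frame is smaller and needs fewer stores before the call, at the price of an extra indirection on each later access. Deciding per variable which side of that trade-off wins is the optimization problem described above.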
The optimization needs to know what variables are flushed at call sites
and at program points that establish resume points (e.g. entries to
disjunctions and if-then-elses). We already had code to compute this
information in live_vars.m, but this code was being invoked too late.
This diff modifies live_vars.m to allow it to be invoked both by the stack
slot optimization transformation and by the code generator, and allows its
function to be tailored to the requirements of each invocation.
The information computed by live_vars.m is specific to the LLDS back end,
since the MLDS back ends do not (yet) have the same control over stack
frame layout. We therefore store this information in a new back end specific
field in goal_infos. For uniformity, we make all the other existing back end
specific fields in goal_infos, as well as the similarly back end specific
store map field of goal_exprs, subfields of this new field. This happens
to significantly reduce the sizes of goal_infos.
To allow a more meaningful comparison of the gains produced by the new
optimization, do not save any variables across erroneous calls even if
the new optimization is not enabled.
compiler/stack_opt.m:
New module containing the code that performs the transformation
to optimize stack slot usage.
compiler/matching.m:
New module containing an algorithm for maximal matching in bipartite
graphs, specialized for the graphs needed by stack_opt.m.
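For readers unfamiliar with bipartite matching, here is a minimal generic sketch in C using augmenting paths. It is an assumption that this resembles what matching.m does; the real module is written in Mercury and is specialized for the graphs built by stack_opt.m. The type Bipartite, the size limits and the function names are all hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_LEFT 16
#define MAX_RIGHT 16

/* adj[u][v] is nonzero if left node u may be matched with right node v. */
typedef struct {
    int n_left;
    int n_right;
    int adj[MAX_LEFT][MAX_RIGHT];
    int match_right[MAX_RIGHT];   /* left partner of each right node, or -1 */
} Bipartite;

/* Try to find an augmenting path starting at left node u. */
static int try_augment(Bipartite *g, int u, int *visited)
{
    for (int v = 0; v < g->n_right; v++) {
        if (!g->adj[u][v] || visited[v]) {
            continue;
        }
        visited[v] = 1;
        /* v is free, or its current partner can be moved elsewhere. */
        if (g->match_right[v] < 0
            || try_augment(g, g->match_right[v], visited))
        {
            g->match_right[v] = u;
            return 1;
        }
    }
    return 0;
}

/* Return the size of a maximum-cardinality matching. */
static int maximum_matching(Bipartite *g)
{
    int size = 0;
    assert(g->n_left <= MAX_LEFT && g->n_right <= MAX_RIGHT);
    for (int v = 0; v < g->n_right; v++) {
        g->match_right[v] = -1;
    }
    for (int u = 0; u < g->n_left; u++) {
        int visited[MAX_RIGHT];
        memset(visited, 0, sizeof visited);
        if (try_augment(g, u, visited)) {
            size++;
        }
    }
    return size;
}
```

After the call, match_right records which left node each right node is paired with.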
compiler/mercury_compile.m:
Invoke the new optimization if the options ask for it.
compiler/stack_alloc.m:
New module containing code that is shared between the old,
non-optimizing stack slot allocation system and the new, optimizing
stack slot allocation system, and the code for actually allocating
stack slots in the absence of optimization.
Live_vars.m used to have two tasks: finding out what variables need to be
saved on the stack, and allocating those variables to stack slots.
Live_vars.m now does only the first task; stack_alloc.m now does
the second, using code that used to be in live_vars.m.
compiler/trace_params.m:
Add a new function to test the trace level, which returns yes if we
want to preserve the values of the input headvars.
compiler/notes/compiler_design.html:
Document the new modules (as well as trace_params.m, which wasn't
documented earlier).
compiler/live_vars.m:
Delete the code that is now in stack_alloc.m and graph_colour.m.
Separate out the kinds of stack uses due to nondeterminism: the stack
slots used by nondet calls, and the stack slots used by resumption
points, in order to allow the reuse of stack slots used by resumption
points after execution has left their scope. This should allow the
same stack slots to be used by different variables in the resumption
point at the start of an else branch and nondet calls in the then
branch, since the resumption point of the else branch is not in effect
when the then branch is executed.
If the new option --opt-no-return-calls is set, then say that we do not
need to save any values across erroneous calls.
Use type classes to allow the information generated by this module
to be recorded in the way required by its invoker.
Package up the data structures being passed around readonly into a
single tuple.
compiler/store_alloc.m:
Allow this module to be invoked by stack_opt.m without invoking the
follow_vars transformation, since applying follow_vars before the form
of the HLDS code is otherwise final can be a pessimization.
Make the module_info a part of the record containing the readonly data
passed around during the traversal.
compiler/common.m:
Do not delete or move around unifications created by stack_opt.m.
compiler/call_gen.m:
compiler/code_info.m:
compiler/continuation_info.m:
compiler/var_locn.m:
Allow the code generator to delete its last record of the location
of a value when generating code to make an erroneous call, if the new
--opt-no-return-calls option is set.
compiler/code_gen.m:
Use a more useful algorithm to create the messages/comments that
we put into incr_sp instructions, e.g. by distinguishing between
predicates and functions. This is to allow the new scripts in the
tool directory to gather statistics about the effect of the
optimization on stack frame sizes.
library/exception.m:
Make a hand-written incr_sp follow the new pattern.
compiler/arg_info.m:
Add predicates to figure out the set of input, output and unused
arguments of a procedure in several different circumstances.
Previously, variants of these predicates were repeated in several
places.
compiler/goal_util.m:
Export some previously private utility predicates.
compiler/handle_options.m:
Turn off stack slot optimizations when debugging, unless
--trace-optimized is set.
Add a new dump format useful for debugging --optimize-saved-vars.
compiler/hlds_llds.m:
New module for handling all the stuff specific to the LLDS back end
in HLDS goal_infos.
compiler/hlds_goal.m:
Move all the relevant stuff into the new back end specific field
in goal_infos.
compiler/notes/allocation.html:
Update the documentation of store maps to reflect their movement
into a subfield of goal_infos.
compiler/*.m:
Minor changes to accommodate the movement of all back end specific
information about goals from goal_exprs and individual fields of
goal_infos into a new field in goal_infos that gathers together
all back end specific information.
compiler/use_local_vars.m:
Look for sequences in which several instructions use a fake register
or stack slot as a base register pointing to a cell, and make those
instructions use a local variable instead.
Without this, a key assumption of the stack slot optimization,
that accessing a field in a cell costs only one load or store
instruction, would be much less likely to be true. (With this
optimization, the assumption will be false only if the C compiler's
code generator runs out of registers in a basic block, which for
the code we generate should be unlikely even on x86s.)
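The effect of this transformation can be pictured with the following C sketch. It is illustrative only: Word, the function names and the loop are assumptions, and the real pass rewrites LLDS instructions rather than C source.

```c
#include <assert.h>
#include <stddef.h>

typedef long Word;

/*
 * Before: the cell pointer lives in a stack slot (memory), and each
 * field access names that slot as its base, so each access may cost
 * a load of the slot plus a load of the field.
 */
static Word sum_fields_via_slot(Word **stack_slot, size_t n_fields)
{
    Word total = 0;
    assert(stack_slot != NULL && *stack_slot != NULL);
    for (size_t i = 0; i < n_fields; i++) {
        total += (*stack_slot)[i];
    }
    return total;
}

/*
 * After: the cell pointer is copied once into a C local variable,
 * which the C compiler can keep in a machine register, so each field
 * access should need only one load.
 */
static Word sum_fields_via_local(Word **stack_slot, size_t n_fields)
{
    Word total = 0;
    Word *base;
    assert(stack_slot != NULL && *stack_slot != NULL);
    base = *stack_slot;
    for (size_t i = 0; i < n_fields; i++) {
        total += base[i];
    }
    return total;
}
```

Both functions compute the same result; only the number of memory accesses the C compiler is likely to emit differs.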
compiler/options.m:
Make the old option --optimize-saved-vars ask for both the old stack
slot optimization (implemented by saved_vars.m) that only eliminates
the storing of constants in stack slots, and the new optimization.
Add two new options --optimize-saved-vars-{const,cell} to turn on
the two optimizations separately.
Add a bunch of options to specify the parameters of the new
optimizations, both in stack_opt.m and use_local_vars.m. These are
for implementors only; they are deliberately not documented.
Add a new option, --opt-no-return-calls, that governs whether we avoid
saving variables on the stack at calls that cannot return, either by
succeeding or by failing. This is for implementors only, and thus
deliberately documented only in comments. It is enabled by default.
compiler/optimize.m:
Transmit the value of a new option to use_local_vars.m.
doc/user_guide.texi:
Update the documentation of --optimize-saved-vars.
library/tree234.m:
Undo a previous change of mine that effectively applied this
optimization by hand. That change complicated the code, and now
the compiler can do the optimization automatically.
tools/extract_incr_sp:
A new script for extracting stack frame sizes and messages from
stack increment operations in the C code for LLDS grades.
tools/frame_sizes:
A new script that uses extract_incr_sp to extract information about
stack frame sizes from the C files saved from a stage 2 directory
by makebatch and summarizes the resulting information.
tools/avg_frame_size:
A new script that computes average stack frame sizes from the files
created by frame_sizes.
tools/compare_frame_sizes:
A new script that compares the stack frame size information
extracted from two different stage 2 directories by frame_sizes,
reporting on both average stack frame sizes and on specific procedures
that have different stack frame sizes in the two versions.
The main aim of this change is to make the overall, high-level structure
of the compiler clearer, and to encourage better encapsulation of the
major components.
compiler/libs.m:
compiler/backend_libs.m:
compiler/parse_tree.m:
compiler/hlds.m:
compiler/check_hlds.m:
compiler/transform_hlds.m:
compiler/bytecode_backend.m:
compiler/aditi_backend.m:
compiler/ml_backend.m:
compiler/ll_backend.m:
compiler/top_level.m:
New files. One module for each of the major components of the
Mercury compiler. These modules contain (as separate sub-modules)
all the other modules in the Mercury compiler, except gcc.m and
mlds_to_gcc.m.
Mmakefile:
compiler/Mmakefile:
Handle the fact that the top-level module is now `top_level',
not `mercury_compile' (since `mercury_compile' is a sub-module
of `top_level').
compiler/Mmakefile:
Update settings of *FLAGS-<modulename> to use the appropriate
nested module names.
compiler/recompilation_check.m:
compiler/recompilation_version.m:
compiler/recompilation_usage.m:
compiler/recompilation.check.m:
compiler/recompilation.version.m:
compiler/recompilation.usage.m:
Convert the `recompilation_*' modules into sub-modules of the
`recompilation' module.
compiler/*.m:
compiler/*.pp:
Module-qualify the module names in `:- module', `:- import_module',
and `:- use_module' declarations.
compiler/base_type_info.m:
compiler/base_type_layout.m:
Deleted these unused empty modules.
compiler/prog_data.m:
compiler/globals.m:
Move the `foreign_language' type from prog_data to globals.
compiler/mlds.m:
compiler/ml_util.m:
compiler/mlds_to_il.m:
Import `globals', for `foreign_language'.
Mmake.common.in:
trace/Mmakefile:
runtime/Mmakefile:
Rename the %.check.c targets as %.check_hdr.c,
to avoid conflicts with compiler/recompilation.check.c.
Estimated hours taken: 2
compiler/code_exprn.m:
Fix an old bug: the code for allocating spare registers was using a
different and more liberal test for a register being free than the
code for assigning to that register. This bug caused a code generator
abort on the icfp2000 entry at -O6 with intermodule optimization.
Estimated hours taken: 140
Add an alternative to code_exprn that does eager code generation (code_exprn
always does lazy code generation). Its main advantages are that the new code
is significantly simpler, and that it does not generate unnecessary shuffling
code. Its main disadvantage, which is that it does not eliminate the creation
of unneeded cells, can be eliminated by switching on --unneeded-code.
For now, you can select the use of the new code generator with the
--no-lazy-code option (which was previously present but unused).
This will be made the default later, after I do more performance tests.
Var_locn contains stricter self-checks than code_exprn does. This required
modifications to some other parts of the code generator to ensure that the
self-checks do not fail unnecessarily. (This mostly took the form of explicitly
killing off dead variables before calling code_info__clear_all_registers, which
would complain about losing the last record of the value of a variable that was
alive as far as it knew.) To make my changes simpler, I also took the opportunity
to simplify parts of the code generator which were handing around rvals that
in fact had to be wrappers around lvals, by handing around the lvals directly.
Testing this change also required fixing an old bug which prevented compiling
the library with -O1 --trace deep, together with the usual intermodule
optimization. The bug is that a library module reads predicates from
builtin.opt or private_builtin.opt, does not eliminate them because of the -O1,
and then tries to generate traced code for them. However, this fails because
the builtin modules contain some predicates that cannot be made to conform to
typeinfo-liveness, which is required by tracing.
compiler/var_locn.m:
The new module that implements eager code generation.
compiler/follow_vars.m:
Improve the follow_vars pass, since eager code generation requires
better follow_vars information. We now generate correct information
for generic calls, and record not only where some vars (e.g. those
which appear as input arguments of following calls) should be put,
but also which registers are not reserved for those variables and
are thus available for other variables.
compiler/hlds_goal.m:
Modify the follow_vars field of the goal_info to record the number
of the first non-reserved register.
compiler/code_info.m:
Replace the general-purpose predicate code_info__cache_exprn, which
associated a variable with an rval without generating code, with a set
of special-purpose predicates such as code_info__assign_const_to_var
and code_info__assign_cell_to_var, some of which can generate code.
These new predicates and some older ones (e.g. code_info__setup_call)
now choose at runtime whether to call code_exprn or var_locn. The
basis for the decision is checking whether the code_info structure
contains an exprn_info or a var_locn_info. This is decided in
code_info__init based on the value of the lazy_code option, and
maintained unchanged from then on.
Rename some predicates to better reflect their current possible
behaviors.
compiler/unify_gen.m:
Call the new special-purpose predicates in code_info instead of
code_info__cache_exprn.
Replace an incorrect clause with a call to error, since that clause
could never be invoked.
compiler/call_gen.m:
Hand over the task of generating the args of generic calls to
code_info, since it already has code to do the right thing, which
includes reserving the registers to be used for the input args.
Notify the rest of the code generator after the last use of
non-forward-live variables, in order to avoid spurious calls to error
(it is an error to clobber the last location of a live variable).
Notify the rest of the code generator when generic calls overwrite
registers, to allow the proper consistency checks to be made.
If an output variable is singleton, then do not make it known to the
code generator. It will never become dead, and may thus cause a
spurious compiler abort if its storage is ever clobbered.
Export a predicate for use by follow_vars.
Factor out some common code.
Call the new preds in code_info where necessary.
compiler/pragma_c_gen.m:
Notify the rest of the code generator after the last use of
non-forward-live variables, in order to avoid spurious calls to error
(it is an error to clobber the last location of a live variable).
If an output variable is singleton, then do not make it known to the
code generator. It will never become dead, and may thus cause a
spurious compiler abort if its storage is ever clobbered.
When using var_locn, ensure that none of the input arguments of a
model_semi pragma_c_code is assigned to r1. If we did, and the last
reference to the value of that argument was after an assignment to
SUCCESS_INDICATOR, the C compiler would be forced to generate code
to shuffle the value of the argument out of the way.
compiler/code_exprn.m:
Minor changes to return lvals directly instead of lvals wrapped inside
rvals and to conform to the new format of follow_vars.
Do not include the registers reserved by follow_vars in the
search for a spare register.
compiler/lookup_switch.m:
compiler/switch_gen.m:
Fix an old bug that did not matter with code_exprn but does matter with
var_locn: the branch end structure was being computed in the wrong
place.
compiler/disj_gen.m:
At the ends of non-last disjuncts, kill off the variables that we
needed to know inside the disjunct but won't need to know after the
disjunct, in order to avoid error messages about throwing away their
state. The variables affected are those which are needed only by the
resumption point of the next disjunct, not by enclosing resumption
points or forward execution.
compiler/arg_info.m:
Associate an lval, not an rval, with each argument.
compiler/*.m:
Minor changes to conform to (a) the new format of follow_vars, (b)
the replacement of rvals containing lvals by lvals.
compiler/code_util.m:
Add some utility predicates for var_locn.m.
compiler/exprn_aux.m:
Add some utility functions for var_locn.m.
Export a predicate for var_locn.m.
compiler/handle_options.m:
If --no-lazy-code is set, switch on the "optimizations" on whose
presence it depends.
compiler/mercury_compile.m:
compiler/code_gen.m:
Turn off tracing for predicates that don't obey typeinfo liveness
for backend_by_preds and backend_by_phases respectively.
Look up options in the globals structure in the module_info, not in the
globals structure in the I/O state, since this is where we turn off
tracing. (We should later make sure that other parts of the compiler
are also consistent on this issue.)
compiler/stack_layout.m:
Throw away any continuation_info structures that belong to predicates
that don't obey typeinfo liveness.
Estimated hours taken: 2
Add information required for structure reuse and compile time garbage
collection to the LLDS. The code generator does not yet generate
this information.
This will be committed to the main branch to avoid CVS conflicts.
compiler/llds.m:
Add an LLDS instruction `free_heap(rval)', which applies the
MR_free_heap macro to its argument.
Add a `maybe(rval)' field to `create' rvals to hold the address
of a cell to reuse. This field should always be `no' after
code generation, because all non-constant creates are converted
into lower-level operations during code generation.
compiler/value_number.m:
Don't reorder instructions around a `free_heap' instruction
to avoid generating code which looks at deallocated memory.
compiler/*.m:
Handle the new instruction and field.
Estimated hours taken: 16
Allow the compiler to handle create rvals whose arguments have a size
which is different from the size of a word. Use this capability to reduce
the size of RTTI information, in two ways.
The first way is by rearranging the way in which we represent information
about the live values at a label. Instead of an array with an entry for
each live value, the entry being a pair of Words containing a shape
representation and a location description respectively, use an array
of shape representations (still Words), followed by an array of 32-bit ints
(which may be smaller than Word) describing locations whose descriptions
don't fit into 8 bits, followed by an array of 8-bit ints describing
locations whose descriptions do fit into 8 bits.
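A back-of-the-envelope comparison of the two layouts can be sketched in C as follows. The function names are hypothetical, the word size is passed in explicitly, and alignment padding between the arrays is ignored, so the figures are approximations rather than the sizes the compiler actually emits.

```c
#include <assert.h>
#include <stddef.h>

/* Old layout: one (shape, location) pair of words per live value. */
static size_t old_layout_bytes(size_t word_bytes, size_t n_long,
    size_t n_short)
{
    return (n_long + n_short) * 2 * word_bytes;
}

/*
 * New layout: one shape word per live value, then one 32-bit entry per
 * long location description, then one 8-bit entry per short one.
 * Padding between the arrays is not counted.
 */
static size_t new_layout_bytes(size_t word_bytes, size_t n_long,
    size_t n_short)
{
    size_t n_values = n_long + n_short;
    assert(word_bytes >= 4);
    return n_values * word_bytes + n_long * 4 + n_short * 1;
}
```

For example, with 8-byte words, 2 long descriptions and 5 short ones, the old layout takes 112 bytes and the new one 69 bytes under these assumptions.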
The second way is by reducing the sizes of some fields in the C structs
used for RTTI. Several of these had to be bigger than necessary in the
past because their fields were represented by the args of a create rval.
On cyclone, this reduces the size of the object file for queens.m by 2.8%.
IMPORTANT
Until this change is reflected in the installed compiler, you will not be
able to use any modules compiled with debugging in your workspaces if the
workspace has been updated to include this change. This is because the RTTI
data structures generated by the old installed compiler will not be compatible
with the new structure definitions.
The workaround is simple: if your workspace contains modules compiled with
debugging, don't do a cvs update until this change has been installed.
configure.in:
Check whether <stdint.h> is present. If not, autoconfigure
types that are at least 16 and 32 bits in size.
runtime/mercury_conf.h.in:
Mention the macros used by the configure script, MR_INT_LEAST32_TYPE
and MR_INT_LEAST16_TYPE.
runtime/mercury_conf_param.h:
Document the macros used by the configure script, MR_INT_LEAST32_TYPE
and MR_INT_LEAST16_TYPE.
runtime/mercury_types.h:
If <stdint.h> is available, get the basic integer types (intptr_t,
int_least8_t, etc) from there. Otherwise, get them from the
autoconfigure script. Define types such as Word in terms of these
(eventually) standard types.
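The selection logic might look roughly like the C sketch below. MR_INT_LEAST16_TYPE and MR_INT_LEAST32_TYPE are the macros named above; SKETCH_HAVE_STDINT_H, the sketch_ type names and the fallback defaults are assumptions made for this example and are not taken from the runtime headers.

```c
#include <assert.h>
#include <limits.h>

/*
 * SKETCH_HAVE_STDINT_H is a hypothetical stand-in for whatever macro
 * the configure script defines when <stdint.h> is present.
 */
#if defined(SKETCH_HAVE_STDINT_H)
  #include <stdint.h>
  typedef int_least16_t sketch_int_least16;
  typedef int_least32_t sketch_int_least32;
#else
  /*
   * In a real build these would be set by configure. The defaults
   * below are assumptions; ISO C guarantees that short has at least
   * 16 bits and long at least 32 bits.
   */
  #ifndef MR_INT_LEAST16_TYPE
    #define MR_INT_LEAST16_TYPE short
  #endif
  #ifndef MR_INT_LEAST32_TYPE
    #define MR_INT_LEAST32_TYPE long
  #endif
  typedef MR_INT_LEAST16_TYPE sketch_int_least16;
  typedef MR_INT_LEAST32_TYPE sketch_int_least32;
#endif

/* Return the width in bits of the "at least 32 bits" type. */
static int sketch_least32_bits(void)
{
    int bits = (int) (sizeof(sketch_int_least32) * CHAR_BIT);
    assert(bits >= 32);
    return bits;
}
```

Either branch yields types that are at least as wide as required, which is all the RTTI structures need.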
runtime/mercury_stack_layout.h:
Add macros for manipulating short location descriptions, update the
types and macros for manipulating long location descriptions.
Modify the way the variable count is represented (since it now must
count locations with long and short descriptions separately),
and move it to the structure containing the arrays it describes.
Reduce the size of some fields in structs. This required some
reordering of fields to avoid the insertion of padding by the compiler,
and changes to the definitions of some types (e.g. MR_determinism).
runtime/mercury_layout_util.[ch]:
runtime/mercury_stack_trace.c:
runtime/mercury_accurate_gc.c:
trace/mercury_trace.c:
trace/mercury_trace_external.c:
trace/mercury_trace_internal.c:
Update the code to conform to the changes to stack_layout.h.
compiler/llds.m:
Modify the create rval in two ways. First, add an extra argument to
represent the types of the arguments, which used to be implicit and
always a word in size, but may now be explicit and possibly smaller
(e.g. uint_least8). Second, since the code generator would do the wrong
thing with creates with smaller than wordsize arguments, replace
the old must-be-unique vs may-be-nonunique bool with a three-valued
marker, must_be_dynamic vs must_be_static vs can_be_either.
Add uint_least8, uint_least16, uint_least32 (and their signed variants)
and string as llds_types.
Add a couple of utility predicates for checking whether an llds_type
denotes a type whose size is the same as word.
compiler/llds_out.m:
Use explicitly given argument types when declaring and initializing
the arguments of a cell, if they are given.
compiler/llds_common.m:
Don't conflate creates with identical argument values but different
C-level argument types. The probability of a match is minuscule anyway.
compiler/stack_layout.m:
Use the new representation of creates to generate the new versions of
RTTI data structures.
compiler/code_exprn.m:
If a create is marked must_be_static, don't inspect the arguments
to decide whether it can be static or not. If it can't, we'll get
an abort later on in llds_out or during C compilation anyway.
compiler/base_type_layout.m:
When creating pseudo typeinfos, return the llds_type of the resulting
rval.
Minor changes required by the change in create.
compiler/base_type_info.m:
compiler/base_typeclass_info.m:
compiler/code_util.m:
compiler/dupelim.m:
compiler/exprn_aux.m:
compiler/jumpopt.m:
compiler/livemap.m:
compiler/lookup_switch.m:
compiler/middle_rec.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/string_switch.m:
compiler/unify_gen.m:
compiler/vn_cost.m:
compiler/vn_filter.m:
compiler/vn_flush.m:
compiler/vn_order.m:
compiler/vn_type.m:
compiler/vn_util.m:
compiler/vn_verify.m:
Minor changes required by the change in create.
library/benchmarking.m:
library/std_util.m:
Use the new macros in hand-constructing proc layout structures.
library/Mmakefile:
Add explicit dependencies for benchmarking.o and std_util.o
on ../runtime/mercury_stack_layout.h. Although this is only a subset
of the truth (in reality, all library objects depend on most of the
runtime headers), it is a good tradeoff between safety and efficiency.
The other runtime header files tend not to change in incompatible ways.
trace/Mmakefile:
Add explicit dependencies for all the object files on
../runtime/mercury_stack_layout.h, for similar reasons.
Estimated hours taken: 20
These changes make `var' and `term' polymorphic. This allows us to make
variables and terms representing types of a different type to those
representing program terms and those representing insts.
These changes do not *fix* any existing problems (for instance
there was a messy conflation of program variables and inst variables,
and where necessary I've just called varset__init(InstVarSet) with
an XXX comment).
NEWS:
Mention the changes to the standard library.
library/term.m:
Make term, var and var_supply polymorphic.
Add new predicates:
term__generic_term/1
term__coerce/2
term__coerce_var/2
term__coerce_var_supply/2
library/varset.m:
Make varset polymorphic.
Add the new predicate:
varset__coerce/2
compiler/prog_data.m:
Introduce type equivalences for the different kinds of
vars, terms, and varsets that we use (tvar and tvarset
were already there but have been changed to use the
polymorphic var and term).
Also change the various kinds of items to use the appropriate
kinds of var/varset.
compiler/*.m:
Thousands of boring changes to make the compiler type correct
with the different types for type, program and inst vars and
varsets.
Estimated hours taken: 50
Add support for nested modules.
- module names may themselves be module-qualified
- modules may contain `:- include_module' declarations
which name sub-modules
- a sub-module has access to all the declarations in the
parent module (including its implementation section).
This support is not yet complete; see the BUGS and LIMITATIONS below.
LIMITATIONS
- source file names must match module names
(just as they did previously)
- mmc doesn't allow path names on the command line any more
(e.g. `mmc --make-int ../library/foo.m').
- import_module declarations must use the fully-qualified module name
- module qualifiers must use the fully-qualified module name
- no support for root-qualified module names
(e.g. `:parent:child' instead of `parent:child').
- modules may not be physically nested (only logical nesting, via
`include_module').
BUGS
- doesn't check that the parent module is imported/used before allowing
import/use of its sub-modules.
- doesn't check that there is an include_module declaration in the
parent for each module claiming to be a child of that parent
- privacy of private modules is not enforced
-------------------
NEWS:
Mention that we support nested modules.
library/ops.m:
library/nc_builtin.nl:
library/sp_builtin.nl:
compiler/mercury_to_mercury.m:
Add `include_module' as a new prefix operator.
Change the associativity of `:' from xfy to yfx
(since this made parsing module qualifiers slightly easier).
compiler/prog_data.m:
Add new `include_module' declaration.
Change the `module_name' and `module_specifier' types
from strings to sym_names, so that module names can
themselves be module qualified.
compiler/modules.m:
Add predicates module_name_to_file_name/2 and
file_name_to_module_name/2.
Lots of changes to handle parent module dependencies,
to create parent interface (`.int0') files, to read them in,
to output correct dependencies information for them to the
`.d' and `.dep' files, etc.
Rewrite a lot of the code to improve the readability
(add comments, use subroutines, better variable names).
Also fix a couple of bugs:
- generate_dependencies was using the transitive implementation
dependencies rather than the transitive interface dependencies
to compute the `.int3' dependencies when writing `.d' files
(this bug was introduced during crs's changes to support
`.trans_opt' files)
- when creating the `.int' file, it was reading in the
interfaces for modules imported in the implementation section,
not just those in the interface section.
This meant that the compiler missed a lot of errors.
library/graph.m:
library/lexer.m:
library/term.m:
library/term_io.m:
library/varset.m:
compiler/*.m:
Add `:- import_module' declarations to the interface needed
by declarations in the interface. (The previous version
of the compiler did not detect these missing interface imports,
due to the above-mentioned bug in modules.m.)
compiler/mercury_compile.m:
compiler/intermod.m:
Change mercury_compile__maybe_grab_optfiles and
intermod__grab_optfiles so that they grab the opt files for
parent modules as well as the ones for imported modules.
compiler/mercury_compile.m:
Minor changes to handle parent module dependencies.
(Also improve the wording of the warning about trans-opt
dependencies.)
compiler/make_hlds.m:
compiler/module_qual.m:
Ignore `:- include_module' declarations.
compiler/module_qual.m:
A couple of small changes to handle nested module names.
compiler/prog_out.m:
compiler/prog_util.m:
Add new predicates string_to_sym_name/3 (prog_util.m) and
sym_name_to_string/{2,3} (prog_out.m).
compiler/*.m:
Replace many occurrences of `string' with `module_name'.
Change code that prints out module names or converts
them to strings or filenames to handle the fact that
module names are now sym_names instead of strings.
Also change a few places (e.g. in intermod.m, hlds_module.m)
where the code assumed that any qualified symbol was
fully-qualified.
compiler/prog_io.m:
compiler/prog_io_goal.m:
Move sym_name_and_args/3, parse_qualified_term/4 and
parse_qualified_term/5 preds from prog_io_goal.m to prog_io.m,
since they are very similar to the parse_symbol_name/2 predicate
already in prog_io.m. Rewrite these predicates, both
to improve maintainability, and to handle the newly
allowed syntax (module-qualified module names).
Rename parse_qualified_term/5 as `parse_implicit_qualified_term'.
compiler/prog_io.m:
Rewrite the handling of `:- module' and `:- end_module'
declarations, so that it can handle nested modules.
Add code to parse `include_module' declarations.
compiler/prog_util.m:
compiler/*.m:
Add new predicates mercury_public_builtin_module/1 and
mercury_private_builtin_module/1 in prog_util.m.
Change most of the hard-coded occurrences of "mercury_builtin"
to call mercury_private_builtin_module/1 or
mercury_public_builtin_module/1 or both.
compiler/llds_out.m:
Add llds_out__sym_name_mangle/2, for mangling module names.
compiler/special_pred.m:
compiler/mode_util.m:
compiler/clause_to_proc.m:
compiler/prog_io_goal.m:
compiler/lambda.m:
compiler/polymorphism.m:
Move the predicates in_mode/1, out_mode/1, and uo_mode/1
from special_pred.m to mode_util.m, and change various
hard-coded definitions to instead call these predicates.
compiler/polymorphism.m:
Ensure that the type names `type_info' and `typeclass_info' are
module-qualified in the generated code. This avoids a problem
where the code generated by polymorphism.m was not considered
type-correct, due to the type `type_info' not matching
`mercury_builtin:type_info'.
compiler/check_typeclass.m:
Simplify the code for check_instance_pred and
get_matching_instance_pred_ids.
compiler/mercury_compile.m:
compiler/modules.m:
Disallow directory names in command-line arguments.
compiler/options.m:
compiler/handle_options.m:
compiler/mercury_compile.m:
compiler/modules.m:
Add a `--make-private-interface' option.
The private interface file `<module>.int0' contains
all the declarations in the module; it is used for
compiling sub-modules.
scripts/Mmake.rules:
scripts/Mmake.vars.in:
Add support for creating `.int0' and `.date0' files
by invoking mmc with `--make-private-interface'.
doc/user_guide.texi:
Document `--make-private-interface' and the `.int0'
and `.date0' file extensions.
doc/reference_manual.texi:
Document nested modules.
util/mdemangle.c:
profiler/demangle.m:
Demangle names with multiple module qualifiers.
tests/general/Mmakefile:
tests/general/string_format_test.m:
tests/general/string_format_test.exp:
tests/general/string__format_test.m:
tests/general/string__format_test.exp:
tests/general/.cvsignore:
Change the `:- module string__format_test' declaration in
`string__format_test.m' to `:- module string_format_test',
because with the original declaration the `__' was taken
as a module qualifier, which led to an error message.
Hence rename the file accordingly, to avoid the warning
about file name not matching module name.
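To see why the old name was a problem, consider this small C sketch that splits a name at its last `__`. It only mimics the observable effect described above; the function name and its interface are hypothetical, and the compiler's real parsing code is written in Mercury.

```c
#include <assert.h>
#include <string.h>

/*
 * Split `name` at the last occurrence of `sep`. Copies the part before
 * it (the module qualifier, or "" if there is none) into `qualifier`
 * and returns a pointer to the unqualified part inside `name`.
 */
static const char *split_last_qualifier(const char *name, const char *sep,
    char *qualifier, size_t qualifier_size)
{
    size_t sep_len = strlen(sep);
    const char *last = NULL;
    const char *p = name;
    size_t len;

    assert(sep_len > 0 && qualifier_size > 0);
    while ((p = strstr(p, sep)) != NULL) {
        last = p;
        p += sep_len;
    }
    if (last == NULL) {
        qualifier[0] = '\0';
        return name;
    }
    len = (size_t) (last - name);
    if (len >= qualifier_size) {
        len = qualifier_size - 1;   /* truncate rather than overflow */
    }
    memcpy(qualifier, name, len);
    qualifier[len] = '\0';
    return last + sep_len;
}
```

With separator `__`, the name `string__format_test` splits into qualifier `string` and base name `format_test`, whereas `string_format_test` has no qualifier at all.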
tests/invalid/Mmakefile:
tests/invalid/missing_interface_import.m:
tests/invalid/missing_interface_import.err_exp:
Regression test to check that the compiler reports
errors for missing `import_module' in the interface section.
tests/invalid/*.err_exp:
tests/warnings/unused_args_test.exp:
tests/warnings/unused_import.exp:
Update the expected diagnostics output for the test cases to
reflect a few minor changes to the warning messages.
tests/hard_coded/Mmakefile:
tests/hard_coded/parent.m:
tests/hard_coded/parent.child.m:
tests/hard_coded/parent.exp:
tests/hard_coded/parent2.m:
tests/hard_coded/parent2.child.m:
tests/hard_coded/parent2.exp:
Two simple test cases for the use of nested modules with
separate compilation.
Estimated hours taken: 0.75
library/*.m:
compiler/*.m:
Undo Zoltan's bogus update of all the copyright dates.
The dates in the copyright header should reflect the years
in which the file was modified (and no, changes to the
copyright header itself don't count as modifications).
Estimated hours taken: 20
Give duplicate code elimination more teeth in dealing with similar arguments
of different function symbols. For the source code
:- type t1 ---> f(int)
; g(int, int).
:- pred p1(t1::in, int::out) is det.
p1(f(Y), Y).
p1(g(Y, _), Y).
we now generate the C code
Define_entry(mercury__xdup__p1_2_0);
r1 = const_mask_field(r1, (Integer) 0);
proceed();
thus avoiding the cost of testing the function symbol.
runtime/mercury_tags.h:
Add two new macros, mask_field and const_mask_field, that behave
just like field and const_field except that instead of stripping
off a known tag from the pointer, they strip (mask) off an unknown
tag.
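The difference between the two kinds of macro can be sketched in plain C
as follows. This is only an illustration: the names sketch_mkword,
sketch_field and sketch_mask_field and the choice of two tag bits are
assumptions made for this example, not the definitions in mercury_tags.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag layout for this sketch: two low-order tag bits. */
#define SKETCH_TAG_BITS 2
#define SKETCH_TAG_MASK ((uintptr_t) ((1u << SKETCH_TAG_BITS) - 1))

/* Attach a primary tag to the address of a word-aligned cell. */
static uintptr_t sketch_mkword(uintptr_t tag, const intptr_t *cell)
{
    assert(tag <= SKETCH_TAG_MASK);
    assert(((uintptr_t) cell & SKETCH_TAG_MASK) == 0);
    return (uintptr_t) cell | tag;
}

/* Known tag: subtract it to recover the cell address, then index. */
static intptr_t sketch_field(uintptr_t tag, uintptr_t word, int i)
{
    return ((const intptr_t *) (word - tag))[i];
}

/* Unknown tag: mask off the tag bits, whatever they are, then index. */
static intptr_t sketch_mask_field(uintptr_t word, int i)
{
    return ((const intptr_t *) (word & ~SKETCH_TAG_MASK))[i];
}
```

With cells tagged 0 for f and 1 for g, sketch_mask_field(word, 0) fetches
the first argument of either without testing which function symbol it is.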
compiler/llds.m:
Change the first argument of the lval field/3 from tag to maybe(tag).
Make the comments on some types more readable.
compiler/llds_out.m:
If the first arg of the lval field/3 is no, emit a (const_)mask_field
macro; otherwise, emit a (const_)field macro.
compiler/basic_block.m:
New module to convert sequences of instructions to sequences of
basic blocks and vice versa. Used in the new dupelim.m.
compiler/dupelim.m:
Complete rewrite to give duplicate code elimination more teeth.
Whereas previously we eliminated blocks of code only if they exactly
duplicated other blocks of code, we now look for blocks that can be
"anti-unified". For example, the blocks
r1 = field(mktag(0), r2, 0)
goto L1
and
r1 = field(mktag(1), r2, 0)
<fall through to L1>
anti-unify, with the most specific common generalization being
r1 = mask_field(r2, 0)
goto L1
If several basic blocks anti-unify, we replace one copy with the
anti-unified block and try to eliminate the others. We do not
eliminate blocks that can be fallen into, since eliminating them
would require introducing a goto, which would slow the code down.
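A minimal C sketch of anti-unifying two field instructions, assuming a
much simplified instruction form "dest = field(tag, base, offset)". The
struct and function names are hypothetical and do not come from dupelim.m;
the real code works on whole basic blocks of LLDS instructions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stands for "tag unknown", i.e. the instruction needs mask_field. */
#define SKETCH_NO_TAG (-1)

/* Hypothetical simplified instruction: dest = field(tag, base, offset). */
struct sketch_instr {
    int dest_reg;
    int tag;        /* a known tag value, or SKETCH_NO_TAG */
    int base_reg;
    int offset;
};

/*
** Compute the most specific common generalization of a and b in *out.
** Two instructions anti-unify only if they differ at most in the tag;
** a differing tag is generalized to SKETCH_NO_TAG.
*/
static bool sketch_anti_unify(const struct sketch_instr *a,
    const struct sketch_instr *b, struct sketch_instr *out)
{
    assert(a && b && out);
    if (a->dest_reg != b->dest_reg || a->base_reg != b->base_reg
        || a->offset != b->offset)
    {
        return false;
    }
    *out = *a;
    if (a->tag != b->tag) {
        out->tag = SKETCH_NO_TAG;
    }
    return true;
}
```

Applied to the two blocks above, the tags 0 and 1 disagree, so the result
carries SKETCH_NO_TAG, which corresponds to r1 = mask_field(r2, 0).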
compiler/peephole.m:
If a conditional branch to a label is followed by that label or
by an unconditional branch to that label, eliminate the branch.
Dupelim produces this kind of code.
compiler/{code_exprn,exprn_aux,lookup_switch,opt_debug,unify_gen}.m:
Minor changes required by the change to field/3.
compiler/{frameopt,jumpopt,labelopt,mercury_compile,optimize,value_number}.m:
s/__main/_main/ in predicate names.
compiler/jumpopt.m:
Add some documentation.
compiler/unify_gen.m:
Fix a module qualified predicate name reference that would not
work in Prolog.
compiler/notes/compiler_design.html:
Document the new file basic_block.m.
Estimated hours taken: 40 (+ unknown time by Zoltan)
Add support for memory profiling.
(A significant part of this change is actually Zoltan's work. Zoltan
did the changes to the compiler and a first go at the changes to the
runtime and library. I rewrote much of Zoltan's changes to the runtime
and library, added support for the new options/grades, added code to
interface with mprof, did the changes to the profiler, and wrote the
documentation.)
[TODO: add test cases.]
NEWS:
Mention support for memory profiling.
runtime/mercury_heap_profile.h:
runtime/mercury_heap_profile.c:
New files. These contain code to record heap profiling information.
runtime/mercury_heap.h:
Add new macros incr_hp_msg(), tag_incr_hp_msg(),
incr_hp_atomic_msg(), and tag_incr_hp_atomic_msg().
These are like the non-`msg' versions, except that if
PROFILE_MEMORY is defined, they also call MR_record_allocation()
from mercury_heap_profile.h to record heap profiling information.
Also, fix up the indentation in lots of places.
runtime/mercury_prof.h:
runtime/mercury_prof.c:
Added code to dump out memory profiling information to files
`Prof.MemoryWords' and `Prof.MemoryCells' (for use by mprof).
Change the format of the `Prof.Counts' file so that the
first line says what it is counting, the units, and a scale
factor. Prof.MemoryWords and Prof.MemoryCells can thus have
exactly the same format as Prof.Counts.
Also cleaned up the interface to mercury_prof.c a bit, and did
various other minor cleanups -- indentation changes, changes to
use MR_ prefixes, additional comments, etc.
runtime/mercury_prof_mem.h:
runtime/mercury_prof_mem.c:
Rename prof_malloc() as MR_prof_malloc().
Rename prof_make() as MR_PROF_NEW() and add MR_PROF_NEW_ARRAY().
runtime/mercury_wrapper.h:
Minor modifications to reflect the new interface to mercury_prof.c.
runtime/mercury_wrapper.c:
runtime/mercury_label.c:
Rename the old `-p' (primary cache size) option as `-C'.
Add a new `-p' option to disable profiling.
runtime/Mmakefile:
Add mercury_heap_profile.[ch].
Put the list of files in alphabetical order.
Delete some obsolete stuff for supporting `.mod' files.
Mention that libmer_dll.h and libmer_globals.h are
produced by Makefile.DLLs.
runtime/mercury_imp.h:
Mention that libmer_dll.h is produced by Makefile.DLLs.
runtime/mercury_dummy.c:
Change a comment to refer to libmer_dll.h rather than
libmer_globals.h.
compiler/llds.m:
Add a new field to `create' and `incr_hp' instructions
holding the name of the type, for heap profiling.
compiler/unify_gen.m:
Initialize the new field of `create' instructions with
the appropriate type name.
compiler/llds_out.m:
Output incr_hp_msg() / tag_incr_hp_msg() instead of
incr_hp() / tag_incr_hp().
compiler/*.m:
Minor changes to most files in the compiler back-end to
accommodate the new field in `incr_hp' and `create' instructions.
library/io.m:
Add `io__report_full_memory_stats'.
library/benchmarking.m:
Add `report_full_memory_stats'. This uses the information saved
by runtime/mercury_heap_profile.{c,h} to print out a report
of memory usage by procedures and by types.
Also modify `report_stats' to print out some of that information.
compiler/mercury_compile.m:
If `--statistics' is enabled, call io__report_full_memory_stats
at the end of main/2. This will print out full memory statistics,
if the compiler was compiled with memory profiling enabled.
compiler/options.m:
compiler/handle_options.m:
runtime/mercury_grade.h:
scripts/ml.in:
scripts/mgnuc.in:
scripts/init_grade_options.sh-subr:
scripts/parse_grade_options.sh-subr:
Add new option `--memory-profiling' and new grade `.memprof'.
Add `--time-profiling' as a new synonym for `--profiling'.
Also add `--profile-memory' for more fine-grained control:
`--memory-profiling' implies both `--profile-memory' and
`--profile-calls'.
scripts/mprof_merge_runs:
Update to handle the new format of Prof.Counts and to
also merge Prof.MemoryWords and Prof.MemoryCells.
profiler/options.m:
profiler/mercury_profile.m:
Add new options `--profile memory-words' (`-m'),
`--profile memory-cells' (`-M') and `--profile time' (`-t').
These options make the profiler select a different count file,
Prof.MemoryWords or Prof.MemoryCells instead of Prof.Counts.
specific to time profiling.
profiler/read.m:
profiler/process_file.m:
profiler/prof_info.m:
profiler/generate_output.m:
Update to handle the new format of the counts file.
When reading the counts file, look at the first line of
the file to determine what is being profiled.
profiler/globals.m:
Add a new global variable `what_to_profile' that records
what is being profiled.
profiler/output.m:
Change the headings to reflect what is being profiled.
doc/user_guide.texi:
Document memory profiling.
Document new options.
doc/user_guide.texi:
compiler/options.m:
Comment out the documentation for `.proftime'/`--profile-time',
since doing time and call profiling separately doesn't work,
because the code addresses change when you recompile with a
different grade. Ditto for `.profmem'/`--profile-memory'.
Also comment out the documentation for
`.profcalls'/`--profile-calls', since it is redundant --
`.memprof' produces the same information and more.
configure.in:
Build a `.memprof' grade. (Hmm, should we do this only
if `--enable-all-grades' is specified?)
Don't ever build a `.profcalls' grade.
Estimated hours taken: 7
Fix a code generation bug that broke Tom's mediancut.m program.
compiler/code_exprn.m:
Change code_exprn__lval_in_use/3 to also check the registers marked
in use, not just the variables marked in use.
This avoids problems in the following case:
- a register R was marked in use, since it was the target
register we decided to use to compute variable X;
- before variable X was given status `evaled(R)' we had to
first produce some other variable Y that X depended on;
- Y was placed in register R (since that was not considered
in use);
- we then assigned X to R, because that was
the location we had decided to place X in,
thus clobbering Y;
- subsequently we used R, expecting it to hold Y.
This change increases the code size of the compiler on DEC Alpha
by 100k (2.7%). :-(
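The fixed check can be sketched in C as below. This is an illustration
under assumed data structures: the two boolean arrays and all sketch_
names are hypothetical and are not how code_exprn.m represents its state.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NUM_REGS 8

/* Hypothetical stand-in for the relevant part of the exprn_info. */
struct sketch_exprn_info {
    bool holds_var[SKETCH_NUM_REGS]; /* register holds some variable */
    bool reserved[SKETCH_NUM_REGS];  /* register chosen as a target,
                                        but not yet filled */
};

/* The fix: a register is in use if EITHER condition holds. */
static bool sketch_reg_in_use(const struct sketch_exprn_info *info, int reg)
{
    assert(reg >= 0 && reg < SKETCH_NUM_REGS);
    return info->holds_var[reg] || info->reserved[reg];
}

/* Return the lowest-numbered register not in use, or -1 if none. */
static int sketch_pick_free_reg(const struct sketch_exprn_info *info)
{
    for (int r = 0; r < SKETCH_NUM_REGS; r++) {
        if (!sketch_reg_in_use(info, r)) {
            return r;
        }
    }
    return -1;
}
```

If register 0 is reserved for X, a later request on behalf of Y gets
register 1 rather than 0, so assigning X to its target cannot clobber Y.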
Estimated hours taken: 14
Implemented a :- use_module directive. This is the same as
:- import_module, except all uses of the imported items
must be explicitly module qualified.
:- use_module is implemented by ensuring that unqualified versions
of items only get added to the HLDS symbol tables if they were imported
using import_module.
Indirectly imported items (from `.int2' files) and items declared in `.opt'
files are treated as if they were imported with use_module, since all uses
of them should be module qualified.
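The lookup rule can be sketched in C as follows. The flat item array, the
enumerator names and sketch_lookup are assumptions made for this example;
the HLDS symbol tables are organized quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rendering of the need_qualifier idea. */
enum sketch_need_qualifier {
    SKETCH_MAY_BE_UNQUALIFIED,  /* imported with import_module */
    SKETCH_MUST_BE_QUALIFIED    /* imported with use_module */
};

struct sketch_item {
    const char *module;
    const char *name;
    enum sketch_need_qualifier nq;
};

/*
** Look up name among n items. A NULL module means the use is
** unqualified, which matches only items that may be used unqualified.
** Returns NULL if nothing matches.
*/
static const struct sketch_item *sketch_lookup(
    const struct sketch_item *items, int n,
    const char *module, const char *name)
{
    assert(items && name);
    for (int i = 0; i < n; i++) {
        if (strcmp(items[i].name, name) != 0) {
            continue;
        }
        if (module != NULL) {
            if (strcmp(items[i].module, module) == 0) {
                return &items[i];
            }
        } else if (items[i].nq == SKETCH_MAY_BE_UNQUALIFIED) {
            return &items[i];
        }
    }
    return NULL;
}
```

An item brought in with use_module is thus invisible to unqualified
lookups but is still found when the module name is given.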
compiler/module_qual.m
Keep two sets of type, mode and inst ids, those which can
be used without qualifiers and those which can't.
Renamed some predicates which no longer have unique names since
'__' became a synonym for ':'.
Made mq_info_set_module_used check whether the current item is in
the interface, rather than relying on its caller to do the check.
Removed init_mq_info_module, since make_hlds.m now uses the
mq_info built during the module qualification pass.
compiler/prog_data.m
Added a pseudo-declaration `used', same as `imported' except uses of
the following items must be module qualified.
Added a type need_qualifier to describe whether uses of an item
need to be module qualified.
compiler/make_hlds.m
Keep with the import_status whether current item was imported
using a :- use_module directive.
Use the mq_info structure passed in instead of building a new one.
Ensure unqualified versions of constructors only get added to the
cons_table if they can be used without qualification.
compiler/hlds_module.m
Added an extra argument to predicate_table_insert of type
need_qualifier.
Only add predicates to the name and name-arity indices if they
can be used without qualifiers.
Changed the structure of the module-name-arity index, so that
lookups can be made without an arity, such as when type-checking
module qualified higher-order predicate constants. This does not
change the interface to the module_name_arity index.
Factored out some common code in predicate_table_insert which
applies to both predicates and functions.
compiler/hlds_pred.m
Removed the opt_decl import_status. It isn't needed any more
since all uses of items declared in .opt files must now be
module qualified.
Added some documentation about when the clauses_info is valid.
compiler/intermod.m
Ensure that predicate and function calls in the `.opt' file are
module qualified. Use use_module instead of import_module in
`.opt' files.
compiler/modules.m
Handle use_module directives.
Report a warning if both use_module and import_module declarations
exist for the same module.
compiler/mercury_compile.m
Collect inter-module optimization information before module
qualification, since it can't cause conflicts any more. This means
that the mq_info structure built in module_qual.m can be reused in
make_hlds.m, instead of building a new one.
compiler/prog_out.m
Add a predicate prog_out__write_module_list, which was moved
here from module_qual.m.
compiler/typecheck.m
Removed code to check that predicates declared in `.opt' files
were being used appropriately, since this is now handled by
use_module.
compiler/*.m
Added missing imports, mostly for prog_data and term.
NEWS
compiler/notes/todo.html
doc/reference_manual.texi
Document `:- use_module'.
tests/valid/intermod_lambda_test.m
tests/valid/intermod_lambda_test2.m
tests/invalid/errors.m
tests/invalid/errors2.m
Test cases.
Estimated hours taken: 3
Enable --warn-interface-imports by default. This was turned off while
list and term were defined in mercury_builtin.m, since it caused many
warnings.
Fix all the unused interface imports that have been added since then.
compiler/options.m:
Enable --warn-interface-imports by default.
compiler/module_qual.m:
Fix formatting inconsistencies with module names in warning
messages. (".m" was not appended to module names if there was
only one module).
compiler/*.m:
library/*.m:
tests/invalid/type_loop.m:
tests/warnings/*.m:
Remove unused interface imports, or move them into
implementation (mostly bool, list and std_util).
Estimated hours taken: 5
Slight rearrangement of the data structures of the code generator, to allow
more flexibility in code generation. The rearrangement moves the stack_slots
information (mapping vars to their stack slots if any) and follow_vars
information (mapping vars to the location preferred for them by future code)
from code_info to code_exprn. This allows the predicates in code_exprn to
make use of this information.
As a result of these changes, the code generator now emits 110 fewer lines
of C code for the compiler (478 lines are replaced by 368). There is no
discernible impact on the memory requirements or running time of the compiler
itself.
code_exprn:
Add the two fields to the exprn_info data structure.
Several predicates in code_exprn now evaluate variables directly
into their preferred location, instead of a random register.
code_info:
Remove the two fields from the code_info data structure.
Estimated hours taken: 3
Replace calls to map__set with calls to either map__det_insert or
map__det_update. In some cases this required a small amount of code
reorganization.
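The point of the replacement is that map__set accepts a key whether or
not it is already present, while map__det_insert aborts if the key is
present and map__det_update aborts if it is absent, so each call now
states what the caller expects. A rough C sketch of the two contracts,
using a hypothetical fixed-size map that is updated in place (the Mercury
predicates instead return a new map):

```c
#include <assert.h>

#define SKETCH_MAP_CAP 16

/* Hypothetical tiny int-to-int map, for illustration only. */
struct sketch_map {
    int keys[SKETCH_MAP_CAP];
    int vals[SKETCH_MAP_CAP];
    int len;
};

/* Return the index of key, or -1 if it is absent. */
static int sketch_find(const struct sketch_map *m, int key)
{
    for (int i = 0; i < m->len; i++) {
        if (m->keys[i] == key) {
            return i;
        }
    }
    return -1;
}

/* In the spirit of map__det_insert: the key must NOT be present. */
static void sketch_det_insert(struct sketch_map *m, int key, int val)
{
    assert(sketch_find(m, key) < 0);
    assert(m->len < SKETCH_MAP_CAP);
    m->keys[m->len] = key;
    m->vals[m->len] = val;
    m->len++;
}

/* In the spirit of map__det_update: the key MUST be present. */
static void sketch_det_update(struct sketch_map *m, int key, int val)
{
    int i = sketch_find(m, key);
    assert(i >= 0);
    m->vals[i] = val;
}
```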
Estimated hours taken: 5
Made some extensive additions to bag.m to include the standard set
operations (union, intersection, subtraction). Also added some other
useful predicates to operate on bags.
library/bag.m:
The following changes were made which will break any programs using
these predicates: in bag__contains/2, the order of the arguments was
swapped to make bag__contains the same as map__contains.
bag__remove was det, but is now semidet. bag__delete was added to
replace the old bag__remove. These changes make bag conform to the
same standard as set.m and map.m.
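The new split between bag__remove and bag__delete can be sketched in C as
below. The counting-array bag and the sketch_ names are hypothetical, and
the sketch assumes that the det operation leaves the bag unchanged when
the item is absent.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_BAG_ITEMS 8

/* Hypothetical bag of small integers: count[i] copies of item i. */
struct sketch_bag {
    int count[SKETCH_BAG_ITEMS];
};

static void sketch_bag_insert(struct sketch_bag *b, int item)
{
    assert(item >= 0 && item < SKETCH_BAG_ITEMS);
    b->count[item]++;
}

/* Like the new semidet bag__remove: fails if the item is absent. */
static bool sketch_bag_remove(struct sketch_bag *b, int item)
{
    assert(item >= 0 && item < SKETCH_BAG_ITEMS);
    if (b->count[item] == 0) {
        return false;
    }
    b->count[item]--;
    return true;
}

/* Like the det bag__delete: always succeeds (assumed: no-op if absent). */
static void sketch_bag_delete(struct sketch_bag *b, int item)
{
    (void) sketch_bag_remove(b, item);
}
```

Callers that relied on the old det behaviour of bag__remove correspond to
callers of sketch_bag_delete here.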
compiler/code_exprn.m:
This needed to be changed as it uses bag.m, and the changes to
bag.m stopped code_exprn.m from compiling.
Estimated hours taken: 7
Add support for taking the addresses of words on the heap as well as
on either stack. This will be used later to support tail recursion modulo
constructor application as well as parallelism.
The support provided is a first draft. Since nothing in the compiler
currently generates code that uses the new facilities, they have not been
tested yet beyond ensuring that they don't interfere with the old functionality
of the compiler.
llds:
Add a new type, mem_ref, that denotes a reference to a stackvar,
a framevar, or to a field of a cell on the heap.
Add a new function symbol to the type rval: mem_addr(mem_ref),
which represents the address of the word denoted by the mem_ref.
Add a new function symbol to the type lval: mem_ref(rval).
Given that Rval is an address, mem_ref(Rval) denotes the word
at that address. The value of Rval should have originally come from
a mem_addr(_) type rval, but that value could have been stored in
registers, stack slots etc since then.
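In C terms, the relationship between mem_addr and mem_ref is roughly the
one between taking an address and storing through it. The sketch below is
only an analogy under that assumption; sketch_word and both function
names are hypothetical and are not what llds_out emits.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t sketch_word;

/* Analogue of an rval mem_addr(...): the address of one field of a cell. */
static sketch_word *sketch_mem_addr(sketch_word *cell, int field)
{
    assert(cell != 0 && field >= 0);
    return &cell[field];
}

/*
** Analogue of using mem_ref(Rval) as an lval: Rval is an address that
** may have travelled through registers or stack slots as a plain word.
*/
static void sketch_assign_mem_ref(sketch_word rval, sketch_word value)
{
    *(sketch_word *) rval = value;
}
```

This is the shape that tail recursion modulo constructor application
needs: the cell is built first with one field left unfilled, the address
of that field is passed on, and the field is filled in later.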
code_exprn, code_info, dupelim, exprn_aux, garbage_out, livemap, llds_common,
llds_out, middle_rec, opt_debug, opt_util, vn_cost, vn_filter:
Added code to handle the new mem_ref type and the new alternatives
in lval and rval.
exprn_aux:
Make exprn_aux__substitute_lval_in_lval more thorough.
vn_type:
Add vn shadows of the new things in llds.
vn_flush, vn_order, vn_util:
Handle the new things in llds and/or their vn shadows.
Estimated hours taken: 3
code_gen, pragma_c_code:
Move the code that generates code for pragma_c_codes to a new module.
llds:
Change the representation of reg and temp lvals, in order to create
the concept of a "register type" and to reduce memory requirements.
Also add a comment indicating a possible future extension dealing with
model_non pragma_c_codes.
code_exprn, code_info:
Add the ability to request registers of a given type, or a specific
register, when acquiring registers.
bytecode, bytecode_gen, call_gen, dupelim, exprn_aux, follow_vars, frameopt,
garbage_out, jumpopt, llds_out, middle_rec, opt_debug, opt_util, store_alloc,
string_switch, tag_switch, unify_gen, vn_block, vn_cost, vn_filter, vn_flush,
vn_order, vn_temploc, vn_type, vn_util, vn_verify:
Small changes to accommodate the new register representation.
hlds_goal:
Add a comment indicating a possible future extension dealing with
model_non pragma_c_codes.
inlining:
Add a comment indicating how to deal with a possible future extension
dealing with model_non pragma_c_codes.
Estimated hours taken: _____
Take the code generator a big step closer to notes/ALLOCATION.
The new code generator emits code that is smaller and faster than
the code we used to emit.
Nondet liveness is no longer used; nondet live sets are always empty.
In code that was being modified anyway, remove its handling. Other
uses will be removed later (this keeps this change from being far too big;
as it is it is merely too big). Similarly for cont-lives.
In several places, clarify the code that gathers several code pieces together.
call_gen:
Unset the failure continuation and flush the resume vars to
their stack slots before nondet calls.
Move the code that decides whether a nondet call can be a tailcall
to code_info.
code_aux:
Remove the code to handle resume points, since these are now
handled in the specific constructs that need them. Replace it
with a sanity check.
code_exprn:
Add a predicate to place multiple vars.
code_gen:
Remove the predicate code_gen__generate_forced_goal, since it
packaged together some operations that should be executed at different
times.
Don't unset the failure continuation after every nondet goal;
this is now done in the constructs that need it.
Modify the handling of negation to use resume point info
according to notes/ALLOCATION.
Remove the predicate code_gen__ensure_vars_are_saved which was
used to save all live variables to the stack before nondet
disjunctions and if-then-elses; we don't do that anymore.
code_info:
Significantly simplify and document the handling of failure
continuations, and make the types involved abstract types.
Factor out common code in the handling of det and semi commits.
Keep track of "zombies", variables that are dead wrt forward
execution but whose values we need because they may be needed
at a resume point we can reach.
Remove several now unneeded predicates, and introduce new
predicates to help other modules.
code_util:
Add a couple of predicates to check whether a goal cannot fail before
flushing all variables to the stack, and whether a goal cannot flush
any variables to the stack. These are used in liveness to decide
which entry labels will be needed at resume points.
disj_gen:
Unify the handling of det and semi disjunctions. Model the code
handling of nondet disjunctions on the code handling pruned
disjunctions. It is possible that the handling of nondet and pruned
disjunctions can also be unified; the new code should make this
significantly easier.
Make the code conform to notes/ALLOCATION. This means saving
only the variables mentioned in the resume_point field, not
flushing all live variables to the stack at the start of a
nondet disjunction, handling zombies, and using the new method
of flushing variables at the ends of branched structures.
ite_gen:
Unify the handling of det and semi if-then-elses. Model the code
handling of nondet if-then-elses on the code handling det/semi
if-then-elses. It is possible that the handling of nondet and pruned
if-then-elses can also be unified; the new code should make this
significantly easier.
Make the code conform to notes/ALLOCATION. This means saving
only the variables mentioned in the resume_point field, not
flushing all live variables to the stack at the start of a
nondet if-then-else, handling zombies, and using the new method
of flushing variables at the ends of branched structures.
Apply the new rules about liveness in if-then-elses, which say that
the else part is parallel not to the then part but to the conjunction
of the condition and the then part.
dense_switch, lookup_switch, string_switch, switch_gen, tag_switch, middle_rec:
Use the new method of flushing variables at the ends of branched
structures. Don't call remake_with_store_map; switch_gen will do so.
Fix an old bug in lookup_switch.
The code in switch_gen which looked for the special case of a two-way
switch used to use a heuristic to decide which one was recursive and
which one was a base case. We now check the codes of the cases.
hlds_goal:
Adjust the structure of the resume_point field to make it easier
to use. Add a more convenient access predicate.
hlds_out:
Don't print the nondet liveness and cont live fields, since they are
not used anymore. Comment out the printing of the context field,
which is rarely useful. Modify the printing of the resume_point field
to conform to its new definition.
live_vars:
Use the resume_point field, not the nondetlives field, to decide
which variables may be needed on backward execution. Remove some
code copied from liveness.m.
liveness:
Put the several pieces of information we thread through the traversal
predicates into a single tuple.
Don't put variables which are local to one branch of a branched
structure into the post-birth sets of other branches.
Apply the new rules about liveness in if-then-elses, which say that
the else part is parallel not to the then part but to the conjunction
of the condition and the then part. Variables that are needed in the
else part but not in the condition or the then part now die at the
start of the condition (they will be protected by the resume point on
the condition).
We now treat pruned and non-pruned disjunctions the same way
wrt deadness; the old way was too conservative (it had to be).
We still mishandle branches which produce some variables but
can't succeed.
mercury_compile:
Liveness now prints its own progress message with -V; support this.
store_alloc:
When figuring out what variables need to be saved across calls,
make sure that we put in interference arcs between those variables
and those that are required by enclosing resume points.
Don't compute cont-lives, since they are not used anymore.
livemap:
Fix the starting comment.
Estimated hours taken: 0.5
Get rid of unnecessary placement of variables in registers at failure
continuations. The change reduces the size of the code of the compiler
on Alphas by 100 Kb (3%).
code_exprn:
Add a predicate to produce a variable either in a register
or in a stack slot.
code_info:
Call this predicate (instead of another that produces variables
into registers only) when we are flushing the values of the variables
whose values will be needed at a resumption point.
Estimated hours taken: 6
Another step towards implementing notes/ALLOCATION.
hlds_goal:
Define a new type that gives the information associated with
resumption points. Add a field of this type to goal_info, and the
associated operations.
Change the layout of the access predicates to make the code_info
structure easier to modify in the future.
liveness:
Fill in the resume_point fields of goals that establish
resumption points.
hlds_out:
Print the new resume_point field of goal_info.
Allow the printing of base_type_infos on the right hand sides
of unifications.
code_info:
Reuse the stack of store maps field, which we don't use anymore,
to store the stack of resumption-point variable sets. Add operations
on this new stack.
Remove the slot that holds the code model of the current procedure,
since it is used rarely and can be easily looked up in the proc_info,
which we also store. Reuse the slot to store the maximum number of
stack pushes, eliminating the pair used to store it previously.
Change the layout of the access predicates to make the code_info
structure easier to modify in the future.
code_gen:
Don't pass the code model to code_info__init.
code_exprn:
Make an abort message more specific.
hlds_pred:
Fix typos.
Estimated hours taken: 2.5
Switch from using a stack of store_maps in the code_info to govern what
goes where at the end of each branched structure to using the store map
fields of the goal expressions of those structures.
Fix variable names where they resembled the wrong kind of map(var, lval).
code_info:
Remove the operations on stacks of store maps.
Modify the generate_forced_saves and remake_with_store_map operations
to take a store_map parameter.
When making variables magically live, pick random unused variables
to hold them, since we can no longer use the guidance of the top
store map stack entry. This may lead to the generation of some
excess move instructions at non-reachable points in the code;
this will be fixed later.
code_gen:
Remove the store map push and pop invocations.
Modify the generate_forced_goal operation to take a store_map parameter.
code_exprn:
Export a predicate for use by code_info.
middle_rec, disj_gen, ite_gen, switch_gen,
dense_switch, lookup_switch, string_switch, tag_switch:
Pass the store map around to get it to invocations of the primitives
in code_gen and code_info that now need it.
goal_util:
Name apart the new follow_vars field in hlds__goal_infos.
(This should have been in the change that introduced that field.)
common, constraint, cse_detection, det_analysis, dnf, excess, follow_code,
intermod, lambda, lco, liveness, make_hlds, mode_util, modes, polymorphism,
quantification, simplify, switch_detection, typecheck, unique_modes,
unused_args:
Fix variable names.
follow_vars, store_alloc:
Add comments.
Estimated hours taken: 0.25
mercury/compiler/code_exprn.m:
Fix the new predicate introduced in the last change:
code_exprn__materialize_vars_in_rval
which was removing and adding register dependencies to
the exprn_info structure, and it shouldn't have been.
Estimated hours taken: 3
These changes have 2 parts:
* Fix a bug in unify_gen triggered by Zoltan's fix to a
bug triggered by dmo's graphics project and add a test
case for it.
* Fix a couple of small bugs in the testing procedures where
they required you to have . in your path.
mercury/compiler/code_exprn.m,
mercury/compiler/code_info.m:
add the predicate
code_{exprn,info}__materialize_vars_in_rval/5
which generates code to materialize the vars in an rval and
updates the exprn info appropriately.
This predicate was added because it is needed for generating
the sub-unifications where a deconstruction has assignments
into the term (ie field(....) = var; rather than the other
way around).
mercury/compiler/exprn_aux.m:
export exprn_aux__vars_in_rval.
mercury/compiler/follow_vars.m:
fix a singleton variable warning.
mercury/compiler/unify_gen.m:
When generating code to assign into the fields of a term
within a deconstruction, materialize any variables in the
field expression (into which you are going to assign)
before doing the assignment. Before this fix, the code generator
was emitting code that contained var(M), which with certain
combinations of opt flags was causing an abort in llds_out.
tests/runtests:
tiny bugfix so that runtests works for people who don't have `.'
in their path.
tests/valid/Mmake:
enable `two_way_unif' which tests the bugfix in unify_gen shown above
mercury/tools/bootcheck:
tiny bugfix so that runtests works for people who don't have `.'
in their path.
Estimated hours taken: 24
A bunch of changes required to fix problems in code generation for
model_det and model_semi disjunctions.
simplify.m:
Don't convert all model_det and model_semi disjunctions into
if-then-elses, because that doesn't work if the disjuncts
have output variables, which can happen (e.g. with cc_nondet
disjunctions)
disj_gen.m:
Fix a bug in the code generation for semidet disjunctions:
don't forget to jump to the end of the disjunction after
each disjunct!
liveness.m, live_vars.m, store_alloc.m, disj_gen.m:
Treat backtracking in model_det and model_semi disjunctions
as shallow backtracking rather than deep backtracking.
This means that rather than pushing all live variables
onto the stack at the start of a model_det/semi disjunction,
and using the nondet_lives to keep track of them, we instead
treat these disjunctions a bit more like an if-then-else and
use the ordinary liveness/deadness to keep track of them.
code_aux.m:
Change code_aux__pre_goal_update so that it only applies
the post-deaths if the goal is atomic. Applying the
*post*-deaths to the set of live variables in the *pre*-goal
update only makes sense for atomic goals.
(I think previously we only ever generated post-deaths
for atomic goals, but now we generate them also for
goals inside model_det or model_semi disjunctions.)
code_gen.pp, middle_rec.m:
Pass an is-atomic flag to code_aux__pre_goal_update.
hlds_goal.m:
Add some comments.
goal_util.m:
Fix bugs in goal_util__name_apart_goalinfo.
It wasn't applying the substitution to all the
appropriate fields.
code_exprn.m:
Improve the error message for one of the internal errors.
hlds_out.m:
Print the stack slot allocations in the HLDS dump again.
Estimated hours taken: 2
llds.m:
Add a boolean argument to the create rval, which should be set to true
if the cell created must have a unique reference.
vn_type.m:
Add a corresponding argument to vn_create.
others:
Fix references to creates and vn_creates.
Estimated hours taken: 4
code_exprn.m:
Use place_arg instead of construct_code to emit the code for the
arguments of creates. This is a win, because construct_code does
not check whether the value it is constructing is a static term,
while place_arg does. To enable place_arg to generate good code
when used like this, it now has an extra argument that allows
the caller to say where it would like the value to go.
Estimated hours taken: 20
det_analysis:
Make sure we don't change the goal being analyzed except possibly
for the introduction of `some's (which should not hurt anything).
Make sure we don't print any error messages except in the final
iteration, when all the inputs to the inference are stable.
If the --debug-detism option is set, print messages about the
progress of inference and checking.
Also moved some code around.
det_report:
Distinguish the handling of warning messages and error messages.
simplify:
Use the new ability of det_report to separate warnings and errors.
passes_aux:
Add a new generic pass form, for use by simplify.
option:
Add --debug-detism (as above), --aditi, which at the moment
only enables the disjunctive normal form transformation, and
--inlining/--no-inlining, which set the other three flags
involved in inlining depending on whether you want standard
inlining or none at all.
Follow_code used to be set twice and follow_vars not at all;
I fixed this.
Reenabled optimize_higher_order at -O3.
Moved value numbering to -O4 and pred_value_number to -O5.
This makes it easier to separate value numbering from the
other optimizations (which are likely to be more effective).
Divided options_help into sections to avoid excessive
compilation times.
store_alloc:
Base the store map on the follow_vars info attached to the
branched structure which I added recently, and not on the
follow_vars map being passed around, since it will be more accurate.
hlds_out:
Print information about follow_vars and store_maps when -D is given.
follow_code:
Undo an old hack that the change to follow_vars has made counterproductive.
middle_rec:
Fix a bug uncovered by the change to follow_code. When looking for a
register to hold the counter, it is not enough to avoid picking a
register that appears in the recursive case; we must also avoid
registers that occur only in the base case.
livemap:
Mentioning the code address succip now causes the succip to be
considered live. This may or may not fix the bug with pred_value_number
miscompiling unused_args.m; the other changes have altered the input
to value numbering, and that input no longer triggers the problem.
(Will try to test this later.)
mercury_compile:
Try to make sure that we print statistics only after passes that
were actually executed. Also, reduce the number of lookups of the
verbose option. Move some predicates so that the order of their
appearance matches the current order of invocation.
vn_table:
Loosen a sanity check to let xnuc2 pass through it.
code_exprn, switch_detection:
Minor changes.
Estimated hours taken: 6
compiler/{code_gen.pp,code_info.m,code_exprn.m}:
When generating semidet pragma c_codes, make sure to shuffle r1
out of the way in case its value is needed after the pragma.
Estimated hours taken: 2
Do some more work on improving floating-point performance:
emit boxed floating point constants as static ground terms.
options.m:
Add new option --unboxed-float.
exprn_aux.m:
Add --unboxed-float to the `exprn_opts' that affect whether
or not things can be static constants. If --unboxed-float
is not set, and --static-ground-terms is, then consider
float_consts to be constant.
code_exprn.m, lookup_switch.m:
Trivial changes to handle new arity of exprn_opts type.
llds.m:
If --unboxed-float is not set, and --static-ground-terms is, then
output `static const Float mercury_float_const_...' declarations
for float_consts.
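
The rule stated above for float_consts can be written as a single
condition. The following is only a sketch in Python; the function name
and the boolean parameters are hypothetical stand-ins for the two options,
not the actual exprn_aux code.

```python
def float_const_is_static(unboxed_float: bool, static_ground_terms: bool) -> bool:
    """Hypothetical helper: mirrors the rule in the log entry.

    A float constant is treated as a static constant only when
    --unboxed-float is NOT set and --static-ground-terms IS set.
    """
    return static_ground_terms and not unboxed_float
```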
Estimated hours taken: 1
code_exprn:
Avoid creating unnecessary shuffling operations. Specifically,
if a register (say r1) is live, and we want to put a value
into it, we used to generate a sequence such as:
r2 = r1;
r1 = <some rval>;
Very often the original value of r1 is needed *only* in <some rval>.
We now generate this bad code as before, but then check whether
there are any live variables whose values *require* r2 (as opposed
to having one of their several copies accessible via r2). If not,
we remove the register copy.
Most of the work is done by the auxiliary predicates introduced
in the previous checkin.
This change reduces the size of the compiler by 65 Kb, almost 3%.
This is with standard optimization. Since this optimization removes
code that is also removed by value numbering, any gain in the
size of fully optimized code will be minimal.
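
The check described above (remove the register copy unless some live
variable *requires* the copy's destination) can be sketched as follows.
This is an illustrative Python model, not the code_exprn implementation:
the tuple representation of assignments, the function name and the
dictionary of variable locations are all assumptions made for the example,
and the sketch assumes <some rval> reads r1 directly rather than r2.

```python
def prune_shuffle(instrs, copy_reg, live_var_locations):
    """Drop the copy into copy_reg if no live variable depends on it.

    instrs: list of (dest, src) assignments, e.g. [("r2", "r1"), ...]
            (hypothetical representation).
    copy_reg: the register that received the saved value (r2 above).
    live_var_locations: maps each variable that is live AFTER the
            sequence to the set of locations holding a copy of it.
    """
    # A variable "requires" copy_reg only if that is its sole location;
    # having one of several copies there does not count.
    required = any(locs == {copy_reg} for locs in live_var_locations.values())
    if required:
        return list(instrs)
    return [(dest, src) for (dest, src) in instrs if dest != copy_reg]
```

For example, with the sequence `r2 = r1; r1 = <some rval>`, the copy is
kept if a live variable is held only in r2, and dropped if that variable
is also available elsewhere (say in a stack slot) or is no longer live.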
Estimated hours taken: 4
code_exprn:
Made a start towards getting better code generated for nested creates
and towards getting rid of useless "shuffle lval" instructions.
Also, some minor cleanup.
exprn_aux:
Add some auxiliary predicates for the new code_exprn.
delay_info:
Remove a useless import of hlds, which is now empty.
Estimated hours taken: 1.5
code_exprn:
Distribute the initial comments among the declarations of the exported
predicates. This makes it much less likely that the declarations will
be modified without changes in the comments. Since this has happened
in the past, some predicates are now without comments.
Changed code_exprn__place_var to prefer to get even a constant term
from a location if it has been produced before, and factor out some
code that is shared between the handling of cached and evaled
expressions.
code_exprn, code_info:
Removed an unnecessary argument from code_exprn__get_varlocs.
dead_proc_elim:
Changed the predicate name prefix from dead__ to dead_proc_elim__
to conform to notes/CODING_STANDARDS.
handle_options:
Remove an inappropriate comment.
jumpopt:
Filter out redundant livevals whether --optimize-fulljumps is given
or not. (I thought they weren't created if the option isn't given,
but they are.)
options:
Change the meaning of -O from --c-optimize to --opt-level.
Disabled unused args until the bug is fixed.
Estimated hours taken: 2
code_exprn:
When we are processing the flushing of create expressions, make sure
the Lval we are creating into isn't a field reference. This avoids
deep field of field of field of ... nesting. It does introduce
references to high register numbers, but this is a lesser evil,
and Tom and I plan to fix this anyway.
arg_info, globals, options:
Change --args old to --args simple.
options:
Make some help messages more specific.
code_aux, code_exprn, code_info, det_report, make_hlds, mercury_to_goedel,
prog_io, typecheck:
Changes to accommodate the move from varset__lookup_name
to varset__search_name.
Estimated hours taken: 2
exprn_aux:
Both code_exprn and lookup_switch had code to check whether an
expression is constant or not. Some of the code is different
due to different handling of variables in rvals, but exprn_aux
now contains the common subset.
This common subset used to treat some address constants incorrectly,
simply by not considering them; they are now considered and treated
properly.
code_exprn, lookup_switch, exprn_aux:
Remove redundant option lookups in the process of checking for
constant expressions.
code_exprn:
Other minor cleanups, including removal of a block of code Tom
says was "deep magic" (but which turns out to be unnecessary).
code_info:
Removed some dead code.
options:
Added real support for --opt-level, in the form of a table of
default values of options for each optimization level between
0 and 5 (both inclusive). This needs a new form of documentation.
How do you do tables in texinfo?
Estimated hours taken: 6
mercury_to_mercury:
Wrap parentheses around pred insts, since they are needed.
value_number, vn_verify:
Value numbering now reapplies itself to both halves of a block if
it cannot optimize the block as a whole.
Split the verification code into its own module, and fix line lengths.
vn_order:
Fix the computation of the label at which blocks should be divided.
Fix line lengths.
vn_debug:
Add a message to support the new block dividing capability.
code_exprn:
Redirect option lookup operations from options to getopt.
passes_aux:
Prepare for some further changes.
prog_io:
Formatting changes.
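
The block-dividing behaviour described for value_number above can be
sketched like this. It is a rough Python model only: the function names
are hypothetical, the split point here is simply the midpoint (whereas
vn_order computes a label at which to divide), and recursing on the
halves is an assumption about what "reapplies itself" means.

```python
def optimize_with_splitting(block, try_optimize):
    """Try to optimize a block; on failure, divide it and retry each part.

    block: list of instructions (any values).
    try_optimize: hypothetical callback returning the optimized list,
                  or None if the block cannot be optimized as a whole.
    """
    result = try_optimize(block)
    if result is not None:
        return result
    if len(block) <= 1:
        # Nothing left to divide: keep the instruction(s) unchanged.
        return list(block)
    mid = len(block) // 2  # assumption: midpoint instead of a chosen label
    return (optimize_with_splitting(block[:mid], try_optimize)
            + optimize_with_splitting(block[mid:], try_optimize))
```

The point of the design is that a failure on the whole block no longer
forfeits the optimization of the parts that could be handled on their own.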
Estimated hours taken: 1.5
Undo dylan's changes in the names of some library entities,
by applying the following sed script
s/term_atom/term__atom/g
s/term_string/term__string/g
s/term_integer/term__integer/g
s/term_float/term__float/g
s/term_context/term__context/g
s/term_functor/term__functor/g
s/term_variable/term__variable/g
s/_term__/_term_/g
s/std_util__bool_/bool__/g
to all the `.m' and `.pp' files in the compiler and library directories.
The reason for undoing these changes was to minimize incompatibilities
with 0.4 (and besides, the changes were not a really good idea in the first
place).
I also moved `bool' to a separate module.
The main reason for that change is to ensure that the `__' prefix is
only used when it genuinely represents a module qualifier.
(That's what dylan's changes were trying to achieve, but `term__'
does genuinely represent a module qualifier.)
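
The effect of the sed script above on a piece of text can be modelled
in Python as an ordered list of literal substitutions. This is a sketch
for illustration only; the names SUBSTITUTIONS and apply_renames are
hypothetical, and the actual change was made with sed. The order matters:
the `_term__' rule runs after the `term_*' rules and appears to be there
to repair identifiers in which `term_' was preceded by an underscore.

```python
# The (old, new) pairs are copied from the sed script, in the same order.
SUBSTITUTIONS = [
    ("term_atom", "term__atom"),
    ("term_string", "term__string"),
    ("term_integer", "term__integer"),
    ("term_float", "term__float"),
    ("term_context", "term__context"),
    ("term_functor", "term__functor"),
    ("term_variable", "term__variable"),
    ("_term__", "_term_"),
    ("std_util__bool_", "bool__"),
]


def apply_renames(text: str) -> str:
    """Apply each substitution globally, one after another.

    None of the patterns contains regex metacharacters, so plain
    str.replace behaves like the corresponding s/old/new/g command.
    """
    for old, new in SUBSTITUTIONS:
        text = text.replace(old, new)
    return text
```

Running apply_renames a second time on its own output leaves names such
as `term__atom' unchanged, since the old patterns no longer match.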
compiler/*.m:
Apply sed script above;
where appropriate, add `bool' to the list of imported modules.
code_exprn.m:
Improve error message for one of the internal errors.
det_analysis.m:
Make sure we don't generate unnecessary nested `some' goals.
prog_io.m, inst_match.m, mode_util.m, mercury_to_mercury.m:
Add new insts `mostly_unique' and `mostly_clobbered', with
semantics similar to `unique' and `clobbered', except that
mostly-unique variables might be needed on backtracking.
unique_modes.m:
A good start on the code to check that `unique' modes
are not used for nondet live variables. Still incomplete,
but just about all the code is there except the code to
actually compute the set of nondet live variables as you
traverse the goal.