Estimated hours taken: 0.5 hours for the fix, 8 hours debugging
(plus a similar amount of Zoltan's time debugging)
string_switch.m:
Fix a code generation bug: it called
code_info__generate_failure from the wrong spot, and so used
the wrong exprn_info. This meant that the code generated
by generate_failure contained incorrect register shuffling,
since it thought the variables were in different locations
to where they really were.
Estimated hours taken: 0.25
compiler/prog_io.m:
Export parse_some_vars_goal, for use by make_hlds.m, for
my recent change to support if-then-else expressions.
Estimated hours taken: 1
options:
Divide --inlining into --inline-simple, for inlining all procedures
with simple definitions (the current practice), --inline-single-use
for inlining all procedures called exactly once, and --inline-threshold
for specifying an upper bound on the product of the number of calls
and the size of the procedure definition (roughly the number of
connectives).
The --inline-single-use option is off by default until the problem with
parse_dcg_goal_2 is fixed.
inlining:
Implement the new options.
goal_util:
Added a predicate for computing the size of a goal.
mercury_compile:
Call inlining if any one of three options is set.
call_gen:
Remove an obsolete comment (all of three hours old :-)
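The three inlining options described in this entry can be read as a single decision rule. The following Python sketch is only an illustration of that rule, not the code in inlining.m: the ProcInfo record, its field names, and the convention that a threshold of 0 disables the size test are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ProcInfo:
    """Hypothetical summary of one procedure (not a compiler data structure)."""
    name: str
    num_calls: int   # number of call sites of the procedure
    goal_size: int   # rough size of the body (about the number of connectives)
    is_simple: bool  # whether the definition counts as "simple"


def should_inline(proc, inline_simple=True, inline_single_use=False,
                  inline_threshold=0):
    """Return True if any of the three enabled options selects proc."""
    # --inline-simple: procedures with simple definitions.
    if inline_simple and proc.is_simple:
        return True
    # --inline-single-use: procedures called exactly once.
    if inline_single_use and proc.num_calls == 1:
        return True
    # --inline-threshold: bound on (number of calls) * (size of definition).
    # Treating 0 as "disabled" is an assumption of this sketch.
    if inline_threshold > 0:
        return proc.num_calls * proc.goal_size <= inline_threshold
    return False
```

For example, a procedure with 3 call sites and a body of size 4 has a product of 12, so it would be selected by a threshold of 12 but not by a threshold of 11.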
Estimated hours taken: 1
compiler/make_hlds.m:
Allow if-then-else as an expression, as well as allowing it in
goals. This is to keep the functional programming crowd happy ;-).
Estimated hours taken: 1.5
Split llds into two parts. llds.m defines the data types, while llds_out.m
has the predicates for printing the code.
Removed the call_closure instruction. Instead, we use calls to the
system-defined addresses do_call_{det,semidet,nondet}_closure. This is
how call_closure was implemented already. The advantage of the new
implementation is that it allows jump optimization of what used to be
call_closures, without new code in jumpopt.
Estimated hours taken: 8
Bug fixes for higher_order.m and unused_args.m
NEWS
Removed the message about bugs in unused_args.m and higher_order.m
compiler/options.m
Re-enabled higher_order and unused_args.
compiler/unused_args.m
Fixed so that this now handles partially instantiated
deconstructions correctly.
compiler/higher_order.m
Two bug fixes:
Specialization of types for specialized versions of predicates.
Fixed handling of curried arguments.
compiler/inlining.m, compiler/type_util.m:
Moved inlining:apply_substitution_to_type_map and
inlining:apply_rec_substitution_to_type_map to type_util.m
for use in the higher_order.m bug fix.
library/varset.m
Added predicate varset__new_vars which returns a list of new
variables.
library/term.m
Added predicates term__apply_variable_renaming(_to_list)
to apply a variable renaming (map(var, var)) to a term
or list of terms.
library/map.m
Added map__det_insert_from_corresponding_lists to insert
multiple key-value pairs into a map.
tests/valid/{Mmake, higher_order2.m, higher_order3.m, unused_args_test2.m}
Tests for the bug fixes.
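As a rough illustration of what map__det_insert_from_corresponding_lists provides, here is a Python analogue. It is a sketch only: it assumes that the "det" variant aborts when a key is already present or when the two lists differ in length, and it models the map as a Python dict.

```python
def det_insert_from_corresponding_lists(mapping, keys, values):
    """Return a copy of mapping with keys[i] -> values[i] inserted for each i.

    Raises instead of overwriting, mirroring the assumed "det" behaviour.
    """
    if len(keys) != len(values):
        raise ValueError("key and value lists differ in length")
    result = dict(mapping)  # leave the caller's map untouched
    for key, value in zip(keys, values):
        if key in result:
            raise KeyError(f"key already present: {key!r}")
        result[key] = value
    return result
```

Returning a new dict rather than updating in place keeps the sketch close to the input/output map arguments a Mercury predicate would take.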
Estimated hours taken: 8
options.m:
Rename branch_delay_slot to have_delay_slot.
Set optimize_delay_slot in -O2 only if have_delay_slot was set earlier.
This is possible now because the default optimization level is now
set in mc.
mercury_compile:
Change verbose output a bit to be more consistent.
dead_proc_elim:
Export the predicates that will eventually be needed by inlining.m.
inlining.m:
Use the information about the number of times each procedure is called
to inline local nonrecursive procedures that are called exactly once.
EXCEPT that this is turned off at the moment, since the inlining of
parse_dcg_goal_2 in prog_io, which this change enables, causes the
compiler to emit incorrect code.
prog_io:
Moved the data type definitions to prog_data. (Even though prog_io.m
is ten times the size of prog_data.m, the sizes of the .c files are
not too dissimilar.)
*/Mmake:
Make sure that the rules for `mmake clean' and `mmake realclean'
remove a few files that we'd missed.
Change the rules for making tags so that it uses the local
version of `mtags' (i.e. ../scripts/mtags) rather than the
installed one.
Estimated hours taken: 30 (?)
It's...
The C to Mercury Interface.
The following changes provide a C to Mercury interface. By making a declaration
such as
:- pragma(export, foo(in, in, out), "FunctionName").
you will be able to call the C function "FunctionName" from C. The arguments
are the same as the Mercury arguments, with outputs passed as pointers.
XXX We don't handle floats or strings properly.
A function prototype is output into <modulename>.h
Execution still has to start in Mercury.
Something went wrong with CVS when I tried to abort a commit just then, and it
thinks I've already committed some files... but here's a description of all
my changes anyway:
compiler/garbage_out.m:
Ignore c_export c_modules.
compiler/hlds_module.m:
Add an annotation to the hlds, indicating which procs are exported
to C.
compiler/llds.m:
Change the way labels are emitted - instead of emitting the label
directly, it first generates a string and then prints the string. This
is useful because I only want the string (to print later). The
preds which return a string are now part of the interface.
compiler/make_hlds.m:
Take note of which procs are to be exported to C.
compiler/mercury_compile.pp:
Generate a <module>.h file if necessary.
compiler/mercury_to_mercury.m:
Spit out :- pragma(export, ...) decs.
compiler/prog_io.m:
Read in :- pragma(export, ...) decs.
compiler/export.m:
Handle the outputting of C exports. This includes the generation of
the .h files and the generation of the C functions.
Estimated hours taken: 0.1
compiler/dead_proc_elim.m:
Fix a problem in the call to write_progress_message: it shouldn't
include the word `predicate' in the message passed, because this
leads to messages like
`eliminating procedure of predicatepredicate foo:bar/n...' or
`eliminating procedure of predicatefunction foo:bar/n...'.
Estimated hours taken: 1.5
dead_proc_elim:
Count the number of references to each predicate if that predicate
is a candidate for inlining.
options:
Enable --optimize-delay-slots for -O2 only if the machine architecture
actually has branch delay slots.
inlining, modules:
Fix the copyright notice.
Estimated hours taken: 0.3
jumpopt:
If the label branched to in an if_val is followed by a goto,
short-circuit the conditional branch. This reduces code size
on the Alpha by 147 Kb.
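The short-circuiting described in this entry can be sketched as follows. This Python fragment is an illustration, not the jumpopt code: instructions are modelled as tuples such as ("if_val", condition, label), ("label", name) and ("goto", label), and that representation is an assumption of the example. Only a single step is followed, as in the description above.

```python
def short_circuit_if_vals(instrs):
    """Retarget each if_val whose target label is directly followed by a goto."""
    # Map every label to the instruction that immediately follows it.
    follows = {}
    for i in range(len(instrs) - 1):
        if instrs[i][0] == "label":
            follows[instrs[i][1]] = instrs[i + 1]

    result = []
    for instr in instrs:
        if instr[0] == "if_val":
            _, cond, target = instr
            nxt = follows.get(target)
            if nxt is not None and nxt[0] == "goto":
                # Branch straight to the goto's destination.
                target = nxt[1]
            result.append(("if_val", cond, target))
        else:
            result.append(instr)
    return result
```

The label and its goto are left in place, since other instructions may still branch to them.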
Estimated hours taken: 3
options:
Add a new option, --branch-delay-slot, intended for use by mc on
the basis of the configuration script. It says whether the machine
architecture has delay slots on branches.
The setting of this option should affect whether we set
--optimize-delay-slots at -O2, but this doesn't work yet.
hlds_goal:
Add an extra field to hold follow_vars information to disjunctions,
switches and if-then-elses. I intend to use this information to
generate better code.
*.m:
Changes to accommodate the extra field.
Estimated hours taken: 1
compiler/typecheck.m:
Improve the error message for type ambiguities that are due
to the same predicate being defined in more than one module.
Estimated hours taken: 16
options:
Replace the word_size option with the two options bits_per_word and
bytes_per_word. The former is needed by lookup_switch, the latter by
value numbering.
lookup_switch:
Use the new option instead of word_size.
vn_type, vn_cost, vn_block, value_number:
Add a new type, vn_params, containing information such as the number
of bytes per word (from the option) and cost parameters. Use these
cost parameters to make more realistic decisions.
vn_filter:
New module to filter out unnecessary uses of temporary variables,
which gcc does unnecessarily badly on.
value_number, vn_verify:
Move verification completely to vn_verify. Tighten the verification
rules relating to tags where it concerns code sequences in which
the tag of an rval is taken in a statement before an if_val, but
loosen them to avoid spurious rejections of code sequences containing
arithmetic comparisons. Fix some missing cases from semidet switches
that may have led to overly conservative decisions.
value_number, vn_order:
Vn_order was making an overly conservative assumption about where
to split an extended basic block if it couldn't be optimized together.
Move the decision to value_number and try to make it better. The new
heuristic is not enabled yet.
vn_debug:
Change the conditions under which one type of message is printed.
vn_flush:
Wrap some too long lines.
llds:
Fix a bug that would prevent profiling from working correctly on
value numbered code: we weren't scanning instructions inside blocks
when looking for return addresses.
peephole:
Enable an optimization previously left disabled by accident.
switch_detection, tag_switch:
Eliminate an unused argument.
Estimated hours taken: 3
compiler/options.m:
Changed to accommodate the recent change to getopt.m.
Added some new options:
--reorder-conj, --reorder-disj, --fully-strict;
--strict-sequential (== previous three);
--reclaim-heap-on-failure (== reclaim-heap-on-semidet-failure
plus reclaim-heap-on-nondet-failure);
--everything-in-one-c-function (== --procs-per-c-function 0)
Reorganized the handling of --opt-level to avoid duplication in
the table of optimization levels. Added new optimization levels -1
and 6, documented the meaning of each option level, and changed
the options set by the various optimization levels.
XXX TODO: update the user documentation to reflect the above changes.
Estimated hours taken: 1
compiler/inlining.m:
Remove an overly conservative sanity check introduced in my previous
change to inlining.m. It can't check that the HLDS is type-correct,
because polymorphism.m introduces code which is not type-correct.
(It might be better in the long run to change polymorphism.m,
but there is no simple way to do that without adversely affecting
the efficiency of the generated code.)
Estimated hours taken: 0.1
Fix a bug in inlining of polymorphic predicates, which showed up
for the `pseudoknot' benchmark when excess_assign was turned on again.
compiler/inlining.m:
Make sure we substitute in the new values of any type
parameters which are bound by an inlined call.
This fixed a bug which led to the code generator
aborting because the code output from inlining.m was
not type-correct.
Also, tidy up the source code a bit and add some comments.
type_util.m:
Add predicate type_list_subsumes/3, for use by inlining.m and
modes.m.
modes.m:
Use type_list_subsumes/3.
Estimated hours taken: 1
compiler/shapes.m:
Fix a bug or two in the implementation of no_tag tags: to
create the shape num for a type with a no_tag tag, use the
type that results *after* we've substituted any type variables,
and create the shape by recursively calling shapes__request_shape
rather than shapes__create_shape (the latter doesn't handle
builtin types like `int' correctly, and anyway we don't want to
create it if it already exists).
Estimated hours taken: 20
vn_block:
Fix a typo which reflected a fundamental design error. When finding
cheaper copies of live lvals, for use in creating specialized copies
(parallels) of blocks jumped to from the current location, we used
to use the map reflecting the contents of lvals at the start of the
block, not at the point of the jump.
--pred-value-number, which uses the information computed by the
buggy predicate, actually bootstrapped some time ago despite
this fundamental bug!
value_number:
Fix a bug in the creation of parallel code sequences for computed
gotos. Add some more opportunities for printing diagnostics.
Move code concerning final verification to vn_verify.
vn_verify:
Move the remaining code concerned with final verification from
value_number to vn_verify.
peephole:
Add a new pattern, which transforms the sequence
incr_sp N; goto L2; L1; incr_sp N; L2
into just
L1; incr_sp N; L2
The pattern is of course more broadly applicable, but I have seen
it only when it involves a single incr_sp between the two labels.
(The longer pattern can be introduced by frameopt.)
opt_util:
Look inside blocks when checking whether an instruction can fall
through. This improves the performance of labelopt.
vn_table:
Make the type vn_table abstract; add, export and use access functions.
vn_util:
Remove a noop predicate, since now it won't ever be made to do
anything.
vn_cost:
Refine debugging output.
vn_debug:
Add some more debugging routines.
opt_debug:
Add some more debugging routines.
det_analysis:
Remove an unused argument.
labelopt:
Formatting change.
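The new peephole pattern in this entry (incr_sp N; goto L2; L1; incr_sp N; L2) can be sketched as follows. This Python fragment is an illustration, not the code in peephole: instructions are modelled as tuples such as ("incr_sp", n), ("goto", label) and ("label", name), which is an assumption of the example.

```python
def remove_redundant_incr_sp(instrs):
    """Rewrite  incr_sp N; goto L2; L1; incr_sp N; L2  to  L1; incr_sp N; L2."""
    result = []
    i = 0
    while i < len(instrs):
        window = instrs[i:i + 5]
        if (len(window) == 5
                and window[0][0] == "incr_sp"
                and window[1][0] == "goto"
                and window[2][0] == "label"
                and window[3] == window[0]                     # same N
                and window[4] == ("label", window[1][1])):     # goto targets L2
            # Drop the leading incr_sp and goto; execution now falls
            # through L1 into the second incr_sp and reaches L2.
            i += 2
            continue
        result.append(instrs[i])
        i += 1
    return result
```

The rewrite preserves behaviour because both paths into L2 increment the stack pointer by the same amount exactly once.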
Estimated hours taken: 0.25
compiler/make_hlds.m:
Fix a bug reported by Zoltan: it was printing out a misleading error
message, since I had `function' and `predicate' switched.
Estimated hours taken: 0.25
compiler/options.m:
Re-enable excess_assign by default. The problem with it (it
broke the C interface) was fixed a long time ago.
Estimated hours taken: 2
compiler/lambda.m:
Re-enable the optimization of not introducing separate
predicates for lambda expressions when not necessary,
after fixing it so that it doesn't attempt to curry output
arguments in cases such as
lambda([Y::out] is det, q(_, Y))
where q/2 is declared as
:- pred q(int::out, int::out) is det.
Estimated hours taken: 0.1
compiler/unify_proc.m:
In the code generated for compare/3 predicates, call
builtin_int_lt and builtin_int_gt rather than < and >.
This is because < and > are going to be moved from mercury_builtin.m
to int.m.
Estimated hours taken: 0.1
compiler/code_util.m:
Add mercury_builtin:builtin_int_{lt,gt} to the list of builtin
operators, as synonyms for "<" and ">" on ints.
These two predicates don't exist yet, but this change is
the first step towards moving the definitions of < and >
on ints from mercury_builtin.m back into int.m.
(The other parts of this change can't be committed yet
due to bootstrapping problems.)
Estimated hours taken: 0.25
compiler/prog_io.m:
Remove most of the old hack which expanded calls to is/2 in the
parser, since we now use functions instead.
Estimated hours taken: 4
compiler/notes/COMPILER_DESIGN:
Document the changes in the design of type-checking that were
needed to implement overload resolution for predicates with the
same name and arity that occur in different modules.
Estimated hours taken: 4
compiler/{typecheck.m, modes.m}:
Implement overload resolution for predicates with the same name
and arity that occur in different modules.
Among other things, this change makes it practical to define
pred '<'(int, int) in int.m and pred '<'(float, float) in float.m,
without having to module-qualify uses of `<'.
Estimated hours taken: 0.25
My last change to llds.m broke my second-last change to it.
compiler/llds.m:
As of my previous change, float_consts are now emitted as
static ground terms, so we need to include float_consts in the
set of things which we test for when deciding whether to emit
`const Word mercury_const_n[] = { ... }' or
`const Word * mercury_const_n[] = { ... }', so that gcc gets
position-independent code right.
Estimated hours taken: 2
Do some more work on improving floating-point performance:
emit boxed floating point constants as static ground terms.
options.m:
Add new option --unboxed-float.
exprn_aux.m
Add --unboxed-float to the `exprn_opts' that affect whether
or not things can be static constants. If --unboxed-float
is not set, and --static-ground-terms is, then consider
float_consts to be constant.
code_exprn.m, lookup_switch.m:
Trivial changes to handle new arity of exprn_opts type.
llds.m:
If --unboxed-float is not set, and --static-ground-terms is, then
output `static const Float mercury_float_const_...' declarations
for float_consts.
Estimated hours taken: 1
(plus lots of time debugging the problem via email)
Fix a second occurrence of the problem with casts in the initializers
of static constants causing gcc to generate non-position-independent code.
This prevented the use of gcc on AIX RS/6000 systems, and was also
preventing us from creating genuinely shared shared libraries on Solaris.
compiler/llds.m:
Emit static constants as `const Word * mercury_const_n [] = { ... }'
rather than `const Word mercury_const_n [] = { ... }' not just
in the case when the constant contains code addresses, but also
when it contains data addresses, including addresses of other
static constants and/or string literals.
exprn_aux.m:
Add nondet (in, out) modes to the ..._contain_rval predicates,
which nondeterministically generate all the contained rvals,
and export args_contain_rval. For use by llds.m as required
by the above change.
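The rule for choosing the declaration of a static constant, as described in this entry, can be sketched as follows. This Python fragment is an illustration, not the code in llds.m: the kind names ("code_addr", "data_addr", "string_const", "int_const") are hypothetical labels for the contained rvals, and the initializer is left elided as in the text above.

```python
# Hypothetical tags for rvals whose value is an address of some sort.
ADDRESS_KINDS = {"code_addr", "data_addr", "string_const"}


def static_const_decl(n, arg_kinds):
    """Return the C declaration header chosen for static constant number n."""
    # Use a pointer element type when any contained rval is an address,
    # so the initializer needs no cast.
    needs_pointer = any(kind in ADDRESS_KINDS for kind in arg_kinds)
    elem_type = "const Word *" if needs_pointer else "const Word"
    return f"{elem_type} mercury_const_{n}[] = {{ ... }}"
```

For instance, a constant containing only integers keeps the `const Word` form, while one containing a string literal gets the `const Word *` form.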
Estimated hours taken: 1
When emitting the boxing/unboxing macros for floating point
arithmetic, avoid unnecessary boxing and unboxing of constants
and intermediate results.
compiler/llds.m:
Add a new predicate output_rval_as_float that prints the C code
for an rval as an unboxed float, and change the places which
output code for floating point operations to call it.
This change means we generate better code for code such as `X
is Y + Z + 1'. Previously we would unbox Y, unbox Z, add them,
box the result, unbox it, box the constant 1, unbox it, add
them, and box the final result. Now we just unbox Y, unbox Z,
add them, add the constant 1, and then box the final result.
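The difference between the old and new output for `X is Y + Z + 1' can be sketched as follows. This Python fragment is an illustration, not the code in llds.m: rvals are modelled as tuples ("var", name), ("const", value) and ("add", left, right), and float_to_word / word_to_float are placeholder names standing in for the boxing and unboxing macros.

```python
def naive_word(rval):
    """Old scheme: every subexpression is produced as a boxed word."""
    kind = rval[0]
    if kind == "var":
        return rval[1]                                  # already boxed
    if kind == "const":
        return f"float_to_word({float(rval[1])!r})"     # box the constant
    if kind == "add":
        left, right = naive_word(rval[1]), naive_word(rval[2])
        return f"float_to_word(word_to_float({left}) + word_to_float({right}))"
    raise ValueError(f"unknown rval kind: {kind}")


def as_float(rval):
    """New scheme: produce the rval directly as an unboxed float."""
    kind = rval[0]
    if kind == "var":
        return f"word_to_float({rval[1]})"              # unbox once
    if kind == "const":
        return repr(float(rval[1]))                     # no boxing needed
    if kind == "add":
        return f"({as_float(rval[1])} + {as_float(rval[2])})"
    raise ValueError(f"unknown rval kind: {kind}")


def improved_word(rval):
    """Box only the final result."""
    return f"float_to_word({as_float(rval)})"
```

Applied to the tree for Y + Z + 1, the naive version emits three boxing and four unboxing operations, while the improved version emits one boxing and two unboxing operations, matching the counts given above.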
Estimated hours taken: 0.1
compiler/hlds_out.m:
Export hlds_out__write_modes for use by mode_errors.m.
(Oops, forgot to include this with my previous change.)
Estimated hours taken: 3
A bunch of cleanups: improve error messages, tidy up the code.
Also, do some work towards supporting higher-order functions.
type_util.m:
Add new predicate type_is_higher_order/3 for checking
whether a type is a higher-order type. This recognizes
both higher-order predicate types and also higher-order
function types.
code_info.m, modes.m, polymorphism.m, shapes.m:
Use type_is_higher_order/3.
make_hlds.m:
Fix another error message to do the right thing when
reporting errors for functions.
mercury_to_mercury:
List `func' in the table of operators, so that it gets
parenthesized correctly.
modes.m, mode_errors.m:
Improve the error message for attempted higher-order unifications:
spit out some context, and if verbose_errors is enabled, spit
out a long description.
Estimated hours taken: 1
compiler/{typecheck.m,clause_to_proc.m}:
If there are no declared modes for a function, give it a default
mode of `:- func foo(in, in, ..., in) = out is det.'.
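The default mode described in this entry depends only on the function's name and arity. The following Python sketch builds the declaration text in the form quoted above; it is an illustration, not the code in typecheck.m or clause_to_proc.m, and its handling of zero-arity functions (omitting the parentheses) is an assumption.

```python
def default_func_mode(name, arity):
    """Return the default mode text: all arguments `in', result `out', det."""
    if arity < 0:
        raise ValueError("arity must be non-negative")
    if arity == 0:
        head = name                      # assumed form for zero-arity functions
    else:
        head = f"{name}({', '.join(['in'] * arity)})"
    return f":- func {head} = out is det."
```

For a two-argument function foo this gives `:- func foo(in, in) = out is det.'.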