mercury

mirror of https://github.com/Mercury-Language/mercury.git synced 2026-04-22 21:03:53 +00:00

Author	SHA1	Message	Date
Zoltan Somogyi	d49f6eab84	Add missing imports of parent modules. These imports were missing from source files, but were included in imported modules' .int3 files. An upcoming change will delete these from those .int3 files.	2019-03-20 03:57:10 +11:00
Zoltan Somogyi	24b98fdafe	Pack sub-word-sized ints and dummies in terms. Previously, the only situation in which we could pack two or more arguments of a term into a single word was when all those arguments are enums. This diff changes that, so that the arguments can also be sub-word-sized integers (signed or unsigned), or values of dummy types (which occupy zero bits). This diff also records, for each argument of a function symbol, not just whether, and if yes, how it is packed into a word, but also at what offset that word is in the term's heap cell. It is more economical to compute this once, when the representation of the type is being decided, than to compute it over and over again when terms with that function symbol are being constructed or deconstructed. However, for a transition period, we compute these offsets at both times, to check the consistency of the new algorithm for computing offsets that is run at "decide representation time" with the old algorithms run at "generate code for a unification time". compiler/du_type_layout.m: Make the changes described above: pack sub-word-sized integers and dummy values into argument words, if possible, and if the relevant new option allows it. These options are temporary. If we find no problems with the new packing algorithm in a few weeks, we should be able to delete them. Allow 64 bit ints and uints to be stored unboxed in two words on 32 bit platforms, if the relevant new option allows it. Support for this is not yet complete, but it makes sense to implement the RTTI changes for both this change and the one described in the above paragraph together. For each packed argument, record not just its width, its shift and the mask, but also the number of bits the argument takes. Previously, we computed this on demand from the mask, but there is no real need for that when simply storing this info is so cheap. 
For each argument, packed or not, record its offset, relative to both the start of the arguments, and the start of the memory cell. (The two are different if the arguments are preceded by either a remote secondary tag, the typeinfos and/or typeclass_infos describing some existentially typed arguments, or both.) The reason for this is given at the top. Centralize the decision of the parameters of packing in one predicate. If the option --inform-suboptimal-packing is given, print an informational message whenever the code deciding type representations finds that reordering the arguments of a function symbol would allow it to pack the arguments of that function symbol into less space. compiler/options.m: Add the option --allow-packing-ints which controls whether du_type_layout.m will attempt to pack {int,uint}{8,16,32} arguments alongside enum arguments. Add the option --allow-packing-dummies which controls whether du_type_layout.m will optimize away (in other words, represent in 0 bits) arguments of dummy types. Add the option --allow-double-word-ints which controls whether du_type_layout.m will store arguments of the types int64 and uint64 unboxed in two words on 32 bit platforms, the way it currently stores double precision floats. All three of those options are off by default, which preserves binary compatibility with existing code. However, the first two are ready to be switched on (the third is not). All three options are intended to be present in the compiler only until these changes are tested. Once we deem them sufficiently tested, I will modify the compiler to always do the packing they control, at which point we can delete these options. This is why they are not documented. Add the option --inform-suboptimal-packing, whose meaning is described above. doc/user_guide.texi: Document --inform-suboptimal-packing. 
compiler/prog_data.m: For each argument of a function symbol in a type definition, use a new type called arg_pos_width to record the extra information mentioned above (offsets for all arguments, and number of bits for packed arguments). For each function symbol that has some existential type constraints, record the extra information mentioned for parse_type_defn.m below. compiler/hlds_data.m: Include the position, as well as the width, in the representation of the arguments of function symbols. Previously, we used the integer 0 as a tag for dummies. Add a tag to represent dummy values, since this gives more information to any code that sees that tag. compiler/ml_unify_gen.m: compiler/unify_gen.m: Handle the packing of dummy values, and of sub-word-sized ints and uints. Compare the cell offset of each argument computed using existing algorithms here with the cell offset recorded in the argument's representation, and abort if they are different. In some cases, restructure code a bit to make it possible. For example, for tuples and closures, this means that instead of simply recording that each tuple argument or closure element is a full word, we must record its correct offset as well. Handle the new dummy_tag. Add prelim (not yet finished) support for double-word int64s/uint64s on 32 bit platforms. When packing the values of two or more variables (or constants) into a single word in a memory cell, optimize away operations that are no-ops, such as shifting anything by zero bits, shifting the constant zero by any number of bits, and ORing anything with zero. This makes the generated code easier to read. It is probably also faster for us to do it here than to write out a bigger expression, have the C compiler read in the bigger expression, and then later make the same optimization. 
In ml_unify_gen.m, avoid the unnecessary use of a list of the argument variables' types separate from the list of the argument variables themselves; just look up the type of each argument variable when it is processed. compiler/add_special_pred.m: When creating special (unify and compare) predicates for tuples, include the offsets in the representation of their arguments. Delete an unused predicate. compiler/llds.m: Add a new way to create an rval: a cast. We use it to implement the extraction of signed sub-word-sized integers from packed argument words in terms. Masking the right N bits out of the packed word leaves the other 32-N or 64-N bits as zeroes; a cast to int8_t, int16_t or int32_t will copy the sign bit to these bits. Likewise, when we pack signed int{8,16,32} values into words, we cast them to their unsigned versions to throw away any sign-extension bits in their original word-sized representations. No similar change is needed for the MLDS, since that already had a mechanism for casts. compiler/mlds.m: Note a potential simplification in the MLDS. compiler/builtin_lib_types.m: Add functions to return the Mercury representation of the int64 and uint64 types. compiler/foreign.m: Export a specialized version of an existing predicate, to allow ml_unify_gen.m to avoid the costs of the more general version. compiler/hlds_out_module.m: Always print the representations of all arguments, since the inclusion of position information in those representation means that the representations of even all-full-word-argument terms are of potential interest when debugging term representations. compiler/lco.m: Do not try to apply LCO to arguments of dummy types. (We could optimize them differently, by filling them in before they are "computed", but that is a separate optimization, which is of very low priority.) compiler/liveness.m: Do not include variables of dummy types in resume points. 
The reason for this is that the code that establishes a resume point returns, for each such variable, a list of lvals where that variable can be found. The new code in unify_gen.m will optimize away assignments to values of dummy types, so there is no lval where they can be found. We could allocate one, but doing so would be a pessimization. Instead, we simply don't save and restore such values. When their value (which is always 0) is needed, we can create them out of thin air. compiler/ml_global_data.m: Include the target language in the ml_global_data structure, to prevent some of its users having to look it up in the module_info. Add notes about specializing the implementation of arrays of int64s/uint64s on 32 bit platforms. compiler/check_typeclass.m: compiler/ml_type_gen.m: Add sanity checks of the new precomputed fields of exist_constraints. Conform to the changes above. compiler/mlds_to_c.m: Add prelim (not yet finished) support for double-word int64s/uint64s on 32 bit platforms. Add notes about possible optimizations. compiler/parse_type_defn.m: When a function symbol in a type definition contains existential arguments, precompute and store the set of constrained and unconstrained type variables. The code in du_type_layout.m needs this information to compute the number of slots occupied by typeinfos and typeclass_infos in memory cells for this function symbol, and several other places in the compiler do too. It is easier and faster to compute this information just once, and this is the earliest time at which that can be done. compiler/type_ctor_info.m: Use the prerecorded information about existential types to simplify the code here. compiler/polymorphism.m: Add an XXX about possibly using the extra info we now record in exist_constraints to simplify the job of polymorphism.m. compiler/pragma_c_gen.m: compiler/var_locn.m: Create the values of dummy variables from scratch, if needed. compiler/rtti.m: Replace a bool with a bespoke type. 
compiler/rtti_out.m: compiler/rtti_to_mlds.m: When generating RTTI information for the LLDS and MLDS backends respectively, record new kinds of arguments as needing special treatment. These are int64s and uint64s stored unboxed in two words on 32 bit platforms, {int,uint}{8,16,32} values packed into words, and dummy arguments. Each of these has a special code: its own negative value in the num_bits field of the argument. Generate slightly better formatted output. compiler/type_util.m: Delete a predicate that isn't needed anymore. compiler/opt_util.m: Delete a function that hasn't been needed for a while. Conform to the changes above. compiler/arg_pack.m: compiler/bytecode_gen.m: compiler/call_gen.m: compiler/code_util.m: compiler/ctgc.selector.m: compiler/dupelim.m: compiler/dupproc.m: compiler/equiv_type.m: compiler/equiv_type_hlds.m: compiler/erl_code_gen.m: compiler/erl_rtti.m: compiler/export.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_out_data.m: compiler/middle_rec.m: compiler/ml_closure_gen.m: compiler/ml_switch_gen.m: compiler/ml_top_gen.m: compiler/module_qual.qualify_items.m: compiler/opt_debug.m: compiler/parse_tree_out.m: compiler/peephole.m: compiler/recompilation.usage.m: compiler/resolve_unify_functor.m: compiler/stack_layout.m: compiler/structure_reuse.direct.choose_reuse.m: compiler/switch_util.m: compiler/typecheck.m: compiler/unify_proc.m: compiler/unused_imports.m: compiler/xml_documentation.m: Conform to the changes above. compiler/llds_out_util.m: Add a comment. compiler/ml_code_util.m: Factor out some common code. runtime/mercury_type_info.h: Allocate special values of the MR_arg_bits field of the MR_DuArgLocn type to designate arguments as two word int64/uint64s, as sub-word-sized arguments of types {int,uint}{8,16,32}, or as arguments of dummy types. (We already had a special value for two word float arguments.) 
Document the list of places that know about this code, so that they can be updated if and when it changes. library/construct.m: Handle the construction of terms with two-word int64/uint64 arguments, with packed {int,uint}{8,16,32} arguments, and with dummy arguments. Factor out the code common to the sectag-present and sectag-absent cases, to make it possible to do the above in just one place. library/store.m: Add an XXX to a place that I don't think handles two word arguments correctly. (I think this is an old bug.) runtime/mercury_deconstruct.c: Handle the deconstruction of terms with two-word int64/uint64 arguments, with packed {int,uint}{8,16,32} arguments, and with dummy arguments. runtime/mercury_deep_copy_body.h: Handle the copying of terms with two-word int64/uint64 arguments, with packed {int,uint}{8,16,32} arguments, and with dummy arguments. Give a macro a more descriptive name. runtime/mercury_type_info.c: Handle taking the size of terms with two-word int64/uint64 arguments, with packed {int,uint}{8,16,32} arguments, and with dummy arguments. runtime/mercury.h: Put related definitions next to each other. runtime/mercury_deconstruct.h: runtime/mercury_ml_expand_body.h: Fix indentation. tests/hard_coded/construct_test.{m,exp}: Add to this test case a test of the construction, via the library's construct.m module, of terms containing packed sub-word-sized integers, and packed dummies. tests/hard_coded/deconstruct_arg.{m,exp}: Convert the source code of this test case to state variable notation, and update the line number references (in the names of predicates created from lambda expressions) accordingly. tests/hard_coded/uint64_ground_term.{m,exp}: A new test case to check that uint64 values too large to be int64 values can be stored in static structures. tests/hard_coded/Mmakefile: Enable the new test case.	2018-05-05 13:22:19 +02:00
Zoltan Somogyi	15aa457e12	Delete $module arg from calls to unexpected.	2018-04-07 18:25:43 +10:00
Zoltan Somogyi	0d31eaf4c3	Convert (C->T;E) to (if C then T else E).	2015-09-21 05:47:55 +10:00
Zoltan Somogyi	13b6f03f46	Module qualify end_module declarations. compiler/*.m: Module qualify the end_module declarations. In some cases, add them. compiler/table_gen.m: Remove an unused predicate, and inline another in the only place where it is used. compiler/add_pragma.m: Give some predicates more meaningful names.	2014-09-04 00:24:52 +02:00
Peter Wang	b28e82a1d5	Make most_specific_rval handle mkword_hole. dupelim.most_specific_rval did not handle the `mkword_hole' option that was added to the `rval' type. compiler/dupelim.m: Fix the bug. tests/valid/Mmakefile: tests/valid/dupelim_mkword_hole.m: Add test case.	2013-05-22 14:31:01 +10:00
Peter Wang	4d38590690	Construct partially instantiated direct arg functor values. Construction unifications of partially instantiated values involving direct argument functors (where the single argument is free) did not generate any code in both low-level and high-level backends. Incorrect behaviour could result if the program tried to deconstruct the value at run-time. Also, in the LLDS backend, such a construction unification did not enter the variable into the var_state_map, leading to a compiler abort when the variable is looked up. compiler/ml_unify_gen.m: Generate code for constructions of a direct arg functor with free argument. This amounts to assigning a variable to a tagged null pointer. compiler/llds.m: Add an rval option `mkword_hole', which is like `mkword' but the pointer to be tagged is unspecified. compiler/unify_gen.m: Assign a variable to an `mkword_hole' rval, for a construction unification of a direct arg functor with a free argument. Reassign the variable to an `mkword' rval when the argument becomes bound in a later unification. compiler/code_info.m: compiler/var_locn.m: Add a predicate to reassign a variable from a `mkword_hole' expression to a `mkword' expression. compiler/llds_out_data.m: Write out `mkword_hole' values as a tagged null pointer in C code. compiler/call_gen.m: compiler/code_util.m: compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_to_x86_64.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/peephole.m: compiler/stack_layout.m: Conform to addition of `mkword_hole'. tests/hard_coded/Mmakefile: tests/hard_coded/direct_arg_partial_inst.exp: tests/hard_coded/direct_arg_partial_inst.m: tests/hard_coded/direct_arg_partial_inst2.exp: tests/hard_coded/direct_arg_partial_inst2.m: Add test cases.	2013-02-14 16:37:04 +11:00
Zoltan Somogyi	c650eaddd2	A bunch of individually small changes to speed up the compiler when compiling Estimated hours taken: 8 Branches: main A bunch of individually small changes to speed up the compiler when compiling training_cars_full.m. Altogether, the changes speed up the compiler on that task by a bit more than 11% when the target grade is asm_fast.gc, and by a bit more than 7% when the target grade is hlc.gc. (Several of the changes affect the code that optimizes the LLDS; we don't have corresponding optimizers for the MLDS.) compiler/c_util.m: Specialize the code that prints out quoted strings for the target language. We don't want to check the target language during the conversion of EVERY SINGLE CHARACTER. compiler/dead_proc_elim.m: When we analyze the module for inlining, we are only after the use counts of procedures. We do not need to traverse ground structures to get those counts. compiler/dupelim.m: Do the search and insertion in the standardized code sequence map in one pass. compiler/global_data.m: compiler/ml_global_data.m: Do the search and insertion in the scalar data map in one pass. library/bimap.m: Add a search_insert predicate to make possible the changes in {ml_,}global_data.m. NEWS: Mention the new predicate in bimap.m. compiler/inst_match.m: Do searches and insertions in sets of expansions in one pass. Highlight discrepancies between comments on the declarations of two predicates and comments on their code. compiler/llds_out_global.m: compiler/post_typecheck.m: Reorder the bodies of some test conditions to put the cheaper and more-frequently-failing tests first. compiler/labelopt.m: compiler/opt_util.m: Do not require opt_util to return a list of code addresses that labelopt then throws away; allow opt_util.m not to gather those addresses in the first place (if the unused_args optimization is applied to it, which it is by default.) In opt_util.m, make an unnecessarily-exported predicate private. 
compiler/prog_data.m: Use predicates in varset.m that do directly what we want, instead of using a different predicate and then post-processing its output. (The code was originally written before the directly useful predicate in varset.m was available.) compiler/type_util.m: Specialize the frequently occurring case of no typeclass constraints at all. compiler/typecheck_info.m: Give the field names of some types identifying prefixes. Make a function symbol's name more meaningful. compiler/typecheck.m: compiler/typecheck_errors.m: Conform to the changes in typecheck_info.m.	2012-06-19 07:21:24 +00:00
Peter Wang	2ccac171dd	Add float registers to the Mercury abstract machine, implemented as an Branches: main Add float registers to the Mercury abstract machine, implemented as an array of MR_Float in the Mercury engine structure. Float registers are only useful if a Mercury `float' is wider than a word (i.e. when using double precision floats on 32-bit platforms) so we let them exist only then. In other cases floats may simply be passed via the regular registers, as before. Currently, higher order calls still require the use of the regular registers for all arguments. As all exported procedures are potentially the target of higher order calls, exported procedures must use only the regular registers for argument passing. This can lead to more (un)boxing than if floats were simply always boxed. Until this is solved, float registers must be enabled explicitly with the developer only option `--use-float-registers'. The other aspect of this change is using two consecutive stack slots to hold a single double variable. Without that, the benefit of passing unboxed floats via dedicated float registers would be largely eroded. compiler/options.m: Add developer option `--use-float-registers'. compiler/handle_options.m: Disable `--use-float-registers' if floats are not wider than words. compiler/make_hlds_passes.m: If `--use-float-registers' is in effect, enable a previous change that allows float constructor arguments to be stored unboxed in structures. compiler/hlds_llds.m: Move `reg_type' here from llds.m and `reg_f' option. Add stack slot width to `stack_slot' type. Add register type and stack slot width to `abs_locn' type. Remember next available float register in `abs_follow_vars'. compiler/hlds_pred.m: Add register type to `arg_loc' type. compiler/llds.m: Add a new kind of lval: double-width stack slots. These are used to hold double-precision floating point values only. Record setting of `--use-float-registers' in exprn_opts. 
Conform to addition of float registers and double stack slots. compiler/code_info.m: Make predicates take the register type as an argument, where it can no longer be assumed. Remember whether float registers are being used. Remember max float register for calls to MR_trace. Count double width stack slots as two slots. compiler/arg_info.m: Allocate float registers for procedure arguments when appropriate. Delete unused predicates. compiler/var_locn.m: Make predicates working with registers either take the register type as an argument, or handle both register types at once. Select float registers for variables when appropriate. compiler/call_gen.m: Explicitly use regular registers for all higher-order calls, which was implicit before. compiler/pragma_c_gen.m: Use float registers, when available, at the interface between Mercury code and C foreign_procs. compiler/export.m: Whether a float rval needs to be boxed/unboxed when assigned to/from a register depends on the register type. compiler/fact_table.m: Use float registers for arguments to predicates defined by fact tables. compiler/stack_alloc.m: Allocate two consecutive stack slots for float variables when appropriate. compiler/stack_layout.m: Represent double-width stack slots in procedure layout structures. Conform to changes. compiler/store_alloc.m: Allocate float registers (if they exist) for float variables. compiler/use_local_vars.m: Substitute float abstract machine registers with MR_Float local variables. compiler/llds_out_data.m: compiler/llds_out_instr.m: Output float registers and double stack slots. compiler/code_util.m: compiler/follow_vars.m: Count float registers separately from regular registers. compiler/layout.m: compiler/layout_out.m: compiler/trace_gen.m: Remember the max used float register for calls to MR_trace(). compiler/builtin_lib_types.m: Fix incorrect definition of float_type_ctor. 
compiler/bytecode_gen.m: compiler/continuation_info.m: compiler/disj_gen.m: compiler/dupelim.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/hlds_out_goal.m: compiler/jumpopt.m: compiler/llds_to_x86_64.m: compiler/lookup_switch.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/par_conj_gen.m: compiler/proc_gen.m: compiler/string_switch.m: compiler/tag_switch.m: compiler/tupling.m: compiler/x86_64_regs.m: Conform to changes. runtime/mercury_engine.h: Add an array of fake float "registers" to the Mercury engine structure, when MR_Float is wider than MR_Word. runtime/mercury_regs.h: Document float registers in the Mercury abstract machine. Add macros to access float registers in the Mercury engine. runtime/mercury_stack_layout.h: Add new MR_LongLval cases to represent double-width stack slots. MR_LONG_LVAL_TAGBITS had to be increased to accommodate the new cases, which increases the number of integers in [0, 2^MR_LONG_LVAL_TAGBITS) equal to 0 modulo 4. These are the new MR_LONG_LVAL_TYPE_CONS_n cases. Add max float register field to MR_ExecTrace. runtime/mercury_layout_util.c: runtime/mercury_layout_util.h: Extend MR_copy_regs_to_saved_regs and MR_copy_saved_regs_to_regs for float registers. Understand how to look up new kinds of MR_LongLval: MR_LONG_LVAL_TYPE_F (previously unused), MR_LONG_LVAL_TYPE_DOUBLE_STACKVAR, MR_LONG_LVAL_TYPE_DOUBLE_FRAMEVAR. Conform to the new MR_LONG_LVAL_TYPE_CONS_n cases. runtime/mercury_float.h: Delete redundant #ifdef. runtime/mercury_accurate_gc.c: runtime/mercury_agc_debug.c: Conform to changes (untested). trace/mercury_trace.c: trace/mercury_trace.h: trace/mercury_trace_declarative.c: trace/mercury_trace_external.c: trace/mercury_trace_internal.c: trace/mercury_trace_spy.c: trace/mercury_trace_vars.c: trace/mercury_trace_vars.h: Handle float registers in the trace subsystem. This is mostly a matter of saving/restoring them as with regular registers.	2011-10-17 04:31:33 +00:00
Zoltan Somogyi	517fbac88e	Add four LLDS instructions Paul will soon need to implement the loop control Estimated hours taken: 8 Branches: main Add four LLDS instructions Paul will soon need to implement the loop control transformation. compiler/llds.m: Add the new instructions. compiler/llds_out_instr.m: Output the new instructions. Paul may want to change the code we generate. compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_to_x86_64.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/peephole.m: compiler/reassign.m: compiler/use_local_vars.m: Handle the new instructions. In opt_util.m, fix two old bugs. First, the restore_maxfr instruction behaved as if it updated hp, not maxfr. Second, the keep_assign instruction wasn't being handled as an assignment operation. In peephole.m, fix an old bug, in which assignments through mem_refs were not considered to invalidate the cached value of an lval. In use_local_vars, fix an old bug: the keep_assign instruction wasn't being handled as an assignment operation. Assignments themselves weren't being as optimized as they could be.	2011-09-30 05:53:51 +00:00
Zoltan Somogyi	7c5fe1e988	Record the number of instructions in each basic block. Estimated hours taken: 1 Branches: main compiler/basic_block.m: Record the number of instructions in each basic block. compiler/use_local_vars.m: Do not perform this quadratic optimization on basic blocks on which it would take too long. compiler/dupelim.m: Don't try to detect duplicates in big blocks; the attempt is expensive, and also very likely to fail. (Big blocks are unlikely to be duplicated; the optimization was meant for redundant copies of the procedure epilogue.) compiler/livemap.m: Put a limit on the number of iterations done by the fixpoint algorithm.	2011-05-25 08:04:27 +00:00
Zoltan Somogyi	295415090e	Convert almost all remaining modules in the compiler to use Estimated hours taken: 6 Branches: main compiler/*.m: Convert almost all remaining modules in the compiler to use "$module, $pred" instead of "this_file" in error messages. In a few cases, the old error message was misleading, since it contained an incorrect, out-of-date or cut-and-pasted predicate name. tests/invalid/unresolved_overloading.err_exp: Update an expected output containing an updated error message.	2011-05-23 05:08:24 +00:00
Julien Fischer	0e48dfc031	Mark procedures whose names use the suffix "_det" to indicate that the procedure Branches: main Mark procedures whose names use the suffix "_det" to indicate that the procedure is a det version of a semidet procedure of the same name (modulo the suffix) as obsolete. The versions that use "det_" as a prefix should be used instead. (The latter naming scheme is the one in general use throughout the standard library.) library/dir.m: library/list.m: library/stack.m: As above. Add versions with the "det_" prefix where they were not already present. Group function definitions together with the corresponding predicate definition. library/cord.m: library/erlang_rtti_implementation.m: library/io.m: library/string.m: compiler/*.m: browser/declarative_execution.m: browser/declarative_tree.m: ssdb/ssdb.m: Conform to the above changes. library/Mercury.options: Delete a setting for a deleted module. NEWS: Announce this change.	2011-05-10 04:12:28 +00:00
Julien Fischer	9ae7fe6b70	Change the argument ordering of predicates in the set module. Branches: main Change the argument ordering of predicates in the set module. library/set.m: Change predicate argument orders to match the versions in the svset module. Group function definitions with the corresponding predicates rather than at the end of the file. Delete Ralph's comments regarding the argument order in the module interface: readers of the library reference guide are unlikely to be interested in his opinion of the argument ordering ten or so years ago. Add extra modes for set.map/3 and set.map_fold/5. library/svset.m: library/eqvclass.m: library/tree234.m: library/varset.m: browser/*.m: compiler/*.m: deep_profiler/*.m: mdbcomp/trace_counts.m: extras/moose/grammar.m: extras/moose/lalr.m: extras/moose/moose.m: tests/hard_coded/bitset_tester.m: Conform to the above change. NEWS: Announce the above changes.	2011-05-06 05:03:29 +00:00
Julien Fischer	9f68c330f0	Change the argument order of many of the predicates in the map, bimap, and Branches: main Change the argument order of many of the predicates in the map, bimap, and multi_map modules so they are more conducive to the use of state variable notation, i.e. make the order the same as in the sv* modules. Prepare for the deprecation of the sv{bimap,map,multi_map} modules by removing their use throughout the system. library/bimap.m: library/map.m: library/multi_map.m: As above. NEWS: Announce the change. Separate out the "highlights" from the "detailed listing" for the post-11.01 NEWS. Reorganise the announcement of the Unicode support. benchmarks/*/*.m: browser/*.m: compiler/*.m: deep_profiler/*.m: extras/*/*.m: mdbcomp/*.m: profiler/*.m: tests/*/*.m: ssdb/*.m: samples/*/*.m slice/*.m: Conform to the above change. Remove any dependencies on the sv{bimap,map,multi_map} modules.	2011-05-03 04:35:04 +00:00
Paul Bone	322feaf217	Add more threadscope instrumentation. This change introduces instrumentation that tracks sparks as well as parallel conjunctions and their conjuncts. This should hopefully give us more information to diagnose runtime performance issues. As of this date the ThreadScope program hasn't been updated to read or understand these new events. runtime/mercury_threadscope.[ch]: Added a function and types to register all the threadscope strings from an array. Add functions to post the new events (see below). runtime/mercury_threadscope.c: Added support for 5 new threadscope events. Registering a string so that other messages may refer to a constant string. Marking the beginning and ends of parallel conjunctions. Creating a spark for a parallel conjunct. Finishing a parallel conjunct. Re-arranged event IDs; I've started allocating IDs from 38 onwards for general purposes and 100 onwards for Mercury-specific events after talking with Duncan Coutts. Trimmed excess whitespace from the end of lines. runtime/mercury_context.h: Post a beginning parallel conjunction message when the sync term for the parallel conjunction is initialized. Post an event when creating a spark for a parallel conjunction. Add a MR_spark_id field to the MR_Spark structure; these identify sparks to threadscope. runtime/mercury_context.c: Post threadscope messages when a spark is about to be executed. Post a threadscope event when a parallel conjunct is completed. Add a missing memory barrier. runtime/mercury_wrapper.[ch]: Create a global function pointer for the code that registers strings in the threadscope string table; this is filled in by mkinit. Call this function pointer immediately after setting up threadscope. runtime/mercury_wsdeque.[ch]: Modify MR_wsdeque_pop_bottom to return the spark pointer (which points onto the queue) rather than returning a result through a pointer and a bool indicating whether the operation was successful. 
This pointer is safe to dereference until MR_wsdeque_push_bottom is used. runtime/mercury_wsdeque.c: Corrected a code comment. runtime/mercury_engine.h: Documented some of the fields of the engine structure that hadn't been documented. Add a next spark ID field to the engine structure. Change the type of the engine ID field to MR_uint_least16_t. compiler/llds.m: Add a third field to the init_sync_term instruction that stores the index into the threadscope string table of the static conjunction ID. Add a field to the c_file structure containing the threadscope string table. compiler/layout.m: Added a new layout array name for the threadscope string table. compiler/layout_out.m: Implement code to write out the threadscope string table. compiler/llds_out_file.m: Write out the threadscope string table when writing out the c_file. compiler/par_conj_gen.m: Create strings that statically identify parallel conjunctions for each init_sync_term LLDS instruction. These strings are added to a table in the !CodeInfo and the index of the string is added to the init_sync_term instruction. Add an extra instruction after a parallel conjunction to post the message that the parallel conjunction has completed. compiler/global_data.m: Add fields to the global data structure to represent the threadscope string table and its current size. Add predicates to update and retrieve the table. Handle merging of threadscope string tables in global data by allowing the references to the strings to be remapped. Refactored remapping code so that a caller such as proc_gen only needs to call one remapping predicate after merging global data. compiler/code_info.m: Add a table of strings for use with threadscope to the code_info_persistent type. Modify the code_info_init to initialise the threadscope string table fields. Add a predicate to get the string table and another to update it. 
compiler/proc_gen.m: Build the containing goal map before code generation for procedures with parallel conjunctions in a parallel grade. par_conj_gen.m depends on this. Conform to changes in code_info.m and global_data.m compiler/llds_out_instr.m: Write out the extra parameter in the init_sync_term instruction. compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_to_x86_64.m: compiler/mercury_compile_llds_back_end.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/peephole.m: compiler/reassign.m: compiler/use_local_vars.m: Conform to changes in llds.m compiler/opt_debug.m: Conform to changes in layout.m compiler/mercury_compile_llds_back_end.m: Fix some trailing whitespace. util/mkinit.c: Build an initialisation function that registers all the strings in threadscope string tables. Correct the layout of a comment.	2011-03-25 03:13:42 +00:00
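The string-table registration described in the commit above can be pictured with a small C sketch. This is only an illustration under stated assumptions: `StringEntry`, `register_strings`, `register_strings_hook` and `setup_tracing` are hypothetical names, not the identifiers used in runtime/mercury_threadscope.[ch] or generated by mkinit, and the real code posts events to a log rather than just assigning IDs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry: a constant string plus the ID it receives
 * when it is registered. */
typedef struct {
    const char *text;
    unsigned    id;
} StringEntry;

/* Next free ID; a stand-in for whatever counter the real runtime keeps. */
static unsigned next_string_id = 0;

/* Register every string in an array, giving each a consecutive ID.
 * Later events can then refer to a string by its ID alone. */
static void register_strings(StringEntry *table, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        table[i].id = next_string_id++;
    }
}

/* A global function pointer that stays NULL until initialisation code
 * (generated by a tool such as mkinit in the real system) fills it in. */
static void (*register_strings_hook)(void) = NULL;

/* Hypothetical per-module table and the function that registers it. */
static StringEntry module_table[] = {
    { "example.m:12", 0 },
    { "example.m:40", 0 },
};

static void register_module_strings(void)
{
    register_strings(module_table,
        sizeof module_table / sizeof module_table[0]);
}

/* Called once tracing is set up: invoke the hook if one was installed. */
static void setup_tracing(void)
{
    if (register_strings_hook != NULL) {
        register_strings_hook();
    }
}
```

Routing the call through a function pointer lets the runtime stay ignorant of which modules exist; only the generated initialisation code needs to know.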
Zoltan Somogyi	1c3bc03415	Make the system compiler with --warn-unused-imports. Estimated hours taken: 2 Branches: main, release Make the system compiler with --warn-unused-imports. browser/*.m: library/*.m: compiler/*.m: Remove unnecessary imports as flagged by --warn-unused-imports. In some files, do some minor cleanup along the way.	2010-12-30 11:18:04 +00:00
Zoltan Somogyi	8a28e40c9b	Add the predicates sorry, unexpected and expect to library/error.m. Estimated hours taken: 2 Branches: main Add the predicates sorry, unexpected and expect to library/error.m. compiler/compiler_util.m: library/error.m: Move the predicates sorry, unexpected and expect from compiler_util to error. Put the predicates in error.m into the same order as their declarations. compiler/*.m: Change imports as needed. compiler/lp.m: compiler/lp_rational.m: Change imports as needed, and some minor cleanups. deep_profiler/*.m: Switch to using the new library predicates, instead of calling error directly. Some other minor cleanups. NEWS: Mention the new predicates in the standard library.	2010-12-15 06:30:36 +00:00
Zoltan Somogyi	9bdc5db590	Try to work around the Snow Leopard linker's performance problem with Estimated hours taken: 20 Branches: main Try to work around the Snow Leopard linker's performance problem with debug grade object files by greatly reducing the number of symbols needed to represent the debugger's data structures. Specifically, this diff groups all label layouts in a module, each of which previously had its own named global variable, into only a few (one to four) global variables, each of which is an array. References to the old global variables are replaced by references to slots in these arrays. This same treatment could also be applied to other layout structures. However, most layouts are label layouts, so doing just label layouts gets most of the available benefit. When the library and compiler are compiled in grade asm_fast.gc.debug, this diff leads to about a 1.5% increase in the size of their generated C source files (from 338 to 343 Mb), but a more significant reduction (about 17%) in the size of the corresponding object files (from 155 to 128 Mb). This leads to an overall reduction in disk requirements from 493 to 471 Mb (about 4.5%). Since we generate the same code and data as before, with the data just being arranged differently, the decrease in object file sizes is coming from the reduction in relocation information, the information processed by the linker. This should speed up the linker. compiler/layout.m: Make the change described above. We now define up to four arrays: one each for label layouts with and without information about variables, one for the layout structures of user events, and one for the variable number lists of user events. compiler/layout_out.m: Generate the new arrays that the module being compiled needs. Use purpose-specific types instead of booleans. 
compiler/trace_gen.m: Use a new field in foreign_proc_code instructions to record the identity of any labels whose layout structures we want to refer to, even though layout structures have not been generated yet. The labels will be looked up in a map (generated together with the layout structures) by llds_out.m. compiler/llds.m: Add this extra field to foreign_proc_code instructions. Add the map (which is actually in two parts) to the c_file type, which is the data structure representing the entire LLDS. Also add to the c_file type some other data structures that previously we used to hand around alongside it. Some of these data structures used to commingle layout structures that we now separate. compiler/stack_layout.m: Generate array slots instead of separate structures for label layouts. Return the different arrays separately. compiler/llds_out.m: Order the output of layout structures to require fewer forward declarations. The forward declarations of the few arrays holding the label layout structures replace a lot of the declarations previously needed. Include the information needed by layout_out.m in the llds_out_info, and conform to the changes above. As a side-effect of all these changes, we now generate proc layout structures in the same order as the procedures' appearance in the HLDS, which is the same as their order in the source code, modulo any procedures added by the compiler itself (for lambdas, unification predicates, etc). compiler/code_info.m: compiler/dupelim.m: compiler/dup_proc.m: compiler/exprn_aux.m: compiler/frameopt.m: compiler/global_data.m: compiler/ite_gen.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_to_x86_64.m: compiler/mercury_compile_llds_back_end.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/pragma_c_gen.m: compiler/proc_gen.m: compiler/reassign.m: compiler/use_local_vars.m: Conform to the changes above. 
runtime/mercury_goto.h: Add the macros used by the new code in layout_out.m and llds_out.m. We need new macros because the old ones assumed that the C preprocessor can construct the address of a label's layout structure from the name of the label, which is obviously no longer possible. Make even existing families of macros handle in bulk up to 10 labels, up from the previous 8. runtime/mercury_stack_layout.h: Add macros for use by the new code in layout.m. tests/debugger/*.{inp,exp}: tests/debugger/declarative/*.{inp,exp}: Update these test cases to account for the new (and better) order of proc layout structures. Where inputs changed, this was to ensure that we still select the same procedures from lists of procedures, e.g. to put a breakpoint on.	2009-10-21 06:36:37 +00:00
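The grouping of label layouts into arrays can be illustrated in C roughly as follows. The `LabelLayout` type, its fields, the array name and the `LABEL_LAYOUT_REF` macro are all hypothetical; the actual structures and macros live in runtime/mercury_stack_layout.h and runtime/mercury_goto.h and are considerably richer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified label layout. */
typedef struct {
    int label_num;   /* which label this layout describes */
    int var_count;   /* number of live variables recorded at the label */
} LabelLayout;

/* Instead of one named global per label (one linker symbol each),
 * a module defines a single array, so only one symbol is needed. */
static const LabelLayout module_label_layouts[] = {
    { 1, 0 },
    { 2, 3 },
    { 3, 1 },
};

/* A reference to a layout is now the address of an array slot.
 * The slot number cannot be derived from the label's name by the
 * preprocessor, so it has to be supplied explicitly. */
#define LABEL_LAYOUT_REF(slot) (&module_label_layouts[(slot)])

/* Bounds-checked lookup, for illustration only. */
static const LabelLayout *layout_for_slot(size_t slot)
{
    size_t n = sizeof module_label_layouts / sizeof module_label_layouts[0];
    return slot < n ? &module_label_layouts[slot] : NULL;
}
```

Because every reference is an offset into one array, the object file needs relocation information for a single symbol rather than for each label, which is where the commit message locates the size reduction.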
Peter Wang	e0ff2b1903	Implement conditional structure reuse for LLDS backends using Boehm GC. Estimated hours taken: 15 Branches: main Implement conditional structure reuse for LLDS backends using Boehm GC. Verify at run time, just before reusing a dead cell, that the base address of the cell was dynamically allocated. If not, fall back to allocating a new object on the heap. This makes structure reuse safe without having to disable static data. In the simple case, the generated C code looks like this: MR_tag_reuse_or_alloc_heap(dest, tag, addr_of_reuse_cell, MR_tag_alloc_heap(dest, tag, count)); ...assign fields... If some of the fields are known to already have the correct values then we can avoid assigning them. We need to handle both reuse and non-reuse cases: MR_tag_reuse_or_alloc_heap_flag(dest, flag_reg, tag, addr_of_reuse_cell, MR_tag_alloc_heap(dest, tag, count)); /* flag_reg is non-zero iff reuse is possible */ if (flag_reg) { goto skip; } ...assign fields which don't need to be assigned in reuse case... skip: ...assign fields which must be assigned in both cases... It may be that it is not worth the branch to avoid assigning known fields. I haven't yet checked. compiler/llds.m: Extend the `incr_hp' instruction to hold information for structure reuse. compiler/code_info.m: Generate a label and pass it to `var_locn_assign_cell_to_var'. The label is only needed for the type of code shown above. compiler/var_locn.m: Change the code generated for cell reuse. Rather than assigning the dead cell's address to the target lval unconditionally, generate an `incr_hp' instruction with the reuse field filled in. Generate code that avoids filling in known fields if possible. Abort if we see `construct_statically(_)' in `var_locn_assign_dynamic_cell_to_var'. runtime/mercury_heap.h: runtime/mercury_conf_param.h: Add a macro to check if an address is between `GC_least_plausible_heap_addr' and `GC_greatest_plausible_heap_addr', which are therefore in the heap. 
Add macros to conditionally reuse a cell or otherwise fall back to allocating a new object. Make it possible to revert to unconditional structure reuse by defining the C macro `MR_UNCONDITIONAL_STRUCTURE_REUSE'. compiler/llds_out.m: Call the new macros in `mercury_heap.h' for `incr_hp' instructions with reuse information filled in. compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_to_x86_64.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/reassign.m: compiler/unify_gen.m: compiler/use_local_vars.m: Conform to the changed `incr_hp' instruction.	2008-02-11 03:56:13 +00:00
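The run-time check behind conditional reuse can be sketched in plain C. This is not the code of `MR_tag_reuse_or_alloc_heap`: a small static arena stands in for the Boehm GC heap, `heap_lo` and `heap_hi` stand in for `GC_least_plausible_heap_addr` and `GC_greatest_plausible_heap_addr`, and every function name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A toy heap: a fixed arena with a bump allocator. */
enum { ARENA_WORDS = 64 };
static long   arena[ARENA_WORDS];
static size_t arena_used = 0;

/* Stand-ins for the collector's plausible heap bounds. */
static uintptr_t heap_lo(void) { return (uintptr_t) &arena[0]; }
static uintptr_t heap_hi(void) { return (uintptr_t) &arena[ARENA_WORDS]; }

/* Is this address inside the dynamically allocated heap? */
static int in_heap_range(const void *p)
{
    uintptr_t a = (uintptr_t) p;
    return a >= heap_lo() && a < heap_hi();
}

/* Allocate a fresh cell; returns NULL if the toy arena is exhausted. */
static long *arena_alloc(size_t words)
{
    if (arena_used + words > ARENA_WORDS) {
        return NULL;
    }
    long *cell = &arena[arena_used];
    arena_used += words;
    return cell;
}

/* Reuse the dead cell only if it was dynamically allocated; otherwise
 * (for example, if it is static data) fall back to a new allocation.
 * *reused is set to non-zero iff the dead cell was reused, mirroring
 * the flag register in the commit message. */
static long *reuse_or_alloc(long *dead_cell, size_t words, int *reused)
{
    if (in_heap_range(dead_cell)) {
        *reused = 1;
        return dead_cell;
    }
    *reused = 0;
    return arena_alloc(words);
}

/* A cell in static data, which must never be overwritten. */
static long static_cell[2] = { 10, 20 };
```

The flag is what lets the caller skip assignments to fields that already hold the right values when reuse succeeded, while still filling every field after a fresh allocation.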
Peter Wang	fa80b9a01a	Make the parallel conjunction execution mechanism more efficient. Branches: main Make the parallel conjunction execution mechanism more efficient. 1. Don't allocate sync terms on the heap. Sync terms are now allocated in the stack frame of the procedure call which originates a parallel conjunction. 2. Don't allocate individual sparks on the heap. Sparks are now stored in preallocated, growing arrays using an algorithm that doesn't use locks. 3. Don't have one mutex per sync term. Just use one mutex to protect concurrent accesses to all sync terms (it is rarely needed anyway). This makes sync terms smaller and saves initialising a mutex for each parallel conjunction encountered. 4. We don't bother to acquire the global sync term lock if we know a parallel conjunction couldn't be executing in parallel. In a highly parallel program, the majority of parallel conjunctions will be executed sequentially so protecting the sync terms from concurrent accesses is unnecessary. par_fib(39) is ~8.4 times faster (user time) on my laptop (Linux 2.6, x86_64), which is ~3.5 times as slow as sequential execution. configure.in: Update the configuration for a changed MR_SyncTerm structure. compiler/llds.m: Make the fork instruction take a second argument, which is the base stack slot of the sync term. Rename it to fork_new_child to match the macro name in the runtime. compiler/par_conj_gen.m: Change the generated code for parallel conjunctions to allocate sync terms on the stack and to pass the sync term to fork_new_child. compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_out.m: compiler/llds_to_x86_64.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/reassign.m: compiler/use_local_vars.m: Conform to the change in the fork instruction. 
compiler/liveness.m: compiler/proc_gen.m: Disable use of the parallel conjunction operator in the compiler as older versions of the compiler will generate code incompatible with the new runtime. runtime/mercury_context.c: runtime/mercury_context.h: Remove the next pointer field from MR_Spark as it's no longer needed. Remove the mutex from MR_SyncTerm. Add a field to record if a spark belonging to the sync term was scheduled globally, i.e. if the parallel conjunction might be executed in parallel. Define MR_SparkDeque and MR_SparkArray. Use MR_SparkDeques to hold per-context sparks and global sparks. Change the abstract machine instructions MR_init_sync_term, MR_fork_new_child, MR_join_and_continue as per the main change log. Use a preprocessor macro MR_LL_PARALLEL_CONJ as a shorthand for !MR_HIGHLEVEL_CODE && MR_THREAD_SAFE. Take the opportunity to clean things up a bit. runtime/mercury_wsdeque.c: runtime/mercury_wsdeque.h: New files containing an implementation of work-stealing deques. We don't do work stealing yet but we use the underlying data structure. runtime/mercury_atomic.c: runtime/mercury_atomic.h: New files to contain atomic operations. Currently it just contains compare-and-swap for gcc/x86_64, gcc/x86 and gcc-4.1. runtime/Mmakefile: Add the new files. runtime/mercury_engine.h: runtime/mercury_mm_own_stacks.c: runtime/mercury_wrapper.c: Conform to runtime changes. runtime/mercury_conf_param.h: Update an outdated comment.	2007-10-11 11:45:22 +00:00
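A compare-and-swap primitive of the kind mentioned above, and the way a lock-free structure might use it, can be sketched as follows. The sketch assumes a compiler that provides the `__sync_bool_compare_and_swap` builtin (GCC 4.1 or later, or Clang); `compare_and_swap` and `try_claim` are hypothetical names and are not taken from runtime/mercury_atomic.[ch] or runtime/mercury_wsdeque.[ch].

```c
#include <assert.h>

/* Atomically replace *addr with new_val if it still equals old_val.
 * Returns non-zero on success. Relies on a GCC/Clang builtin. */
static int compare_and_swap(volatile long *addr, long old_val, long new_val)
{
    return __sync_bool_compare_and_swap(addr, old_val, new_val);
}

/* Try to claim the item at index *top from a queue whose valid items
 * occupy indices [*top, bottom). Returns 1 and stores the claimed
 * index on success; returns 0 if the queue is empty or another thread
 * advanced *top first. No lock is taken at any point. */
static int try_claim(volatile long *top, long bottom, long *claimed)
{
    long t = *top;
    if (t >= bottom) {
        return 0;               /* nothing to claim */
    }
    if (!compare_and_swap(top, t, t + 1)) {
        return 0;               /* lost the race; the caller may retry */
    }
    *claimed = t;
    return 1;
}
```

A failed claim is not an error: the caller simply retries or looks for work elsewhere, which is what makes the scheme usable without a mutex.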
Zoltan Somogyi	b48eaf8073	Add a first draft of the code generator support for region based memory Estimated hours taken: 30 Branches: main Add a first draft of the code generator support for region based memory management. It is known to be incomplete; the missing parts are marked by XXXs. It may also be buggy; it will be tested after Quan adds the runtime support, i.e. the C macros invoked by the new LLDS instructions. However, the changes in this diff shouldn't affect non-RBMM operations. compiler/llds.m: Add five new LLDS instructions. Four are specific to RBMM operations. RBMM embeds three new stacks in compiler-reserved temp slots in procedure's usual Mercury stack frames, and the new LLDS instructions respectively (i) push those stack frames onto their respective stacks, (ii) fill some variable parts of those stack frames, (iii) fill fixed slots of those stack frames, and (iv) use the contents of and/or pop those stack frames. (The pushing and popping affect only the new embedded stacks, not the usual Mercury stacks.) The last instruction is a new variant of the old assign instruction. It has identical semantics, but restricts optimization. An assign (a) can be deleted if its target lval is not used, and (b) its target lval can be changed (e.g. to a temp register) as long as all the later instructions referring to that lval are changed to use the new lval instead. Neither is permitted for the new keep_assign instruction. This is required because in an earlier draft we used it to assign to stack variables (parts of the embedded stack frames) that aren't explicitly referred to in later LLDS code, but are nevertheless implicitly referred to by some instructions (specifically iv above). 
We now use a specialized instruction (iii above) for this (since the macro it invokes can refer to C structure names, this makes it easier to keep the compiler in sync with the runtime system), but given that keep_assign is already implemented, may be useful later and shouldn't cause appreciable slowdown of the compiler, this diff keeps it. Extend the type that describe the contents of lvals to allow it to describe the new kinds of things we can now store in them. Add types to manage and describe the new embedded stack frames, and some utility functions. Change some existing utility functions to make all this more conceptually consistent. compiler/ite_gen.m: Surround the code we generate for the condition of if-then-elses with the code required to ensure that regions that are logically removed in the condition aren't physically destroyed until we know that the condition succeeds (since the region may still be needed in the else branch), and to make sure that if the condition fails, all the memory allocated since the entry into the condition is reclaimed instantly. compiler/disj_gen.m: Surround the code we generate for disjunctions with the code required to ensure that regions that are logically removed in a disjunct aren't physically destroyed if a later disjunct needs them, and to make sure that at entry into a non-first disjunct, all the memory allocated since the entry into the disjunction is reclaimed instantly. compiler/commit_gen.m: compiler/code_info.m: The protection against destruction offered by a disjunction disappears when a commit cuts away all later alternatives in that disjunct, so we must undo that protection. We therefore surround the scope of a commit goal with goal that achieves that objective. Add some new utility predicates to code_info. Remove some old utility functions that are now in llds.m. 
compiler/continuation_info.m: Extend the type that describes the contents of stack slots to allow it to describe the new kinds of things we can now store in them. Rename the function symbols of that type to eliminate some ambiguities. compiler/code_gen.m: Remember the set of variables live at the start of the goal (before the pre_goal_update updates it), since the region operations need to know this. Leave the lookup of AddTrailOps (and now AddRegionOps) to the specific kinds of goals that need it (the most frequent goals, unify and call, do not). Make both AddTrailOps and AddRegionOps use a self-explanatory type instead of a boolean. compiler/lookup_switch.m: Conform to the change to AddTrailOps. Fix some misleading variable names. compiler/options.m: Add some options to control the number of stack slots needed for various purposes. These have to correspond to the sizes of some C structures in the runtime system. Eventually these will be constants, but it is handy to keep them easily changeable while the C data structures are still being worked on. Add an option for optimizing away region ops wherever possible. The intention is that these should be on all the time, but we will want to turn them off for benchmarking. compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/frameopt.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_out.m: compiler/llds_to_x86_64.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/par_conj_gen.m: compiler/reassign.m: compiler/stack_layout.m: compiler/stdlabel.m: compiler/trace_gen.m: compiler/use_local_vars.m: Conform to the changes above, which mostly means handling the new LLDS instructions. In some cases, factor out existing common code, turn if-then-elses into switches, group common cases in switches, rationalize argument orders or variable names, and/or put code in execution order. 
In reassign.m, fix some old oversights that could (in some unlikely cases) cause bugs in the generated code. compiler/pragma_c_gen.m: Exploit the capabilities of code_info.m. compiler/prog_type.m: Add a utility predicate.	2007-07-31 01:56:41 +00:00
Quan Phan	d4818a3ca4	Modify the code generator so that it recognizes construct_in_region and Estimated hours taken: 35. Branch: main. Modify the code generator so that it recognizes construct_in_region and generates suitable code when RBMM is used. The main changes are in unify_gen.m. incr_hp is also changed to receive one more (maybe) argument for region. compiler/unify_gen.m: Make it aware of HowToConstruct. This is the starting point of the changes in the code generator so that it can generate code which constructs terms in regions. compiler/code_info.m: compiler/var_locn.m: Change in accordance with the introduction of how_to_construct in unify_gen.m. compiler/llds.m: Add one extra argument to incr_hp for the region to construct terms in. compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_to_x86_64.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/par_conj_gen.m: compiler/reassign.m: compiler/use_local_vars.m: Change to deal with the extra maybe region argument in incr_hp. compiler/llds_out.m: Modify so that when RBMM is used it generates suitable call to the region runtime for allocating terms in regions. The region runtime (in C code) will be posted in another email. compiler/hlds_data.m: Fix a typo. compiler/rbmm.interproc_region_lifetime.m: Change to comply with coding standard.	2007-07-09 13:28:36 +00:00
Zoltan Somogyi	ba93a52fe7	This diff changes a few types from being defined as equivalent to a pair Estimated hours taken: 10 Branches: main This diff changes a few types from being defined as equivalent to a pair to being discriminated union types with their own function symbol. This was motivated by an error message (one of many, but the one that broke the camel's back) about "-" being used in an ambiguous manner. It will reduce the number of such messages in the future, and will make compiler data structures easier to inspect in the debugger. The most important type changed by far is hlds_goal, whose function symbol is now "hlds_goal". Second and third in importance are llds.instruction (function symbol "llds_instr") and prog_item.m's item_and_context (function symbol "item_and_context"). There are some others as well. In several places, I rearranged predicates to factor the deconstruction of goals into hlds_goal_expr and hlds_goal_into out of each clause into a single point. In many places, I changed variable names that used "Goal" to refer to just hlds_goal_exprs to use "GoalExpr" instead. I also changed variable names that used "Item" to refer to item_and_contexts to use "ItemAndContext" instead. This should make reading such code less confusing. I renamed some function symbols and predicates to avoid ambiguities. I only made one algorithmic change (at least intentionally). In assertion.m, comparing two goals for equality now ignores goal_infos for all kinds of goals, whereas previously it ignored them for most kinds of goals, but for shorthand goals it was insisting on them being equal. This seemed to me to be a bug. Pete, can you confirm this?	2007-01-06 09:23:59 +00:00
Zoltan Somogyi	d66ed699a1	Add fields to structures representing the C code itself that says whether Estimated hours taken: 4 Branches: main Add fields to structures representing the C code itself that says whether or not the C code affects the liveness of lvals. This is intended as the basis for future improvements in the optimization of such code. Implement a new foreign_proc attribute that allows programmers to set the value of this field. Eliminate names referring to `pragma c_code' in the LLDS backend in favor of names referring to foreign_procs. compiler/llds.m: Make the changes described above. Consistently put the field containing C code last in the function symbols that contain them. compiler/prog_data.m: Make the changes described above. Rename some other function symbols to avoid ambiguity. compiler/prog_io_pragma.m: Parse the new foreign_proc attribute. doc/reference_manual.texi: Document the new attribute. compiler/pragma_c_gen.m: Rename the main predicates. compiler/opt_util.m: Change some predicates into functions, for more convenient invocation. compiler/livemap.m: Rename the predicates in this module to avoid ambiguity and the need for module qualification. compiler/*.m: Conform to the changes above.	2007-01-03 07:20:47 +00:00
Julien Fischer	b4c3bb1387	Clean up in unused module imports in the Mercury system detected Estimated hours taken: 3 Branches: main Clean up in unused module imports in the Mercury system detected by --warn-unused-imports. analysis/*.m: browser/*.m: deep_profiler/*.m: compiler/*.m: library/*.m: mdbcomp/*.m: profiler/*.m: slice/*.m: Remove unused module imports. Fix some minor departures from our coding standards. analysis/Mercury.options: browser/Mercury.options: deep_profiler/Mercury.options: compiler/Mercury.options: library/Mercury.options: mdbcomp/Mercury.options: profiler/Mercury.options: slice/Mercury.options: Set --no-warn-unused-imports for those modules that are used as packages or otherwise break --warn-unused-imports, e.g. because they contain predicates with both foreign and Mercury clauses and some of the imports only depend on the latter.	2006-12-01 15:04:40 +00:00
Zoltan Somogyi	ecf1ee3117	Add a mechanism for growing the stacks on demand by adding new segments Estimated hours taken: 20 Branches: main Add a mechanism for growing the stacks on demand by adding new segments to them. You can ask for the new mechanism via a new grade component, stseg (short for "stack segments"). The mechanism works by adding a test to each increment of a stack pointer (sp or maxfr). If the test indicates that we are about to run out of stack, we allocate a new stack segment, allocate a placeholder frame on the new segment, and then allocate the frame we wanted in the first place on top of the placeholder. We also override succip to make it point code that will (1) release the new segment when the newly created stack frame returns, and then (2) go to the place indicated by the original, overridden succip. For leaf procedures on the det stack, we optimize away the check of the stack pointer. We can do this because we reserve some space on each stack for the use of such stack frames. My intention is that doc/user_guide.texi and NEWS will be updated once we have used the feature ourselves for a while and it seems to be stable. runtime/mercury_grade.h: Add the new grade component. runtime/mercury_conf_param.h: Document the new grade component, and the option used to debug stack segments. runtime/mercury_context.[ch]: Add new fields to contexts to hold the list of previous segments of the det and nondet stacks. runtime/mercury_memory_zones.[ch]: Include a threshold in all zones, for use in stack segments. Set it when a zone is allocated. Restore the previous #ifdef'd out function MR_unget_zone, for use when freeing stack segments execution has fallen out of. runtime/mercury_debug.[ch]: When printing the offsets of pointers into the det and nondet stacks, print the number of the segment the pointer points into (unless it is the first, in which case we suppress this in the interest of brevity and simplicity). 
Make all the functions in this module take a FILE * as an input argument; don't print to stdout by default. runtime/mercury_stacks.[ch]: Modify the macros that allocate stack frames to invoke the code for adding new stack segments when we are about to run out of stack. Standardize on "nondet" over "nond" as the abbreviation referring to the nondet stack. Conform to the changes in mercury_debug.c. runtime/mercury_stack_trace.c: When traversing the stack, step over the placeholder stack frames at the bottoms of stack segments. Conform to the changes in mercury_debug.c. runtime/mercury_wrapper.[ch]: Make the default stack size small in grades that support stack segments. Standardize on "nondet" over "nond" as the abbreviation referring to the nondet stack. Conform to the changes in mercury_debug.c. runtime/mercury_memory.c: Standardize on "nondet" over "nond" as the abbreviation referring to the nondet stack. runtime/mercury_engine.[ch]: runtime/mercury_overflow.h: Standardize on "nondet" over "nond" as the abbreviation referring to the nondet stack. Convert these files to four-space indentation. runtime/mercury_minimal_model.c: trace/mercury_trace.c: trace/mercury_trace_util.c: Conform to the changes in mercury_debug.c. compiler/options.m: Add the new grade option for stack segments. compiler/compile_target_code.m: compiler/handle_options.m: Add the new grade component, and handle its exclusions with other grade components and optimizations. compiler/llds.m: Extend the incr_sp instruction to record whether the stack frame is for a leaf procedure. compiler/llds_out.m: Output the extended incr_sp instruction. compiler/proc_gen.m: Fill in the extra slot in incr_sp instructions. compiler/goal_util.m: Provide a predicate for testing whether a procedure body is a leaf. 
compiler/delay_slot.m: compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/frameopt.m: compiler/global_data.m: compiler/jumpopt.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/peephole.m: compiler/reassign.m: compiler/use_local_vars.m: Conform to the change in llds.m. scripts/canonicate_grade.sh-subr: scripts/init_grade_options.sh-subr: scripts/parse_grade_options.sh-subr: scripts/final_grade_options.sh-subr: scripts/mgnuc.in: Handle the new grade component. Convert parse_grade_options.sh-subr to four-space indentation. Mmake.workspace: Fix an old bug that prevented bootcheck from working in the new grade: when computing the gc grade, use the workspace's version of ml (which in this case understands the new grade components), rather than the installed ml (which does not). (This was a devil to track down, because neither make --debug nor strace on make revealed how the installed ml was being invoked, and there was no explicit invocation in the Makefile either; the error message appeared to come out of thin air just before the completion of the stage 2 library. It turned out the invocation happened implicitly, as a result of expanding a make variable.)	2006-11-01 02:31:19 +00:00
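The stack-segment mechanism described in this commit can be approximated by the C sketch below. It is a simplification under several assumptions: the sizes, the `Segment` and `Stack` types and `push_frame` are hypothetical, frames are never popped, and the overriding of succip that releases a segment when its first real frame returns is not modelled at all.

```c
#include <assert.h>
#include <stdlib.h>

enum {
    SEGMENT_WORDS     = 64,  /* words per segment (hypothetical) */
    LEAF_RESERVE      = 16,  /* space kept free for leaf frames */
    PLACEHOLDER_WORDS = 1    /* placeholder frame at a segment's bottom */
};

typedef struct Segment {
    struct Segment *prev;    /* previous segment, if any */
    size_t          used;    /* words already allocated */
    long            words[SEGMENT_WORDS];
} Segment;

typedef struct {
    Segment *current;
    int      segment_count;
} Stack;

static Segment *new_segment(Segment *prev)
{
    Segment *s = calloc(1, sizeof *s);
    if (s == NULL) {
        abort();
    }
    s->prev = prev;
    return s;
}

static void stack_init(Stack *st)
{
    st->current = new_segment(NULL);
    st->segment_count = 1;
}

/* Allocate a frame of `size` words. Non-leaf frames test the stack
 * pointer against a threshold and move to a new segment, starting with
 * a placeholder frame, when the threshold would be crossed. Leaf frames
 * skip the test: the reserve above the threshold is assumed to be large
 * enough for them (size <= LEAF_RESERVE). */
static long *push_frame(Stack *st, size_t size, int is_leaf)
{
    Segment *s = st->current;
    if (!is_leaf && s->used + size > SEGMENT_WORDS - LEAF_RESERVE) {
        s = new_segment(s);
        s->used = PLACEHOLDER_WORDS;
        st->current = s;
        st->segment_count++;
    }
    long *frame = &s->words[s->used];
    s->used += size;
    return frame;
}
```

Skipping the test for leaf frames is safe in this sketch only because the threshold sits below the true end of the segment, which is the role the commit message assigns to the space reserved on each stack.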
Zoltan Somogyi	e21193c283	Rename a bunch of predicates and function symbols to eliminate Estimated hours taken: 6 Branches: main browser/*.m: compiler/*.m: Rename a bunch of predicates and function symbols to eliminate ambiguities. The only real change is factoring out some common code in the mlds and llds code generators, replacing them with single definitions in switch_util.m.	2006-10-15 23:26:56 +00:00
Peter Wang	712027f307	This patch changes the parallel execution mechanism in the low level backend. Estimated hours taken: 100 Branches: main This patch changes the parallel execution mechanism in the low level backend. The main idea is that, even in programs with only moderate parallelism, we won't have enough processors to exploit it all. We should try to reduce the cost in the common case, i.e. when a parallel conjunction gets executed sequentially. This patch does two things along those lines: (1) Instead of unconditionally executing all parallel conjuncts (but the last) in separate Mercury contexts, we allow a context to continue execution of the next conjunct of a parallel conjunction if it has just finished executing the previous conjunct. This saves on allocating unnecessary contexts, which can be a big reduction in memory usage. We also try to execute conjuncts left-to-right so as to minimise the need to suspend contexts when there are dependencies between conjuncts. (2) Conjuncts that are executed in parallel still need separate contexts. We used to pass variable bindings to those conjuncts by flushing input variable values to stack slots and copying the procedure's stack frame to the new context. When the conjunct finished, we would copy new variable bindings back to stack slots in the original context. What happens now is that we don't do any copying back and forth. We introduce a new abstract machine register `parent_sp' which points to the location of the stack pointer at the time that a parallel conjunction began. In parallel conjuncts we refer to all stack slots via the `parent_sp' pointer, since we could be running on a different context altogether and `sp' would be pointing into a new detstack. Since parallel conjuncts now share the procedure's stack frame, we have to allocate stack slots such that all parallel conjuncts in a procedure that could be executing simultaneously have distinct sets of stack slots. 
We currently use the simplest possible strategy, i.e. don't allow variables in parallel conjuncts to reuse stack slots. Note: in effect parent_sp is a frame pointer which is only set for and used by the code of parallel conjuncts. We don't call it a frame pointer as it can be confused with "frame variables" which have to do with the nondet stack. compiler/code_info.m: Add functionality to keep track of how deep inside of nested parallel conjunctions the code generator is. Add functionality to acquire and release "persistent" temporary stack slots. Unlike normal temporary stack slots, these don't get implicitly released when the code generator's location-dependent state is reset. Conform to additions of `parent_sp' and parent stack variables. compiler/exprn_aux.m: Generalise the `substitute_lval_in_' predicates by `transform_lval_in_' predicates. Instead of performing a fixed substitution, these take a higher order predicate which performs some operation on each lval. Redefine the substitution predicates in terms of the transformation predicates. Conform to changes in `fork', `join_and_terminate' and `join_and_continue' instructions. Conform to additions of `parent_sp' and parent stack variables. Remove `substitute_rval_in_args' and `substitute_rval_in_arg' which were unused. compiler/live_vars.m: Introduce a new type `parallel_stackvars' which is threaded through `build_live_sets_in_goal'. We accumulate the sets of variables which are assigned stack slots in each parallel conjunct. At the end of processing a parallel conjunction, use this information to force variables which are assigned stack slots to have distinct slots. compiler/llds.m: Change the semantics of the `fork' instruction. It now takes a single argument: the label of the next conjunct after the current one. 
The instruction now "sparks" the next conjunct to be run, either in a different context (possibly in parallel, on another Mercury engine) or is queued to be executed in the current context after the current conjunct is finished. Change the semantics of the `join_and_continue' instruction. This instruction now serves to end all parallel conjuncts, not just the last one in a parallel conjunction. Remove the `join_and_terminate' instruction (no longer used). Add the new abstract machine register `parent_sp'. Introduce "parent stack slots", which are similar to normal stack slots but relative to the `parent_sp' register. compiler/par_conj_gen.m: Change the code generated for parallel conjunctions. That is: - use the new `fork' instruction at the beginning of a parallel conjunct; - use the `join_and_continue' instruction at the end of all parallel conjuncts; - keep track of how deep the code generator currently is in parallel conjunctions; - set and restore the `parent_sp' register when entering a non-nested parallel conjunction; - after generating the code of a parallel conjunct, replace all references to stack slots by parent stack slots; - remove code to copy back output variables when a parallel conjunct finishes. Update some comments. runtime/mercury_context.c: runtime/mercury_context.h: Add the type `MR_Spark'. Sparks are allocated on the heap and contain enough information to begin execution of a single parallel conjunct. Add globals `MR_spark_queue_head' and `MR_spark_queue_tail'. These are pointers to the start and end of a global queue of sparks. Idle engines can pick up work from this queue in the same way that they can pick up work from the global context queue (the "run queue"). Add new fields to the MR_Context structure. `MR_ctxt_parent_sp' is a saved copy of the `parent_sp' register for when the context is suspended. `MR_ctxt_spark_stack' is a stack of sparks that we decided not to put on the global spark queue. 
Update `MR_load_context' and `MR_save_context' to save and restore `MR_ctxt_parent_sp'. Add the counters `MR_num_idle_engines' and `MR_num_outstanding_contexts_and_sparks'. These are used to decide, when a `fork' instruction is reached, whether a spark should be put on the global spark queue (with potential for parallelism but also more overhead) or on the calling context's spark stack (no parallelism and less overhead). Rename `MR_init_context' to `MR_init_context_maybe_generator'. When initialising contexts, don't reset redzones of already allocated stacks. It seems to be unnecessary (and the reset implementation is buggy anyway, though it's fine on Linux). Rename `MR_schedule' to `MR_schedule_context'. Add new functions `MR_schedule_spark_globally' and `MR_schedule_spark_locally'. In `MR_do_runnext', add code for idle engines to get work from the global spark queue. Resuming contexts are prioritised over sparks. Rename `MR_fork_new_context' to `MR_fork_new_child'. Change the definitions of `MR_fork_new_child' and `MR_join_and_continue' as per the new behaviour of the `fork' and `join_and_continue' instructions. Delete `MR_join_and_terminate'. Add a new field `MR_st_orig_context' to the MR_SyncTerm structure to record which context originated the parallel conjunction instance represented by a MR_SyncTerm instance, and update `MR_init_sync_term'. This is needed by the new behaviour of `MR_join_and_continue'. Update some comments. runtime/mercury_engine.h: runtime/mercury_regs.c: runtime/mercury_regs.h: runtime/mercury_stacks.h: Add the abstract machine register `parent_sp' and code to copy it to and from the fake_reg array. Add a macro `MR_parent_sv' to access stack slots via `parent_sp'. Add `MR_eng_parent_sp' to the MercuryEngine structure. runtime/mercury_wrapper.c: runtime/mercury_wrapper.h: Add Mercury runtime option `--max-contexts-per-thread' which is saved in the global variable `MR_max_contexts_per_thread'. 
The number `MR_max_outstanding_contexts' is derived from this. It sets a soft limit on the number of sparks we put in the global spark queue, relative to the number of threads we are running. We don't want to put too many sparks on the global queue if there are plenty of ready contexts or sparks already on the global queues, as they are likely to result in new contexts being allocated. When initially creating worker engines, wait until all the worker engines have acknowledged that they are idle before continuing. This is mainly so programs (especially benchmarks and test cases) with only a few fork instructions near the beginning of the program don't execute the forks before any worker engines are ready, resulting in no parallelism. runtime/mercury_engine.c: runtime/mercury_thread.c: Don't allocate a context at the time a Mercury engine is created. An engine only needs a new context when it is about to pick up a spark. configure.in: compiler/options.m: scripts/Mercury.config.in: Update to reflect the extra field in MR_SyncTerm. Add the option `--sync-term-size' and actually make use of the result of the sync term size calculated during configuration. compiler/code_util.m: compiler/continuation_info.m: compiler/dupelim.m: compiler/dupproc.m: compiler/global_data.m: compiler/hlds_llds.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_out.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/reassign.m: compiler/stack_layout.m: compiler/use_local_vars.m: compiler/var_locn.m: Conform to changes in `fork', `join_and_terminate' and `join_and_continue' instructions. Conform to additions of `parent_sp' and parent stack variables. XXX not sure about the changes in stack_layout.m. library/par_builtin.m: Conform to changes in the runtime system.	2006-09-26 03:53:23 +00:00
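The spark placement decision described in the commit above can be sketched in C as follows. This is only an illustration under stated assumptions: the struct, the enum, both function names, the product used to derive the limit, and the exact comparison are hypothetical, and the real logic in runtime/mercury_context.c may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the runtime counters named in the commit
 * message; these are NOT the real runtime definitions. */
typedef struct {
    size_t num_idle_engines;
    size_t num_outstanding_contexts_and_sparks;
    size_t max_outstanding_contexts;    /* soft limit */
} SchedState;

typedef enum { SPARK_LOCAL, SPARK_GLOBAL } SparkPlacement;

/* Assumption: the soft limit scales with the number of threads.
 * The commit message only says it is "derived from" the option. */
size_t derive_max_outstanding(size_t max_contexts_per_thread,
    size_t num_threads)
{
    assert(num_threads > 0);
    return max_contexts_per_thread * num_threads;
}

/* Put a spark on the global queue only if some engine is idle and the
 * soft limit has not been reached; otherwise keep it on the calling
 * context's own spark stack (no parallelism, less overhead). */
SparkPlacement choose_spark_placement(const SchedState *s)
{
    assert(s != NULL);
    if (s->num_idle_engines > 0 &&
        s->num_outstanding_contexts_and_sparks < s->max_outstanding_contexts)
    {
        return SPARK_GLOBAL;
    }
    return SPARK_LOCAL;
}
```

With two contexts per thread and four threads, the assumed limit is 8, so a ninth outstanding item would stay on the local spark stack even when an engine is idle.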
Zoltan Somogyi	4924dfb1c9	One of Hans Boehm's papers says that heap cells allocated by GC_MALLOC_ATOMIC Estimated hours taken: 5 Branches: main One of Hans Boehm's papers says that heap cells allocated by GC_MALLOC_ATOMIC are grouped together into pages, and these pages aren't scanned during the sweep phase of the garbage collector. I therefore modified the compiler to use GC_MALLOC_ATOMIC instead of GC_MALLOC wherever possible, i.e. when the cell being allocated is guaranteed not to have any pointer to GCable memory inside it. My first benchmarking run showed a speedup of 4.5% in asm_fast.gc: EXTRA_MCFLAGS = --use-atomic-cells mercury_compile.01 average of 6 with ignore=1 18.30 EXTRA_MCFLAGS = --no-use-atomic-cells mercury_compile.02 average of 6 with ignore=1 19.17 However, later benchmarks, after the upgrade to version 7.0 of boehm_gc, show a less favourable and more mixed picture, with e.g. a 4% speedup in hlc.gc at -O3, a 3% slowdown in asm_fast.gc at -O4, and little effect otherwise: EXTRA_MCFLAGS = -O1 --use-atomic-cells GRADE = asm_fast.gc mercury_compile.01 average of 6 with ignore=1 23.30 EXTRA_MCFLAGS = -O1 --no-use-atomic-cells GRADE = asm_fast.gc mercury_compile.02 average of 6 with ignore=1 23.28 EXTRA_MCFLAGS = -O2 --use-atomic-cells GRADE = asm_fast.gc mercury_compile.03 average of 6 with ignore=1 18.51 EXTRA_MCFLAGS = -O2 --no-use-atomic-cells GRADE = asm_fast.gc mercury_compile.04 average of 6 with ignore=1 18.66 EXTRA_MCFLAGS = -O3 --use-atomic-cells GRADE = asm_fast.gc mercury_compile.05 average of 6 with ignore=1 18.44 EXTRA_MCFLAGS = -O3 --no-use-atomic-cells GRADE = asm_fast.gc mercury_compile.06 average of 6 with ignore=1 18.48 EXTRA_MCFLAGS = -O4 --use-atomic-cells GRADE = asm_fast.gc mercury_compile.07 average of 6 with ignore=1 18.28 EXTRA_MCFLAGS = -O4 --no-use-atomic-cells GRADE = asm_fast.gc mercury_compile.08 average of 6 with ignore=1 17.70 EXTRA_MCFLAGS = -O1 --use-atomic-cells GRADE = hlc.gc mercury_compile.09 average of 6 with 
ignore=1 24.78 EXTRA_MCFLAGS = -O1 --no-use-atomic-cells GRADE = hlc.gc mercury_compile.10 average of 6 with ignore=1 24.69 EXTRA_MCFLAGS = -O2 --use-atomic-cells GRADE = hlc.gc mercury_compile.11 average of 6 with ignore=1 19.36 EXTRA_MCFLAGS = -O2 --no-use-atomic-cells GRADE = hlc.gc mercury_compile.12 average of 6 with ignore=1 19.26 EXTRA_MCFLAGS = -O3 --use-atomic-cells GRADE = hlc.gc mercury_compile.13 average of 6 with ignore=1 18.64 EXTRA_MCFLAGS = -O3 --no-use-atomic-cells GRADE = hlc.gc mercury_compile.14 average of 6 with ignore=1 19.38 EXTRA_MCFLAGS = -O4 --use-atomic-cells GRADE = hlc.gc mercury_compile.15 average of 6 with ignore=1 19.39 EXTRA_MCFLAGS = -O4 --no-use-atomic-cells GRADE = hlc.gc mercury_compile.16 average of 6 with ignore=1 19.41 runtime/mercury_heap.h: Define atomic equivalents of the few heap allocation macros that didn't already have one. These macros are used by the LLDS backend. runtime/mercury.h: Define an atomic equivalent of the MR_new_object macro. These macros are used by the MLDS backend. Use MR_new_object_atomic instead of MR_new_object to box floats. compiler/hlds_data.m: compiler/llds.m: compiler/mlds.m: Modify the representations of the heap allocation constructs to include a flag that says whether we should use the atomic variants of the heap allocation macros. compiler/llds_out.m: compiler/mlds_to_c.m: Respect this extra flag when emitting C code. In mlds_to_c.m, also add some white space that makes the code easier for humans to read. compiler/type_util.m: Add a mechanism for finding out whether we can put a value of a given type into an atomic cell. Put the definitions of functions and predicates in this module in the same order as their declarations. Turn some predicates into functions. Change the argument order of some predicates to conform to our usual conventions. 
compiler/unify_gen.m: compiler/ml_unify_gen.m: Use the new mechanism in type_util.m to generate code that creates atomic heap cells if this is possible and is requested. compiler/code_info.m: compiler/var_locn.m: Act on the information provided by unify_gen.m. compiler/options.m: doc/user_guide.texi: Add an option to control whether the compiler should try to use atomic cells. compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/higher_order.m: compiler/jumpopt.m: compiler/livemap.m: compiler/middle_rec.m: compiler/ml_code_util.m: compiler/ml_elim_nested.m: compiler/ml_optimize.m: compiler/ml_util.m: compiler/mlds_to_gcc.m: compiler/mlds_to_il.m: compiler/mlds_to_java.m: compiler/modecheck_unify.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/par_conj_gen.m: compiler/polymorphism.m: compiler/reassign.m: compiler/size_prof.m: compiler/structure_sharing.domain.m: compiler/use_local_vars.m: Minor diffs to conform to the changes above. compiler/structure_reuse.direct.choose_reuse.m: Add an XXX comment about the interaction of the new capability with structure reuse.	2006-08-20 05:01:48 +00:00
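The rule stated in the commit above (use the atomic allocator only when the cell is guaranteed to hold no pointer to GCable memory, and only if the option requests it) might look like this as a C sketch. The `FieldKind` classification, the enum and both function names are hypothetical; the real test lives in compiler/type_util.m and is written in Mercury.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of the fields of a heap cell. */
typedef enum { FIELD_INT, FIELD_FLOAT, FIELD_ENUM, FIELD_POINTER } FieldKind;

/* Which allocator family to use: cells from the scanned allocator
 * (GC_MALLOC) are examined by the collector for pointers; cells from
 * the atomic allocator (GC_MALLOC_ATOMIC) are not. */
typedef enum { ALLOC_SCANNED, ALLOC_ATOMIC } AllocKind;

/* A cell may be atomic only if no field can hold a pointer. */
bool cell_may_be_atomic(const FieldKind *fields, size_t num_fields)
{
    assert(fields != NULL || num_fields == 0);
    for (size_t i = 0; i < num_fields; i++) {
        if (fields[i] == FIELD_POINTER) {
            return false;
        }
    }
    return true;
}

/* use_atomic_cells models the --use-atomic-cells option. */
AllocKind allocator_for_cell(const FieldKind *fields, size_t num_fields,
    bool use_atomic_cells)
{
    if (use_atomic_cells && cell_may_be_atomic(fields, num_fields)) {
        return ALLOC_ATOMIC;
    }
    return ALLOC_SCANNED;
}
```

A cell holding only a boxed float qualifies for the atomic allocator, while any cell with a pointer field always uses the scanned one, whatever the option says.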
Julien Fischer	aeeedd2c13	Standardize formatting of comments at the beginning of modules. compiler/*.m: Standardize formatting of comments at the beginning of modules.	2006-07-31 08:32:11 +00:00
Zoltan Somogyi	469f1dc09b	This diff contains no algorithmic changes. Estimated hours taken: 1.5 Branches: main This diff contains no algorithmic changes. compiler/llds.m: compiler/mlds.m: Rename some function symbols and field names to avoid ambiguities with respect to language keywords. compiler/*.m: Conform to the changes in llds.m and mlds.m.	2006-07-28 05:08:15 +00:00
Zoltan Somogyi	9d23d8e2e7	Implement the trace goal construct we discussed, for now for the LLDS backends Estimated hours taken: 70 Branches: main Implement the trace goal construct we discussed, for now for the LLDS backends only. Since the syntax of trace goals is non-trivial, useful feedback on syntax errors inside trace goal attributes is essential. With the previous setup, this wasn't possible, since the code that turned terms into parse tree goals turned all terms into goals; it couldn't recognize any errors, sweeping them under the rug as calls. This diff changes that. Now, if this code recognizes a keyword that indicates a particular construct, it insists on the rest of the code following the syntax required for that construct, and returns error messages if it doesn't. We handle the trace goal attributes that specify state variables to be threaded through the trace goal (either the I/O state or a mutable variable) in add_clause.m, at the point at which we transform the list of items to the HLDS. We handle the compile-time condition on trace goals in the invocation of simplify at the end of semantic analysis, by eliminating the goal if the compile-time condition isn't met. We handle run-time conditions on trace goals partially in the same invocation of simplify: we transform trace goals with runtime conditions into an if-then-else with the trace goal as the then part and `true' as the else part, the condition being a foreign_proc that is handled specially by the code generator, that special handling being to replace the actual code of the foreign_proc (which is a dummy) with the evaluation of the runtime condition. Since these changes require significant changes to some of our key data structures, I took the liberty of doing some renaming of function symbols at the same time to avoid ambiguities with respect to language keywords. library/ops.m: Add "trace" as an operator. 
compiler/prog_data.m: Define data types to represent the various attributes of trace goals. Rename some function symbols to avoid ambiguities. compiler/prog_item.m: Extend the parse tree representation of goals with a trace goal. compiler/mercury_to_mercury.m: Output the new kind of goal and its components. compiler/hlds_goal.m: Extend the HLDS representation of scopes with a scope_reason representing trace goals. Add a mechanism (an extra argument in foreign_procs) to allow the representation of goals that evaluate runtime trace conditions. Since this requires modifying all code that traverses the HLDS, do some renames that were long overdue: rename not as negation, rename call as plain_call, and rename foreign_proc as call_foreign_proc. These renames all avoid using language keywords as function symbols. Change the way we record goals' purities. Instead of optional features to indicate impure or semipure, which is error-prone, use a plain field in the goal_info, accessed in the usual way. Add a way to represent that a goal contains a trace goal, and should therefore be treated as if it were impure when considering whether to optimize it away. Reformat some comments describing function symbols. compiler/hlds_out.m: Output the new construct in the HLDS. compiler/prog_io_util.m: Generalize the maybe[123] types to allow the representation of more than one error message. Add functions to extract the error messages. Add a maybe4 type. Rename the function symbols of these types to avoid massive ambiguity. Change the order of some predicates to bring related predicates next to each other. compiler/prog_io.m: compiler/prog_io_dcg.m: compiler/prog_io_goal.m: compiler/prog_io_pragma.m: Rework these modules almost completely to find and accumulate syntax errors as terms are being parsed. In some cases, this allowed us to replace "XXX this is a hack" markers with meaningful error-reporting code. In prog_io_goal.m, add code for parsing trace goals. 
In a bunch of places, update obsolete coding practices, such as using nested chains of closures instead of simple sequential code, and using A0 and A to refer to values of different types (terms and goals respectively). Use more meaningful variable names. Break up some too-large predicates. compiler/superhomogeneous.m: Find and accumulate syntax errors as terms are being parsed. compiler/add_clause.m: Add code to transform trace goals from the parse tree to the HLDS. This is where the IO state and mutable variable attributes of trace goals are handled. Eliminate the practice of using the naming scheme Body0 and Body to refer to values of different types (prog_item.goal and hlds_goal respectively). Use error_util for some error messages. library/private_builtin.m: Add the predicates referred to by the transformation in add_clause.m. compiler/goal_util.m: Rename a predicate to avoid ambiguity. compiler/typecheck.m: Do not print error messages about missing clauses if some errors have been detected previously. compiler/purity.m: Instead of just computing purity, compute (and record) also whether a goal contains a trace goal. However, treat trace goals as pure. compiler/mode_info.m: Add trace goals as a reason for locking variables. Rename some function symbols to avoid ambiguity. compiler/modes.m: When analyzing trace goal scopes, lock the scope's nonlocal variables to prevent them from being further instantiated. compiler/det_analysis.m: Insist on the code in trace goal scopes being det or cc_multi. compiler/det_report.m: Generate the error message if the code in a trace goal scope isn't det or cc_multi. compiler/simplify.m: At the end of the front end, eliminate trace goal scopes if their compile-time condition is false. Transform trace goals with runtime conditions as described at the top. Treat goals that contain trace goals as if they were impure when considering whether to optimize them away. 
compiler/mercury_compile.m: Tell simplify when it is being invoked at the end of the front end. Rename a predicate to avoid ambiguity. compiler/trace_params.m: Provide the predicates simplify.m needs to be able to evaluate the trace goal conditions regarding trace levels. compiler/trace.m: compiler/trace_gen.m: Rename the trace module as trace_gen, since "trace" is now an operator. Rename some predicates exported by the module, now that it is no longer possible to preface calls with "trace." as a module qualifier. compiler/notes/compiler_design.html: Document this name change. compiler/options.m: Rename the trace option as trace_level internally, since "trace" is now an operator. The user-visible name remains the same. Add the new --trace-flag option. Delete an obsolete option. compiler/handle_options.m: Rename the function symbols of the grade_component type, since "trace" is now an operator. compiler/llds.m: Extend the LLDS with a mechanism to refer to C global variables. For now, these are used to refer to C globals that will be created by mkinit to represent the initial values of the environment variables referred to by trace goals. compiler/commit_gen.m: Check that no trace goal with a runtime condition survives to code generation; they should have been transformed by simplify.m. compiler/code_gen.m: Tell commit_gen.m what kind of scope it is generating code for. compiler/pragma_c_gen.m: Generate code for runtime conditions when handling the foreign_procs created by simplify.m. compiler/code_info.m: Allow pragma_c_gen.m to record what environment variables it has generated references to. compiler/proc_gen.m: Record the set of environment variables a procedure refers to in the LLDS procedure header, for efficient access by llds_out.m. compiler/llds_out.m: Handle the new LLDS construct, and tell mkinit which environment variables need C globals created for them. compiler/pd_util.m: Rename some predicates to avoid ambiguity. 
compiler/*.m: Conform to the changes above, mainly the renames of function symbols and predicates, the changed signatures of some predicates, and the new handling of purity. util/mkinit.c: Generate the definitions and the initializations of any C globals representing the initial status (set or not set) of environment variables needed by trace goals. library/assoc_list.m: Add some predicates that are useful in prog_io.m. library/term_io.m: Minor cleanup. tests/hard_coded/trace_goal_{1,2}.{m,exp}: New test cases to test the new construct, identical except for whether the trace goal is enabled at compile time. tests/hard_coded/trace_goal_env_{1,2}.{m,exp}: New test cases to test the new construct, identical except for whether the trace goal is enabled at run time. tests/hard_coded/Mercury.options: tests/hard_coded/Mmakefile: Enable the new test cases. tests/invalid/*.err_exp: Update the expected output for the new versions of the error messages now being generated.	2006-07-27 05:03:54 +00:00
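The runtime-condition scheme described in the commit above (a C global records at startup whether an environment variable is set, and the trace goal becomes the then-part of an if-then-else whose else-part is `true') can be sketched in C. The variable name `TRACE_DEMO`, the global `trace_env_set` and both functions are hypothetical; the code actually emitted by mkinit and the code generator will look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical global of the kind mkinit would define: it records
 * whether the environment variable was set when the program started. */
bool trace_env_set = false;

/* Runs once at startup; only the set/not-set status is recorded. */
void init_trace_env_globals(void)
{
    trace_env_set = (getenv("TRACE_DEMO") != NULL);
}

/* A procedure body containing a trace goal with a runtime condition.
 * The condition is just a read of the global; the else branch is the
 * equivalent of `true', i.e. it does nothing. */
int run_with_trace(int x, int *trace_count)
{
    assert(trace_count != NULL);
    if (trace_env_set) {
        (*trace_count)++;       /* stands in for the trace goal's effect */
    }
    return x * 2;               /* the pure result is the same either way */
}
```

The result of `run_with_trace` does not depend on the flag; only the side effect of the trace goal does, which is why such a goal can be treated as pure by its callers.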
Peter Wang	d3ca8ee50d	Fix a compiler abort when `--optimise-dups' detects some duplicate code Estimated hours taken: 1.5 Branches: main, release Fix a compiler abort when `--optimise-dups' detects some duplicate code sequences in parallel conjunctions. compiler/dupelim.m: Allow `most_specific_instr' to generalise two `fork', `init_sync_term', `join_and_continue' or `join_and_terminate' instructions if they take exactly the same arguments. tests/valid/Mercury.options: tests/valid/Mmakefile: tests/valid/par_dupelim.m: Add a test case.	2006-05-04 08:34:56 +00:00
Zoltan Somogyi	d5d5986472	Implement lookup switches in which a switch arm may contain more than one Estimated hours taken: 40 Branches: main Implement lookup switches in which a switch arm may contain more than one solution, such as this code here: p(d, "four", f1, 4.4). p(e, "five", f2, 5.5). p(e, "five2", f3(5), 55.5). p(f, "six", f4("hex"), 6.6). p(g, "seven", f5(77.7), 7.7). p(g, "seven2", f1, 777.7). p(g, "seven3", f2, 7777.7). Such code occurs frequently in benchmark programs used to evaluate the performance of tabled logic programming systems. Change frameopt.m, which previously worked only on det and semidet code, to also work for nondet code. For predicates such as the one above, frameopt can now arrange for the predicate's nondet stack frame to be created only when a switch arm that has more than one solution is selected. compiler/lookup_switch.m: Extend the existing code for recognizing and implementing lookup switches to recognize and implement them even if they are model_non. compiler/lookup_util.m: New module containing utility predicates useful for implementing both lookup switches, and in the future, lookup disjunctions (i.e. disjunctions that correspond to a nondet arm of a lookup switch). compiler/ll_backend.m: Include the new module. compiler/notes/compiler_design.html: Mention the new module. compiler/global_data.m: Move the job of filling in dummy slots to our caller, in this case lookup_switch.m. compiler/frameopt.m: Generalize the existing code for delaying stack frame creation, which worked only on predicates that live on the det stack, to work also on predicates that live on the nondet stack. Without this, predicates whose bodies are model_non lookup switches would create a nondet stack frame before the switch is ever entered, which is wasteful if the selected switch arm has at most one solution. 
Since the structure of model_non predicates is more complex (you can cause a branch to a label by storing its address in a redoip slot, you can succeed from the frame without removing the frame), this required considerable extra work. To make the new code debuggable, record, for each basic block that needs a stack frame, why it needs that stack frame. compiler/opt_util.m: Be more conservative about what refers to the stack. Export some previously internal functionality for frameopt. Turn some predicates into functions, and rename them to better reflect their purpose. compiler/opt_debug.m: Print much more information about pragma_c and call LLDS instructions. compiler/prog_data.m: Add an extra attribute to foreign_procs that says that the code of the foreign_proc assumes the existence of a stack frame. This is needed to avoid frameopt optimizing the stack frame away. compiler/add_pragma.m: When processing fact tables, we create foreign_procs that assume the existence of the stack frame, so set the new attribute. compiler/pragma_c_gen.m: When processing foreign_procs, transmit the information in the attribute to the generated LLDS code. compiler/llds.m: Rename the function symbols referring to the fixed slots in nondet stack frames to make them clearer and to avoid overloading function symbols such as curfr and succip. Rename the function symbols of the call_model type to avoid overloading the function symbols of the code_model type. Add a new field to the c_procedure type giving the code_model of the procedure, and give names to all the fields. Describe the stack slots used by lookup switches to the debugger and native gc. compiler/options.m: doc/user_guide.texi: Add a new option, --debug-opt-pred-name, that does what the existing --debug-opt-pred-id option does, but taking a user-friendly predicate name rather than a pred_id as its argument. 
compiler/handle_options.m: Process --debug-opt-pred-name, and make --frameopt-comments imply --auto-comments, since it is not useful without it. Reformat some existing comments that were written in the days of 8-space indentation. compiler/optimize.m: Implement the new option. Use the new field of the c_procedure type to try only the version of frameopt appropriate for the code model of the current procedure. Do a peephole pass after frameopt, since frameopt can generate code sequences that peephole can optimize. Make the mechanism for recording the process of optimizing procedure bodies more easily usable by including the name of the optimization that created a given version of the code in the name of the file that contains that version of the code, and ensuring that all numbers are two characters long, so that "vi procname.opt" looks at the relevant files in the proper chronological sequence, instead of having version 11 appear before version 2. compiler/peephole.m: Add a new optimization pattern: a "mkframe, goto fail" pair (which can be generated by frameopt) should be replaced by a simple "goto redo". compiler/code_gen.m: Factor out some common code. compiler/llds_out.m: Ensure that C comments nested inside comment(_) LLDS instructions aren't emitted as nested C comments, since C compilers cannot handle these. compiler/code_info.m: compiler/code_util.m: compiler/continuation_info.m: compiler/dupelim.m: compiler/exprn_aux.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_out.m: compiler/mercury_compile.m: compiler/middle_rec.m: compiler/ml_code_gen.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/peephole.m: compiler/stack_layout.m: compiler/transform_llds.m: compiler/var_locn.m: Conform to the change to prog_data.m, opt_util.m and/or llds.m. compiler/handle_options.m: Don't execute the code in stdlabel.m if doing so would cause a compiler abort. tests/hard_coded/dense_lookup_switch_non.{m,exp}: New test case to exercise the new algorithm. 
tests/hard_coded/Mmakefile: Enable the new test case. tests/hard_coded/cycles.m: Make this test case conform to our coding convention.	2006-04-26 03:06:29 +00:00
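A lookup switch whose arms may have several solutions, as in the `p/4' example of the commit above, can be pictured as two C tables: one row per solution, and one entry per switch arm giving the first row and the number of rows. This is a sketch under assumptions: the type and function names are hypothetical, the third argument of `p' is omitted for brevity, and the tables the compiler really generates are laid out differently.

```c
#include <assert.h>
#include <stddef.h>

/* The switched-on argument of p/4: d, e, f or g. */
typedef enum { KEY_D, KEY_E, KEY_F, KEY_G, NUM_KEYS } Key;

/* One row per solution (second and fourth arguments of p only). */
typedef struct { const char *name; double value; } Row;

static const Row rows[] = {
    { "four", 4.4 },
    { "five", 5.5 }, { "five2", 55.5 },
    { "six", 6.6 },
    { "seven", 7.7 }, { "seven2", 777.7 }, { "seven3", 7777.7 },
};

/* One entry per switch arm: index of its first row and its row count. */
typedef struct { size_t first; size_t count; } Arm;

static const Arm arms[NUM_KEYS] = {
    [KEY_D] = { 0, 1 },
    [KEY_E] = { 1, 2 },
    [KEY_F] = { 3, 1 },
    [KEY_G] = { 4, 3 },
};

/* Returns the number of solutions and sets *first_row to the first. */
size_t solutions_for(Key key, const Row **first_row)
{
    assert((int) key >= 0 && key < NUM_KEYS);
    assert(first_row != NULL);
    *first_row = &rows[arms[key].first];
    return arms[key].count;
}

/* Only arms with more than one solution need a nondet stack frame
 * to remember which row to return on backtracking. */
int arm_needs_nondet_frame(Key key)
{
    assert((int) key >= 0 && key < NUM_KEYS);
    return arms[key].count > 1;
}
```

For `d' and `f' the single row can be returned without creating a frame at all, which is the saving that delaying frame creation is meant to achieve.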
Zoltan Somogyi	e8832be3a5	A new module that contains code to standardize labels in the LLDS. Estimated hours taken: 1 Branches: main compiler/stdlabel.m: A new module that contains code to standardize labels in the LLDS. compiler/ll_backend.m: Include the new module in this package. compiler/options.m: Add an option that governs whether stdlabel.m is invoked or not. compiler/optimize.m: If the option is set, invoke stdlabel.m. compiler/opt_util.m: Add an option to opt_util.replace_labels_instruction_list to allow it to replace labels in label instructions themselves. compiler/dupelim.m: Conform to the changes in opt_util.m. compiler/notes/compiler_design.html: Document the new module.	2006-04-10 04:28:24 +00:00
Julien Fischer	459847a064	Move the univ, maybe, pair and unit types from std_util into their own Estimated hours taken: 18 Branches: main Move the univ, maybe, pair and unit types from std_util into their own modules. std_util still contains the general purpose higher-order programming constructs. library/std_util.m: Move univ, maybe, pair and unit (plus any other related types and procedures) into their own modules. library/maybe.m: New module. This contains the maybe and maybe_error types and the associated procedures. library/pair.m: New module. This contains the pair type and associated procedures. library/unit.m: New module. This contains the types unit/0 and unit/1. library/univ.m: New module. This contains the univ type and associated procedures. library/library.m: Add the new modules. library/private_builtin.m: Update the declaration of the type_ctor_info struct for univ. runtime/mercury.h: Update the declaration for the type_ctor_info struct for univ. runtime/mercury_mcpp.h: runtime/mercury_hlc_types.h: Update the definition of MR_Univ. runtime/mercury_init.h: Fix a comment: ML_type_name is now exported from type_desc.m. compiler/mlds_to_il.m: Update the name of the module that defines univs (which are handled specially by the il code generator.) library/*.m: compiler/*.m: browser/*.m: mdbcomp/*.m: profiler/*.m: deep_profiler/*.m: Conform to the above changes. Import the new modules where they are needed; don't import std_util where it isn't needed. Fix formatting in lots of modules. Delete duplicate module imports. tests/*: Update the test suite to conform to the above changes.	2006-03-29 08:09:58 +00:00
Zoltan Somogyi	be5b71861b	Convert almost all the compiler modules to use . instead of __ as Estimated hours taken: 6 Branches: main compiler/*.m: Convert almost all the compiler modules to use . instead of __ as the module qualifier. In some cases, change the names of predicates and types to make them meaningful without the module qualifier. In particular, most of the types that used to be referred to with an "mlds__" prefix have been changed to have a "mlds_" prefix instead of changing the prefix to "mlds.". There are no algorithmic changes.	2006-03-17 01:40:46 +00:00
Julien Fischer	45fdb6c451	Use expect/3 in place of require/2 throughout most of the Estimated hours taken: 4 Branches: main compiler/*.m: Use expect/3 in place of require/2 throughout most of the compiler. Use unexpected/2 (or sorry/2) in place of error/1 in more places. Fix more dodgy assertion error messages. s/map(prog_var, mer_type)/vartypes/ where the latter is meant.	2005-11-28 04:11:59 +00:00
Julien Fischer	5f589e98fb	Various cleanups for the modules in the compiler directory. Estimated hours taken: 4 Branches: main Various cleanups for the modules in the compiler directory. There are no changes to algorithms except the replacement of some if-then-elses that would naturally be switches with switches and the replacement of most of the calls to error/1. compiler/*.m: Convert calls to error/1 to calls to unexpected/2 or sorry/2 as appropriate throughout most of the compiler. Fix inaccurate assertion failure messages, e.g. identifying the assertion failure as taking place in the wrong module. Add :- end_module declarations. Fix formatting problems and bring the positioning of comments into line with our current coding standards. Fix some overlong lines. Convert some more modules to 4-space indentation. Fix some spots where previous conversions to 4-space indentation have stuffed the formatting of the code up. Fix a bunch of typos in comments. Use state variables in more places; use library predicates from the sv modules where appropriate. Delete unnecessary and duplicate module imports. Misc. other small cleanups.	2005-11-17 15:57:34 +00:00
Zoltan Somogyi	f9fe8dcf61	Improve the error messages generated for determinism errors involving committed Estimated hours taken: 8 Branches: main Improve the error messages generated for determinism errors involving committed choice contexts. Previously, we printed a message to the effect that e.g. a cc pred is called in a context that requires all solutions, but we didn't say why the context requires all solutions. We now keep track of all the goals to the right that could fail, since it is these goals that may reject the first solution of a committed choice goal. The motivation for this diff was the fact that I found that locating the failing goal can be very difficult if the conjunction to the right is a couple of hundred lines long. This would have been a nontrivial problem, since (a) unifications involving values of user-defined types are committed choice goals, and (b) we can expect uses of user-defined types to increase. compiler/det_analysis.m: Keep track of goals to the right of the current goal that could fail, and include them in the error representation if required. compiler/det_report.m: Include the list of failing goals to the right in the representations of determinism errors involving committed choice goals. Convert the last part of this module that wasn't using error_util to use error_util. Make most parts of this module just construct error message specifications; print those specifications (using error_util) in only a few places. compiler/hlds_out.m: Add a function for use by the new code in det_report.m. compiler/error_util.m: Add a function for use by the new code in det_report.m. compiler/error_util.m: compiler/compiler_util.m: Error_util is still changing reasonably often, and yet it is included in lots of modules, most of which need only a few simple non-parse-tree-related predicates from it (e.g. unexpected). Move those predicates to a new module, compiler_util.m. 
This also eliminates some undesirable dependencies from libs to parse_tree. compiler/libs.m: Include compiler_util.m. compiler/notes/compiler_design.html: Document compiler_util.m, and fix the documentation of some other modules. compiler/*.m: Import compiler_util instead of or in addition to error_util. To make this easier, consistently use . instead of __ for module qualifying module names. tests/invalid/det_errors_cc.{m,err_exp}: Add this new test case to test the error messages for cc contexts. tests/invalid/det_errors_deet.{m,err_exp}: Add this new test case to test the error messages for unifications inside function symbols. tests/invalid/Mmakefile: Add the new test cases. tests/invalid/det_errors.err_exp: tests/invalid/magicbox.err_exp: Change the expected output to conform to the change in det_report.m, which is now more consistent.	2005-10-28 02:11:03 +00:00
Julien Fischer	876263f18a	Fix an incomplete switch in dupelim.m. Estimated hours taken: 0.1 Branches: main compiler/dupelim.m: Fix an incomplete switch in dupelim.m.	2005-09-19 08:53:48 +00:00
Zoltan Somogyi	878b0d1cbc	Factor out some common code using the new capability of switch Estimated hours taken: 0.2 Branches: main compiler/dupelim.m: Factor out some common code using the new capability of switch detection. configure.in: Require the installed compiler to have this capability. Unrelated: also add a test for posix_memalign.	2005-09-19 08:02:44 +00:00
Zoltan Somogyi	753d9755ae	When returning from det and semidet predicates, load the return address into a Estimated hours taken: 3 Branches: main When returning from det and semidet predicates, load the return address into a local C variable instead of the succip abstract machine "register" before popping the stack frame and returning. This gives the C compiler more freedom to reorder instructions. This diff gets a 1.4% speed increase on the compiler. runtime/mercury_stacks.h: Provide a new macro, MR_decr_sp_and_return, to do the combined job that its name describes. compiler/llds.m: Add a new LLDS instruction that corresponds to the new macro. compiler/llds_out.m: Output the new LLDS instruction. compiler/peephole.m: Add a predicate that looks for and exploits opportunities for using the new instruction. compiler/optimize.m: Invoke the new peephole predicate as the next-to-last optimization pass. (The last is wrapping up blocks created by --use-local-vars.) compiler/*.m: Minor changes to handle the new instruction.	2005-09-14 01:29:21 +00:00
Zoltan Somogyi	1ed891b7b1	Introduce a mechanism for extending the det and nondet stacks when needed. Estimated hours taken: 24 Branches: main Introduce a mechanism for extending the det and nondet stacks when needed. The mechanism takes the form of a new grade component, .exts ("extend stacks"). While the new mechanism may be useful in its own right, it is intended mainly to support a new implementation of minimal model tabling, which will use a separate Mercury context for each distinct subgoal. Each context has its own det and nondet stack. Clearly, we can't have hundreds of contexts each with megabyte sized det stacks. The intention is that the stacks of the subgoals will start small, and be expanded when needed. The runtime expansion of stacks doesn't work yet, but it is unnecessarily hard to debug without an installed compiler that understands the new grade component, which is why this diff will be committed before that is fixed. compiler/handle_options.m: compiler/options.m: runtime/mercury_grade.h: scripts/canonical_grade.sh-subr scripts/init_grade_options.sh-subr scripts/parse_grade_options.sh-subr scripts/mgnuc.in Handle the new grade component. runtime/mercury_memory_zones.h: Add MR_ prefixes to the names of the fields of the zone structure. Record not just the actual size of each zone, which includes various kinds of buffers, but also the desired size of the zone exclusive of buffers. Format the documentation of the zone structure fields more comprehensibly. runtime/mercury_memory_zones.c: Instead of implementing memalign if it is not provided by the operating system, implement a function that allows us to reallocate the returned area of memory. Provide a prototype implementation of memory zone extension. It doesn't work yet. Factor out the code for setting up redzones, since it is now needed in more than one place. Convert to four space indentation. Make the debugging functions a bit more flexible. 
runtime/mercury_wrapper.c: Conform to the improved interface of the debugging functions. runtime/mercury_overflow.h: runtime/mercury_std.h: Move a generally useful macro from mercury_overflow.h to mercury_std.h. runtime/mercury_stacks.c: Add functions to extend the stacks. runtime/mercury_stacks.h: Add the tests required to invoke the functions that extend the stacks. Add the macros needed by the change to compiler/llds.m. Convert to four space indentation. runtime/mercury_conf.h.in: Prepare for the use of the posix_memalign function, which is the current replacement of the obsolete memalign library function. We don't yet use it. runtime/mercury_context.h: Format the documentation of the context structure fields more comprehensibly. Put MR_ prefixes on the names of the fields of some structures that didn't previously have them. Conform to the new names of the fields of the zone structure. runtime/mercury_context.c: runtime/mercury_debug.c: runtime/mercury_deep_copy.c: runtime/mercury_engine.c: runtime/mercury_memory_handlers.c: library/benchmarking.m: library/exception.m: Conform to the new names of the fields of the zone structure. In some cases, add missing MR_ prefixes to function names and/or convert to four space indentation. runtime/mercury_engine.h: Add a new low level debug flag for debugging stack extensions. Format the documentation of the engine structure fields more comprehensibly. Convert to four space indentation. runtime/mercury_conf_param.h: Document a new low level debug flag for debugging stack extensions. compiler/compile_target_code.m: compiler/handle_options.m: compiler/options.m: Handle the new grade component. compiler/llds.m: Add two new kinds of LLDS instructions, save_maxfr and restore_maxfr. These are needed because the nondet stack may be relocated between saving and the restoring of maxfr, and the saved maxfr may point to the old stack. 
In .exts grades, these instructions will save not a pointer but the offset of maxfr from the start of the nondet stack, since offsets are not affected by the movement of the nondet stack. compiler/code_info.m: Use the new instructions where relevant. (Some more work may be needed on this score; the relevant places are marked with XXX.) compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_out.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/reassign.m: compiler/use_local_vars.m: Handle the new LLDS instructions. tools/bootcheck: Provide a mechanism for setting the initial stack sizes for a bootcheck.	2005-09-13 08:25:44 +00:00
Zoltan Somogyi	f0dbbcaa34	Generate better code for base relations such as the ones in the transitive Estimated hours taken: 16 Branches: main Generate better code for base relations such as the ones in the transitive closure benchmarks in the paper on minimal model tabling. These improvements yield speedups ranging from 5 to 25% on those benchmarks. compiler/use_local_vars.m: Make this optimization operate on extended basic blocks instead of plain basic blocks. The greater length of extended basic blocks allows the local variables to have the maximum scope possible. The price is that the test for whether assignment to a given lvalue can be avoided or not is now dependent on which of the constituent basic blocks of the extended basic block contains the assignment, and thus the test has to be evaluated once for each assignment we try to optimize instead of once per block. Don't allocate temporary variables if the optimization they are intended for turns out not to be allowed. This change avoids having declarations for unused temporary variables in the resulting C code. If --auto-comments is set, insert use_local_vars.m's main data structure, the livemap, into the generated LLDS code as a comment. compiler/peephole.m: Look for the pattern mkframe(Size, Redoip) <straight line instructions that don't use stack slots> succeed and optimize away the mkframe. This pattern always arises for procedures that are actually semidet but are declared nondet (such as the base relations in the tabling benchmarks), and may also arise for semidet branches of nondet procedures. compiler/llds.m: Allow an existing peephole pattern to work better. The pattern is mkframe(Size, do_fail) <straight line instructions> redoip(curfr) = Redoip Previously, if some compiler-generated C code was among the straight line instructions, the pattern couldn't be applied, since peephole.m couldn't know whether it branched away through the redoip slot of the frame. 
This diff adds an extra slot to the relevant pragma_c component that tells peephole.m (actually, the predicate in opt_util.m that peephole relies on) whether this is the case. compiler/basic_block.m: Provide functionality for merging basic blocks into extended basic blocks. compiler/dupelim.m: Conform to the change in basic_block.m's interface. Convert to four-space indentation, and fix departures from our style guidelines. compiler/opt_util.m: Provide extra information now needed by use_local_vars. Convert to four-space indentation, and fix departures from our style guidelines. compiler/opt_debug.m: Show the user friendly versions of label names when dumping livemaps and instructions. Shorten the dumped descriptions of registers and stack slots. Dump instructions inside blocks. compiler/frameopt.m: Conform to the changes in opt_util and opt_debug's interfaces. compiler/optimize.m: Use the facilities of opt_debug instead of llds_out when dumping the LLDS after each optimization, since these are now more compact and thus reader friendly. Print unmangled names when writing progress messages. Put the dump files we generate with --opt-debug in a separate subdirectory, since when compiling e.g. tree234.m, the process can generate more than a thousand files. Give the dump files minimally mangled names. compiler/code_gen.m: compiler/pragma_c_gen.m: Convert to four-space indentation, and fix departures from our style guidelines. Conform to the change in llds.m. compiler/code_info.m: compiler/exprn_aux.m: compiler/ite_gen.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_out.m: compiler/middle_rec.m: compiler/trace.m: Conform to the change in llds.m.	2005-09-07 06:51:57 +00:00
Zoltan Somogyi	828d969e67	Significantly improve the capabilities of the LLDS optimization that tries Estimated hours taken: 20 Branches: main Significantly improve the capabilities of the LLDS optimization that tries to delay the creation of the stack frame, in the hope that on some computation paths the frame won't need to be created at all. Previously, the delayed setup of the stack frame could take place only when a block without a stack frame fell through to a block that needed a stack frame. If block B1 jumped to another block B2 that needed a frame, this was taken as meaning that B1 also had to have a frame. This was a problem, because if B1 ends with a computed goto, some of whose targets need stack frames and some do not, this limitation effectively gave all of them a stack frame, whether they wanted it or not, and thus required them to execute the stack frame teardown code. This diff removes the limitation: the optimization now allows B1 in this case to not have a stack frame. Instead of jumping to B2, B1 will now jump to a label B3 that the optimization inserts immediately before B2, the code at B3 setting up the stack frame and falling through to B2. (We also insert code to jump around B3 if the code immediately preceding it could fall into it accidentally.) The new code in frameopt is conceptually cleaner than it was before, because we now handle transitions from blocks that don't have a stack frame to blocks that do in a much more uniform manner. Most of the changes to other modules are to make the change to frameopt.m easier to debug. The motivation for the change was that we were beaten by YAP (Yet Another Prolog) on the deriv benchmark due to the limitation of frameopt. I haven't measured against YAP yet, but the runtime for 1.5 million iterations has been reduced from about 20 seconds to about 13. 
Since the compiler doesn't have any predicates that are both frequently used and can benefit from the removal of that old limitation (which is why the limitation wasn't really noticed before), there is no measurable effect on the speed of the compiler itself. compiler/frameopt.m: Effectively rewrite the optimization that delays stack frame creation along the lines above. The code for the optimization that keeps the stack frame for recursive calls if possible is unaffected. If the new option --frameopt-comments is specified, insert into the generated LLDS code a nicely formatted description of the main frameopt.m data structures. These are much easier to read than the term browser in the debugger. compiler/options.m: Add the new developer-only option --frameopt-comments. compiler/llds_out.m: Change the way we output comments to make the comments generated by frameopt.m easier to read. (We output comments only if --auto-comments is given, which it usually isn't.) compiler/opt_debug.m: Provide the functionality of printing local labels in an easier-to-read form that doesn't repeat the (possibly long) procedure name. Local labels can now be printed as e.g. local_15. Rewrite the module to use functions instead of predicates for appending strings, since this makes the code shorter, easier to use and to read. The original code was written before Mercury had functions. compiler/switch_util.m: When gathering information about switches, return the cons_id with each goal. Switch to four-space indentation. compiler/tag_switch.m: When generating code for switches, insert a comment at the start of each case saying what cons_id it is for, using the new information from switch_util. This is to make the generated code easier to understand. Switch to four-space indentation. compiler/ml_tag_switch.m: Conform to the change in switch_util. compiler/optimize.m: Conform to the slightly modified interface of frameopt.m. Switch to four-space indentation. 
compiler/peephole.m: Switch to four-space indentation, and fix some coding style issues. compiler/basic_block.m: When dividing a procedure body into basic blocks, remember for each block whether it could be fallen into. This modification is not strictly required for this change, since frameopt has its own (specialized) code for creating basic blocks, but it could be useful in the future. compiler/dupelim.m: compiler/use_local_vars.m: Conform to the change in basic_block.m.	2005-08-25 03:19:48 +00:00
Zoltan Somogyi	68b1a6c0ea	Add a new LLDS optimization we discussed on thursday: elimination of procedures Estimated hours taken: 4 Branches: main Add a new LLDS optimization we discussed on thursday: elimination of procedures whose code is an exact copy of the code of another mode of the same predicate. This happens with in,out vs di,uo and also possibly with in,out vs any,any. The new optimization reduces the compiler's code size by 0.6%. compiler/dupproc.m: A new module implementing the new optimization. compiler/ll_backend.m: Add dupproc.m as a new submodule. compiler/notes/compiler_design.html: Mention the new module. compiler/options.m: Add an option, --optimize-proc-dups, enabling the new optimization. Make --opt-space imply the new option. doc/user_guide.texi: Document the new option. compiler/mercury_compile.m: Invoke the new optimization when compiling by predicates. Move the imports of library modules to their own section. compiler/handle_options.m: Make --optimize-proc-dups imply compiling by predicates. The rest of these changes are cosmetic only. compiler/llds.m: Delete an obsolete form of constant we haven't used in a long time. compiler/exprn_aux.m: compiler/jumpopt.m: compiler/llds_out.m: compiler/opt_debug.m: compiler/opt_util.m: Conform to the change in llds.m. compiler/dependency_graph.m: Clean up some comments. compiler/dupelim.m: Fix some variable names. compiler/hlds_module.m: compiler/hlds_pred.m: Minor cleanups.	2005-07-08 04:22:13 +00:00
Zoltan Somogyi	eae96bf198	Minor cleanups I did while browsing files related to the bug in jumpopt. Estimated hours taken: 0.5 Branches: main Minor cleanups I did while browsing files related to the bug in jumpopt. compiler/dupelim.m: Factor out some common code dealing with pragma_c instructions. Reorder arguments to allow the use of state variable notation. Use switches in preference to if-then-elses if possible. compiler/labelopt.m: Convert comments to our preferred format. Use switches in preference to if-then-elses if possible. compiler/opt_util.m: Convert comments to our preferred format. Factor out some common code. Use switches in preference to if-then-elses if possible. Use state variable notation where appropriate.	2005-06-16 05:19:56 +00:00
Zoltan Somogyi	c08ca7fbc8	Import only one module per line in the modules of the compiler Estimated hours taken: 3 Branches: main compiler/*.m: Import only one module per line in the modules of the compiler where my previous diff did not already do so. Misc other cleanups. Where relevant, use the new mechanism in tree.m. compiler/tree.m: Fix a performance problem I noticed while updating :- import_module items. Instead of supplying a function to convert lists of trees to a tree, make the tree data structure able to hold a list of subtrees directly. This reduces the number of times where we have to convert lists of trees to trees that are sticks just to stay within the old definition of what a tree is.	2005-03-24 02:00:43 +00:00

112 Commits