mirror of
https://github.com/Mercury-Language/mercury.git
synced 2025-12-19 07:45:09 +00:00
efda38a2d6ded4a529c73f5915e547035a335f8d
225 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
517fbac88e |
Add four LLDS instructions Paul will soon need to implement the loop control
Estimated hours taken: 8 Branches: main Add four LLDS instructions Paul will soon need to implement the loop control transformation. compiler/llds.m: Add the new instructions. compiler/llds_out_instr.m: Output the new instructions. Paul may want to change the code we generate. compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_to_x86_64.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/peephole.m: compiler/reassign.m: compiler/use_local_vars.m: Handle the new instructions. In opt_util.m, fix two old bugs. First, the restore_maxfr instruction behaved as if it updated hp, not maxfr. Second, the keep_assign instruction wasn't being handled as an assignment operation. In peephole.m, fix an old bug, in which assignments through mem_refs were not considered to invalidate the cached value of an lval. In use_local_vars, fix an old bug: the keep_assign instruction wasn't being handled as an assignment operation. Assignments themselves weren't being as optimized as they could be. |
||
|
|
58e305e4c0 |
Implement the source-to-source part of the loop control transformation. The
remaining part is the code generation for code that is to be spawned off. It
must be handled in the code generator since it uses the parent stack pointer in
many cases.
I'm committing this now so that Zoltan can begin to review it while I work on
the code generator component.
compiler/par_loop_control.m:
This new file contains the source-to-source part of the parallel loop
control transformation..
compiler/transform_hlds.m.
Include the par_loop_control module within the transform_hlds module.
compiler/mercury_compile_middle_passes.m:
Call the loop control transformation at stage 206 - after the dependant
parallel conjunction transformation.
Move the last call optimisation pass from stage 175 to 206 since it will
most-likely prevent loop control from working. Where both transformations
are applicable, the loop control transformation is preferred.
compiler/options.m:
Add new options for loop control.
compiler/handle_options.m:
Disable loop control if we're not in a grade that supports parallel
conjunctions.
Other tests that should have been testing for parallel conjunction support
but only tested parallel support have been fixed.
compiler/hlds_goal.m:
Add the feature_do_not_tailcall feature.
compiler/call_gen.m:
Mark LLCS call goals that may not have last call optimisation applied to
them if they have the feature_do_not_tailcall feature set in their HLDS
info.
compiler/goal_util.m:
Create a new predicate expand_plain_conj, this returns a list of the sub
goals of a plain conjunction, or returns the goal in a singleton list.
XXX: Could someone review the name of this predicate.
compiler/hlds_pred.m:
Add a symbol for the new transformation in the pred_transformation type.
Corrected a comment to match the arguments in the predicate it refers to.
compiler/prog_util.m:
Add support to make_pred_name for creating names for loop control
predicates.
compiler/dep_par_conj.m:
Fix grammer in a comment.
compiler/saved_vars.m:
Conform to the change in hlds_goal.m
compiler/layout_out.m:
Conform to the change in hlds_pred.m
runtime/mercury_par_builtin.[ch]:
Add support for lc_wait_free_slot/2, the blocking version of
lc_get_free_slot/2. This means that other loop control builtins have
changed, for instance, lc_join_and_terminate/2 must wake up a context
blocked in lc_wait_free_slot/2 after making the slot it was using free.
Use a spin lock in the loop control structure rather than a POSIX mutex.
runtime/mercury_wrapper.[ch]:
Add support for a runtime variable, the number of contexts per loop control.
This can be controlled with a MERCURY_OPTIONS option.
mdbcomp/program_representation.m:
Include lc_wait_free_slot/2 in the list of external predicates.
mdbcomp/mdbcomp.goal_path.m:
Add two new predicates goal_path_remove_first/3 and goal_path_get_first/2.
library/par_builtin.m:
Add new builtins to support the loop control transformation:
lc_wait_free_slot/2 will block the context until a new slot is
available.
lc_default_num_contexts/1 will return the number of contexts to use, by
default, for a loop-controlled loop.
Add myself as an author of this module.
doc/user_guide.texi:
Document the runtime --num-contexts-per-lc-per-thread option. It is
currently commented out since it is not intended for users, at least for
now.
Document the loop control options for the compiler.
---
The change below was written by Zoltan, I reviewed when I applied his diff to
my workspace.
Allow the compiler to mark calls in the LLDS as calls that cannot have last
call optimization applied to them. Paul will soon need this capability
in order to implement parallel conjunctions in which earlier conjuncts
are spawned off, and later conjuncts contain recursive calls, but the
earlier conjuncts need the stack frame.
compiler/llds.m:
Add a flag to det and semi calls. (Model_non calls have had a similar
flag for a long time, for a totally different reason.)
compiler/call_gen.m:
By default, say that det and semi calls may have LCO applied to them.
compiler/jumpopt.m:
Apply LCO to det and semi calls only if this flag allows it.
compiler/opt_debug.m:
Include the flag in debugging dumps.
|
||
|
|
4c2846593a |
Make it possible to store double-precision `float' constructor arguments in
Branches: main Make it possible to store double-precision `float' constructor arguments in unboxed form, in low-level C grades on 32-bit platforms, i.e. `float' (and equivalent) arguments may occupy two machine words. However, until we implement float registers, this does more harm than good so it remains disabled. compiler/llds.m: Add a type `cell_arg' to hold information about an argument of a cell being constructed. Change `heap_ref' so that we can refer to a pointer with an unknown tag. compiler/unify_gen.m: Use the `cell_arg' type to simplify code related to generating constructions. Handle double word arguments in constructions and deconstructions. Update enumeration packing code to account for the presence of double width arguments and the `cell_arg' type. Take double width arguments into account when generating ground terms. compiler/code_info.m: Extend `assign_field_lval_expr_to_var' to work for expressions involving multiple field lvals of the same variable. Make `assign_cell_to_var' conform to changes. compiler/code_util.m: Add a predicate to calculate the size of a cell given its cell_args. compiler/var_locn.m: Conform to the use of the `cell_arg' type and the presense of double width arguments. Calculate cell size correctly in places. Move sanity checking from `var_locn_assign_field_lval_expr_to_var' to `code_info.assign_field_lval_expr_to_var'. compiler/global_data.m: Make `rval_type_as_arg' take into account the width of the argument. Conform to changes. compiler/c_util.m: Add a new binop category. Unlike the existing macro_binop category, the arguments of macros in this category cannot all be assumed to be of integral types. compiler/llds_out_data.m: compiler/llds_out_instr.m: Output calls to the macros `MR_float_word_bits', `MR_float_from_dword' and `MR_float_from_dword_ptr' which were introduced previously. When a `heap_ref' has an unknown tag, make the generated code mask off the tag bits. compiler/lco.m: Disable the optimisation when float arguments are present, on the basis of whether Mercury floats are wider than a machine word. The comments about when floats are stored in boxed form are out of date. compiler/arg_pack.m: Rename a predicate. compiler/make_hlds_passes.m: Update a comment. compiler/disj_gen.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/llds_to_x86_64.m: compiler/lookup_switch.m: compiler/mlds_to_c.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/stack_layout.m: compiler/string_switch.m: Conform to changes. runtime/mercury_float.h: Add a cast to `MR_float_word_bits' to avoid a gcc error. tests/hard_coded/Mercury.options: tests/hard_coded/Mmakefile: tests/hard_coded/heap_ref_mask_tag.exp: tests/hard_coded/heap_ref_mask_tag.m: tests/hard_coded/reuse_double.exp: tests/hard_coded/reuse_double.m: Add test cases. tests/hard_coded/lookup_disj.exp: tests/hard_coded/lookup_disj.m: Extend existing test case. |
||
|
|
257efbd678 |
Store double-precision `float' constructor arguments in unboxed form,
Branches: main Store double-precision `float' constructor arguments in unboxed form, in high-level C grades on 32-bit platforms, i.e. `float' (and equivalent) arguments may occupy two machine words. As the C code generated by the MLDS back-end makes use of MR_Float variables and parameters, float (un)boxing may be reduced substantially in many programs. compiler/prog_data.m: Add `double_word' as a new option for constructor argument widths, only used for float arguments as yet. compiler/make_hlds_passes.m: Set constructor arguments to have `double_word' width if required, and possible. compiler/type_util.m: Add helper predicate. compiler/builtin_ops.m: compiler/c_util.m: compiler/llds.m: Add two new binary operators used by the MLDS back-end. compiler/arg_pack.m: Handle `double_word' arguments. compiler/ml_code_util.m: Deciding whether or not a float constructor argument requires boxing now depends on the width of the field. compiler/ml_global_data.m: When a float constant appears as an initialiser of a generic array element, it is now always unboxed, irrespective of --unboxed-float. compiler/ml_type_gen.m: Take double-word arguments into account when generating structure fields. compiler/ml_unify_gen.m: Handle double-word float constructor arguments in (de)constructions. In some cases we break a float argument into its two words, so generating two assignments statements or two separate rvals. Take double-word arguments into account when calculating field offsets. compiler/mlds_to_c.m: The new binary operators require no changes here. As a special case, write `MR_float_from_dword_ptr(&X)' instead of `MR_float_from_dword(X, Y)' when X, Y are consecutive words within a field. The definition of `MR_float_from_dword_ptr' is more straightforward, and gcc produces better code than if we use the more general `MR_float_from_dword'. compiler/rtti_out.m: For double-word arguments, generate MR_DuArgLocn structures with MR_arg_bits set to -1. compiler/rtti_to_mlds.m: Handle double-word arguments in field offset calculation. compiler/unify_gen.m: Partially handle double_word arguments in LLDS back-end. compiler/handle_options.m: Set --unboxed-float when targetting Java, C# and Erlang. compiler/structure_reuse.direct.choose_reuse.m: Rename a predicate. compiler/bytecode.m: compiler/equiv_type.m: compiler/equiv_type_hlds.m: compiler/llds_to_x86_64.m: compiler/mlds_to_gcc.m: compiler/mlds_to_il.m: compiler/opt_debug.m: Conform to changes. library/construct.m: library/store.m: Handle double-word constructor arguments. runtime/mercury_conf.h.in: Clarify what `MR_BOXED_FLOAT' now means. runtime/mercury_float.h: Add helper macros for converting between doubles and word/dwords. runtime/mercury_deconstruct.c: runtime/mercury_deconstruct.h: Add a macro `MR_arg_value' and a helper function to extract a constructor argument value. This replaces `MR_unpack_arg'. runtime/mercury_type_info.h: Remove `MR_unpack_arg'. Document that MR_DuArgLocn.MR_arg_bits may be -1. runtime/mercury_deconstruct_macros.h: runtime/mercury_deep_copy_body.h: runtime/mercury_ml_arg_body.h: runtime/mercury_table_type_body.h: runtime/mercury_tabling.c: runtime/mercury_type_info.c: Handle double-word constructor arguments. tests/hard_coded/Mercury.options: tests/hard_coded/Mmakefile: tests/hard_coded/lco_double.exp: tests/hard_coded/lco_double.m: tests/hard_coded/pack_args_float.exp: tests/hard_coded/pack_args_float.m: Add test cases. trace/mercury_trace_vars.c: Conform to changes. |
||
|
|
0ae65de577 |
Pack consecutive enumeration arguments in discriminated union types into a
Branches: main Pack consecutive enumeration arguments in discriminated union types into a single word to reduce cell sizes. Argument packing is only enabled on C back-ends with low-level data, and reordering arguments to improve opportunities for packing is not yet attempted. The RTTI implementations for other back-ends will need to be updated, but that is best left until after any argument reordering change. Modules which import abstract enumeration types are notified so by writing declarations of the form: :- type foo where type_is_abstract_enum(NumBits). into the interface file for the module which defines the type. compiler/prog_data.m: Add an `arg_width' argument to constructor arguments. Replace `is_solver_type' by `abstract_type_details', with an extra option for abstract exported enumeration types. compiler/handle_options.m: compiler/options.m: Add an internal option `--allow-argument-packing'. compiler/make_hlds_passes.m: Determine whether and how to pack enumeration arguments, updating the `arg_width' fields of constructor arguments before constructors are added to the HLDS. compiler/mercury_to_mercury.m: compiler/modules.m: Write `where type_is_abstract_enum(NumBits)' to interface files for abstract exported enumeration types. compiler/prog_io_type_defn.m: Parse `where type_is_abstract_enum(NumBits)' attributes on type definitions. compiler/arg_pack.m: compiler/backend_libs.m: Add a new module. This mainly contains a predicate which packs rvals according to arg_widths, which is used by both LLDS and MLDS back-ends. compiler/ml_unify_gen.m: compiler/unify_gen.m: Take argument packing into account when generating code for constructions and deconstructions. Only a relatively small part of the compiler actually needs to understand argument packing. The rest works at the HLDS level with constructor arguments and variables, or at the LLDS and MLDS levels with structure fields. compiler/code_info.m: compiler/var_locn.m: Add assign_field_lval_expr_to_var and var_locn_assign_field_lval_expr_to_var. Allow more kinds of rvals in assign_cell_arg. I do not know why it was previously restricted, except that the other kinds of rvals were not encountered as cell arguments before. compiler/mlds.m: We can now rely on the compiler to pack arguments in the mlds_decl_flags type instead of doing it manually. A slight downside is that though the type is packed down to a single word cell, it will still incur a memory allocation per cell. However, I did not notice any difference in compiler speed. compiler/rtti.m: compiler/rtti_out.m: Add and output a new field for MR_DuFunctorDesc instances, which, if any arguments are packed, points to an array of MR_DuArgLocn. Each array element describes the offset in the cell at which the argument's value is held, and which bits of the word it occupies. In the more common case where no arguments are packed, the new field is simply null. compiler/rtti_to_mlds.m: Generate the new field to MR_DuFunctorDesc. compiler/structure_reuse.direct.choose_reuse.m: For now, prevent structure reuse reusing a dead cell which has a different constructor to the new cell. The code to determine whether a dead cell will hold the arguments of a new cell with a different constructor will need to be updated to account for argument packing. compiler/type_ctor_info.m: Bump RTTI version number. Conform to changes. compiler/add_type.m: compiler/check_typeclass.m: compiler/equiv_type.m: compiler/equiv_type_hlds.m: compiler/erl_rtti.m: compiler/hlds_data.m: compiler/hlds_out_module.m: compiler/intermod.m: compiler/make_tags.m: compiler/mlds_to_gcc.m: compiler/opt_debug.m: compiler/prog_type.m: compiler/recompilation.check.m: compiler/recompilation.version.m: compiler/special_pred.m: compiler/type_constraints.m: compiler/type_util.m: compiler/unify_proc.m: compiler/xml_documentation.m: Conform to changes. Reduce code duplication in classify_type_defn. compiler/hlds_goal.m: Clarify a comment. library/construct.m: Make `construct' pack arguments when necessary. Remove an old RTTI version number check as recommended in mercury_grade.h. library/store.m: Deal with packed arguments in this module. runtime/mercury_grade.h: Bump binary compatibility version number. runtime/mercury_type_info.c: runtime/mercury_type_info.h: Bump RTTI version number. Add MR_DuArgLocn structure definition. Add a macro to unpack an argument as described by MR_DuArgLocn. Add a function to determine a cell's size, since the number of arguments is no longer correct. runtime/mercury_deconstruct.c: runtime/mercury_deconstruct.h: runtime/mercury_deconstruct_macros.h: runtime/mercury_ml_arg_body.h: runtime/mercury_ml_expand_body.h: Deal with packed arguments when deconstructing. Remove an old RTTI version number check as recommended in mercury_grade.h. runtime/mercury_deep_copy_body.h: Deal with packed arguments when copying. runtime/mercury_table_type_body.h: Deal with packed arguments in tabling. runtime/mercury_dotnet.cs.in: Add DuArgLocn field to DuFunctorDesc. Argument packing is not enabled for the C# back-end yet so this is unused. trace/mercury_trace_vars.c: Deal with packed arguments in MR_select_specified_subterm, use for the `hold' command. java/runtime/DuArgLocn.java: java/runtime/DuFunctorDesc.java: Add DuArgLocn field to DuFunctorDesc. Argument packing is not enabled for the Java back-end yet so this is unused. extras/trailed_update/tr_store.m: Deal with packed arguments in this module (untested). extras/trailed_update/samples/interpreter.m: extras/trailed_update/tr_array.m: Conform to argument reordering in the array, map and other modules in previous changes. tests/hard_coded/Mercury.options: tests/hard_coded/Mmakefile: tests/hard_coded/lco_pack_args.exp: tests/hard_coded/lco_pack_args.m: tests/hard_coded/pack_args.exp: tests/hard_coded/pack_args.m: tests/hard_coded/pack_args_copy.exp: tests/hard_coded/pack_args_copy.m: tests/hard_coded/pack_args_intermod1.exp: tests/hard_coded/pack_args_intermod1.m: tests/hard_coded/pack_args_intermod2.m: tests/hard_coded/pack_args_reuse.exp: tests/hard_coded/pack_args_reuse.m: tests/hard_coded/store_ref.exp: tests/hard_coded/store_ref.m: tests/invalid/Mmakefile: tests/invalid/where_abstract_enum.err_exp: tests/invalid/where_abstract_enum.m: tests/tabling/Mmakefile: tests/tabling/pack_args_memo.exp: tests/tabling/pack_args_memo.m: Add new test cases. tests/hard_coded/deconstruct_arg.exp: tests/hard_coded/deconstruct_arg.exp2: tests/hard_coded/deconstruct_arg.m: Add constructors with packed arguments to these cases. tests/invalid/where_direct_arg.err_exp: Update expected output. |
||
|
|
8e7fe1075c |
Delete LLDS support for nondet foreign code.
Branches: main Delete LLDS support for nondet foreign code. compiler/llds.m: Remove support for save structs from the LLDS. compiler/dupproc.m: compiler/frameopt.m: compiler/llds_out_instr.m: compiler/opt_debug.m: compiler/peephole.m: compiler/proc_gen.m: Conform to the above change. compiler/.cvsignore: Ignore .compiler_tags; delete references to Aditi backend files. |
||
|
|
7e26b55e74 |
Implement a new form of memory profiling, which tells the user what memory
Branches: main
Implement a new form of memory profiling, which tells the user what memory
is being retained during a program run. This is done by allocating an extra
word before each cell, which is used to "attribute" the cell to an
allocation site. The attribution, or "allocation id", is an address to an
MR_AllocSiteInfo structure generated by the Mercury compiler, giving the
procedure, filename and line number of the allocation, and the type
constructor and arity of the cell that it allocates.
The user must manually instrument the program with calls to
`benchmarking.report_memory_attribution', which forces a GC and summarises
the live objects on the heap using the attributions. The mprof tool is
extended with a new mode to parse and present that data.
Objects which are unattributed (e.g. by hand-written C code which hasn't
been updated) are still accounted for, but show up in profiles as "unknown".
Currently this profiling mode only works in conjunction with the Boehm
garbage collector, though in principle it can work with any memory allocator
for which we can access a list of the live objects. Since term size
profiling relies on the same technique of using an extra word per memory
cell, the two profiling modes are incompatible.
The output from `mprof -s' looks like this:
------ [1] some label ------
cells words cumul procedure / type (location)
14150 38872 total
* 1949/ 13.8% 4872/ 12.5% 12.5% <predicate `parser.parse_rest/7' mode 0>
975/ 6.9% 1950/ 5.0% list.list/1 (parser.m:502)
487/ 3.4% 1948/ 5.0% term.term/1 (parser.m:501)
487/ 3.4% 974/ 2.5% term.const/0 (parser.m:501)
* 1424/ 10.1% 4272/ 11.0% 23.5% <predicate `parser.parse_simple_term_2/6' mode 0>
708/ 5.0% 2832/ 7.3% term.term/1 (parser.m:643)
708/ 5.0% 1416/ 3.6% term.const/0 (parser.m:643)
...
boehm_gc/alloc.c:
boehm_gc/include/gc.h:
boehm_gc/misc.c:
boehm_gc/reclaim.c:
Add a callback function to be called for every live object after a GC.
Add a function to write out the GC_size_map array.
compiler/layout.m:
Define the alloc_site_info type which is equivalent to the
MR_AllocSiteInfo C structure.
Add alloc_site_array as a kind of "layout" array.
compiler/llds.m:
Add allocation sites to `cfile' structure.
Replace TypeMsg argument (which was also for profiling) on `incr_hp'
instructions by an allocation site identifier.
Add a new foreign_proc_component for allocation site ids.
compiler/code_info.m:
compiler/global_data.m:
compiler/proc_gen.m:
Keep the set of allocation sites in the code_info and global_data
structures.
compiler/unify_gen.m:
Add allocation sites to LLDS allocation instructions.
compiler/layout_out.m:
compiler/llds_out_file.m:
compiler/llds_out_instr.m:
Output MR_AllocSiteInfo arrays in generated C files.
Output code to register the MR_AllocSiteInfo array with the Mercury
runtime.
Output allocation site ids for memory allocation instructions.
compiler/llds_out_util.m:
Add allocation sites to llds_out_info.
compiler/pragma_c_gen.m:
compiler/ml_foreign_proc_gen.m:
Generate a macro MR_ALLOC_ID which resolves to an allocation site
structure, for every foreign_proc whose C code contains the string
"MR_ALLOC_ID". This is to be used by hand-written C code which
allocates memory.
MR_PROC_LABELs are retained for backwards compatibility. Though
they were introduced for profiling, they seem to have been co-opted
for printf-debugging since then.
compiler/ml_global_data.m:
Add allocation site structures to the MLDS global data.
compiler/mlds.m:
compiler/ml_unify_gen.m:
Add allocation site id to `new_object' instruction.
compiler/mlds_to_c.m:
Output allocation site arrays and allocation ids in high-level C code.
Output a call to register the allocation site array with the Mercury
runtime.
Delete an unused predicate.
compiler/exprn_aux.m:
compiler/jumpopt.m:
compiler/livemap.m:
compiler/mercury_compile_llds_back_end.m:
compiler/middle_rec.m:
compiler/ml_accurate_gc.m:
compiler/ml_elim_nested.m:
compiler/ml_optimize.m:
compiler/ml_util.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
compiler/mlds_to_managed.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/use_local_vars.m:
compiler/var_locn.m:
Conform to changes.
compiler/pickle.m:
compiler/prog_event.m:
compiler/timestamp.m:
Conform to changes in memory allocation macros.
library/benchmarking.m:
Add the `report_memory_attribution' instrumentation predicates.
Conform to changes to MR_memprof_record.
library/array.m:
library/bit_buffer.m:
library/bitmap.m:
library/construct.m:
library/deconstruct.m:
library/dir.m:
library/io.m:
library/mutvar.m:
library/store.m:
library/string.m:
library/thread.semaphore.m:
library/version_array.m:
Use attributed memory allocation throughout the standard library so
that objects don't show up in the memory profile as "unknown".
Replace MR_PROC_LABEL by MR_ALLOC_ID.
mdbcomp/program_representation.m:
mdbcomp/rtti_access.m:
Replace MR_PROC_LABEL by MR_ALLOC_ID.
profiler/Mercury.options:
profiler/globals.m:
profiler/mercury_profile.m:
profiler/options.m:
profiler/output.m:
profiler/snapshots.m:
Add a new mode to `mprof' to parse and present the data from
`Prof.Snapshots' files.
Add options for the new profiling mode.
profiler/process_file.m:
Fix a typo.
runtime/mercury_conf_param.h:
#define MR_MPROF_PROFILE_MEMORY_ATTRIBUTION if memory profiling
is enabled and we are using Boehm GC.
runtime/mercury.h:
Make MR_new_object take an allocation id argument.
Conform to changes in memory allocation macros.
runtime/mercury_memory.c:
runtime/mercury_memory.h:
runtime/mercury_types.h:
Define MR_AllocSiteInfo.
Add memory allocation functions and macros which take into the
account the additional word necessary for the new profiling mode.
These should be used in preferences to the raw memory allocation
functions wherever possible so that objects do not show up in the
profile as "unknown".
Add analogues of realloc/free which take into account the offset
introduced by the attribution word.
Add function versions of the MR_new_object macros, which can't be
written in standard C. They are only used when necessary.
Add built-in allocation site ids, to be used in the runtime and
other hand-written code when context-specific ids are unavailable.
runtime/mercury_heap.h:
Make MR_tag_offset_incr_hp_msg and MR_tag_offset_incr_hp_atomic_msg
allocate an extra word when memory attribution is desired, and store
the allocation id there.
Similarly for MR_create{1,2,3}_msg.
Replace proclabel arguments in allocation macros by alloc_id
arguments.
Replace MR_hp_alloc_atomic by MR_hp_alloc_atomic_msg. It was only
used for boxing floats.
Conform to change to MR_new_object macro.
runtime/mercury_bootstrap.h:
Delete obsolete macro hp_alloc_atomic.
runtime/mercury_heap_profile.c:
runtime/mercury_heap_profile.h:
Add the code to summarise the live objects on the Boehm GC heap and
writes out the data to `Prof.Snapshots', for display by mprof.
Don't store the procedure name in MR_memprof_record: the procedure
address is enough and faster to compare.
runtime/mercury_prof.c:
Finish and close the `Prof.Snapshots' file when the program
terminates.
Conform to changes in MR_memprof_record.
runtime/mercury_misc.h:
Add a macro to expand to the name of the allocation sites array
in LLDS grades.
runtime/mercury_bitmap.c:
runtime/mercury_bitmap.h:
Pass allocation id through bitmap allocation functions.
Delete unused function MR_string_to_bitmap.
runtime/mercury_string.h:
Add MR_make_aligned_string_copy_msg.
Make string allocation macros take allocation id arguments.
runtime/mercury.c:
runtime/mercury_array_macros.h:
runtime/mercury_context.c:
runtime/mercury_deconstruct.c:
runtime/mercury_deconstruct_macros.h:
runtime/mercury_dlist.c:
runtime/mercury_engine.c:
runtime/mercury_float.h:
runtime/mercury_hash_table.c:
runtime/mercury_ho_call.c:
runtime/mercury_label.c:
runtime/mercury_prof_mem.c:
runtime/mercury_stacks.c:
runtime/mercury_stm.c:
runtime/mercury_string.c:
runtime/mercury_thread.c:
runtime/mercury_trace_base.c:
runtime/mercury_trail.c:
runtime/mercury_type_desc.c:
runtime/mercury_type_info.c:
runtime/mercury_wsdeque.c:
Use attributed memory allocation throughout the runtime so that
objects don't show up in the profile as "unknown".
runtime/mercury_memory_zones.c:
Attribute memory zones to the Mercury runtime.
runtime/mercury_tabling.c:
runtime/mercury_tabling.h:
Use attributed memory allocation macros for tabling structures.
Delete unused MR_table_realloc_* and MR_table_copy_bytes macros.
runtime/mercury_deep_copy_body.h:
Try to retain the original attribution word when copying values.
runtime/mercury_ml_expand_body.h:
Conform to changes in memory allocation macros.
runtime/mercury_tags.h:
Replace proclabel arguments by alloc_id arguments in allocation macros.
runtime/mercury_wrapper.c:
If memory attribution is enabled, tell Boehm GC that pointers may be
displaced by an extra word.
trace/mercury_trace.c:
trace/mercury_trace_tables.c:
Conform to changes in memory allocation macros.
extras/net/tcp.m:
extras/solver_types/library/any_array.m:
extras/trailed_update/tr_array.m:
Conform to changes in memory allocation macros.
doc/user_guide.texi:
Document the new profiling mode.
doc/reference_manual.texi:
Update a commented out example.
|
||
|
|
322feaf217 |
Add more threadscope instrumentation.
This change introduces instrumentation that tracks sparks as well as parallel
conjunctions and their conjuncts. This should hopefully give us more
information to diagnose runtime performance issues.
As of this date the ThreadScope program hasn't been updated to read or
understand these new events.
runtime/mercury_threadscope.[ch]:
Added a function and types to register all the threadscope strings from an
array.
Add functions to post the new events (see below).
runtime/mercury_threadscope.c:
Added support for 5 new threadscope events.
Registering a string so that other messages may refer to a constant
string.
Marking the beginning and ends of parallel conjunctions.
Creating a spark for a parallel conjunct.
Finishing a parallel conjunct.
Re-arranged event IDs, I've started allocating IDs from 38 onwards for
general purposes and 100 onwards for mercury specific events after talking
with Duncan Coutts.
Trimmed excess whitespace from the end of lines.
runtime/mercury_context.h:
Post a beginning parallel conjunction message when the sync term for the
parallel conjunction is initialized.
Post an event when creating a spark for a parallel conjunction.
Add a MR_spark_id field to the MR_Spark structure, these identify sparks to
threadscope.
runtime/mercury_context.c:
Post threadscope messages when a spark is about to be executed.
Post a threadscope event when a parallel conjunct is completed.
Add a missing memory barrier.
runtime/mercury_wrapper.[ch]:
Create a global function pointer for the code that registers strings in the
threadscope string table, this is filled in by mkinit.
Call this function pointer immediatly after setting up threadscope.
runtime/mercury_wsdeque.[ch]:
Modify MR_wsdeque_pop_bottom to return the spark pointer (which points onto
the queue) rather then returning a result through a pointer and bool if the
operation was successful. This pointer is safe to dereference until
MR_wsdeque_push_bottom is used.
runtime/mercury_wsdeque.c:
Corrected a code comment.
runtime/mercury_engine.h:
Documented some of the fields of the engine structure that hadn't been
documented.
Add a next spark ID field to the engine structure.
Change the type of the engine ID field to MR_uint_least16_t
compiler/llds.m:
Add a third field to the init_sync_term instruction that stores the index
into the threadscope string table of the static conjunction ID.
Add a field to the c_file structure containing the threadscope string
table.
compiler/layout.m:
Added a new layout array name for the threadscope string table.
compiler/layout_out.m:
Implement code to write out the threadscope string table.
compiler/llds_out_file.m:
Write out the threadscope string table when writing out the c_file.
compiler/par_conj_gen.m:
Create strings that statically identify parallel conjunctions for each
init_sync_term LLDS instruction. These strings are added to a table in the
!CodeInfo and the index of the string is added to the init_sync_term
instruction.
Add an extra instruction after a parallel conjunction to post the message
that the parallel conjunction has completed.
compiler/global_data.m:
Add fields to the global data structure to represent the threadscope string
table and its current size.
Add predicates to update and retrieve the table.
Handle merging of threadscope string tables in global data by allowing the
references to the strings to be remapped.
Refactored remapping code so that a caller such as proc_gen only needs to
call one remapping predicate after merging global data..
compiler/code_info.m:
Add a table of strings for use with threadscope to the code_info_persistent
type.
Modify the code_info_init to initialise the threadscope string table fields.
Add a predicate to get the string table and another to update it.
compiler/proc_gen.m:
Build the containing goal map before code generation for procedures with
parallel conjunctions in a parallel grade. par_conj_gen.m depends on this.
Conform to changes in code_info.m and global_data.m
compiler/llds_out_instr.m:
Write out the extra parameter in the init_sync_term instruction.
compiler/dupelim.m:
compiler/dupproc.m:
compiler/exprn_aux.m:
compiler/global_data.m:
compiler/jumpopt.m:
compiler/livemap.m:
compiler/llds_to_x86_64.m:
compiler/mercury_compile_llds_back_end.m:
compiler/middle_rec.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/peephole.m:
compiler/reassign.m:
compiler/use_local_vars.m:
Conform to changes in llds.m
compiler/opt_debug.m:
Conform to changes in layout.m
compiler/mercury_compile_llds_back_end.m:
Fix some trailing whitespace.
util/mkinit.c:
Build an initialisation function that registers all the strings in
threadscope string tables.
Correct the layout of a comment.
|
||
|
|
1c3bc03415 |
Make the system compiler with --warn-unused-imports.
Estimated hours taken: 2 Branches: main, release Make the system compiler with --warn-unused-imports. browser/*.m: library/*.m: compiler/*.m: Remove unnecesary imports as flagged by --warn-unused-imports. In some files, do some minor cleanup along the way. |
||
|
|
bf5c9a79a5 |
Until now, all hash tables on strings used a single standard hash function.
Estimated hours taken: 3 Branches: main Until now, all hash tables on strings used a single standard hash function. However, any single hash function has a pretty good probability of generating collisions on small hash tables. This diff adds two new hash functions on strings. When generating hash tables, we now try out all three hash functions, and use the one that generates the fewest collisions. runtime/mercury_string.h: Add C implementations of the new hash functions. library/string.m: Add Mercury implementations of the new hash functions. compiler/builtin_ops.m: Add the new hash functions as builtin operations. compiler/switch_util.m: Select the best hash function for each hash switch on strings. compiler/ml_string_switch.m: compiler/string_switch.m: Use the selected hash function for each hash table. compiler/bytecode.m: compiler/c_util.m: compiler/java_util.m: compiler/llds.m: compiler/llds_tox86_64.m: compiler/mlds_to_gcc.m: compiler/mlds_to_il.m: compiler/opt_debumlds_to_ilg.m: Conform to the presence of the new hash functions. |
||
|
|
4db9b2adbf |
Until now, the only indexing we did for switches on strings was using a hash
Estimated hours taken: 40
Branches: main
Until now, the only indexing we did for switches on strings was using a hash
table containing jump targets (represented as indices into a list of labels).
This diff supplements this with
- binary searches of tables containing jump targets,
- binary searches of tables containing values (lookup tables), and
- hash searches of tables containing values (lookup tables).
For now, the new methods exist in the LLDS backend only.
NEWS:
Mention the new capability.
compiler/string_switch.m:
Add predicates that implement the new indexing methods on strings.
Factor out code from existing predicates as required for this.
compiler/switch_gen.m:
Invoke the new predicates in string_switch.m when relevant.
Avoid passing the constant "no" as the initial value of !MaybeEnd
to predicates where we know this will ALWAYS happen.
compiler/options.m:
doc/user_guide.texi:
Add an option to control when we use binary searches for switches
on strings.
compiler/lookup_switch.m:
This module previously handled lookup switches on integers.
Generalize it so that pieces of it are now also usable to help
implement lookup switches on strings. Rename the predicates specific
to switches on integers to make clear this specificity, and separate
them from the predicates that help implement lookup switches on
variables of all the supported types.
Export some types, predicates and functions for use in string_switch.m.
Fix the code so that it correctly handles det switches, which
can happen e.g. if we know the possible set of values of the
switched-on variable.
Use tail-recursive code to handle the list of switch arms, to allow us
to handle very large switches.
Remove an obsolete comment from the top about a previously implemented
optimization.
compiler/lookup_util.m:
Make set_liveness_and_end_branch update MaybeEnd, to account for the
reservation of stack slots for holding the current and last rows in
later solutions tables for model_non lookup switches.
compiler/switch_util.m:
Make the exported predicates of this module more general, making them
usable for switches on strings as well as ints. Also make them easier
to use. In one case this meant bundling two predicates that were always
used together into one predicate. In another, it meant splitting one
predicate into two, since some of its callers needed an intermediate
result. In the case of a type, it means reordering its fields
to make the order match the order of their use in the implementation.
Add some predicates specifically for switches on strings.
compiler/ml_lookup_switch.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
Conform to the changes to switch_util.m.
compiler/jumpopt.m:
If the comment associated with a label ends with "nofulljump", then
inhibit fulljump optimization of jumps to that label. That
optimization would replace the jumps with the code starting at that
label. This is avoids the overhead of jump instructions, and it is a
good idea in the usual case of forward jumps. However, for the few
backward jumps we generate, the block that replaces the jump
instruction can actually END with the same jump instruction (which may
be conditionally executed), which means that our usual repeated
invocation of jumpopt can replace the original jump instruction
with MANY copies of the block it jumps to. In some cases, such as those
in hash switches, you get more copies than can ever be executed in any
actual execution. Lookup switches therefore now mark the labels that
are targets of backward jumps with this marker.
compiler/llds.m:
Document the new behavior of jumpopt.
compiler/code_info.m:
Export a predicate for use in improving the code we generate for lookup
switches.
Make some other predicates simpler and/or more efficient.
compiler/builtin_ops.m:
Add a builtin op for doing string comparisons by calling strcmp.
This is to prevent the need for two traversals of the strings being
compared in each iteration of binary search.
compiler/bytecode.m:
compiler/c_util.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/llds.m:
compiler/llds_to_x86_64.m:
Conform to the change in builtin_ops.m.
compiler/disj_gen.m:
Conform to the change in lookup_util.m
compiler/frameopt.m:
compiler/proc_gen.m:
compiler/unify_gen.m:
Take advantage of the change in fulljump optimization.
compiler/opt_debug.m:
Improve the string representation of rvals by recording the types of
the operands of binary operations, and making the output a bit more
consistent looking.
compiler/dupproc.m:
compiler/var_locn.m:
Minor style fixes.
runtime/mercury_string.h:
Add a version of strcmp for use by our code generator. This version
casts the arguments before calling the real strcmp. We need it since we
usually specify the arguments as r1, r2 etc, which are declared as
MR_Word, not char *.
tests/hard_coded/lookup_disj.{m,exp}:
tests/hard_coded/string_switch.{m,exp}:
Make these existing tests significantly tougher by making them exercise
a wider range of use scenarios.
tests/hard_coded/string_switch{2,3}.{m,exp}:
tests/hard_coded/Mercury.options
While the string_switch test case tests the handling of jump switches,
these two new test cases test the handling of binary search tables and
hash tables respectively. Their code is identical to the code of
the string_switch test case, but Mercury.options causes them to be
compiled with different options.
tests/hard_coded/int_switch.{m,exp}:
A new test case, equivalent in structure to the string switch test
cases, to test the handling of lookup switches on atomic values.
|
||
|
|
30aafc69a0 |
Split up three big compiler modules: llds_out.m, hlds_out.m (5000+ lines each)
Estimated hours taken: 12 Branches: main Split up three big compiler modules: llds_out.m, hlds_out.m (5000+ lines each) and deep_profiling.m (3000+ lines). Put the predicates in the resulting smaller modules into cohesive groups where possible. A few of the predicates in the original modules were unused; this diff deletes them. There are no algorithmic changes. compiler/llds_out_code_addr.m: New module containing the part of llds_out.m that outputs code addresses and labels. compiler/llds_out_data.m: New module containing the part of llds_out.m that outputs lvals, rvals and their components. compiler/llds_out_global.m: New module containing the part of llds_out.m that generates global static C data structures. compiler/llds_out_instr.m: New module containing the part of llds_out.m that outputs instructions compiler/llds_out_file.m: New module containing the top level part of llds_out.m, which coordinates the generation of a whole C source file. compiler/llds_out_util.m: New module containing the utility parts of llds_out.m. compiler/llds_out.m: Replace everything in this file with just the includes of the submodules that now have all its previous contents. compiler/hlds_llds.m: Move a predicate here from llds_out.m, since it is a utility predicate operating on a type defined here. compiler/rtti_out.m: Move a predicate here from llds_out.m, since it is a predicate generating output from a rtti type. compiler/hlds_out_mode.m: The part of hlds_out.m that deals with writing out insts and modes. compiler/hlds_out_goal.m: The part of hlds_out.m that deals with writing out goals. compiler/hlds_out_pred.m: The part of hlds_out.m that deals with writing out predicates and procedures. compiler/hlds_out_module.m: The part of hlds_out.m that deals with writing out module-wide tables. compiler/hlds_out_util.m: Parts of hlds_out.m that don't fit in anywhere else. compiler/hlds_out.m: Replace everything in this file with just the includes of the submodules that now have all its previous contents. compiler/simplify.m: compiler/hlds_goal.m: Move some insts from simplify.m to hlds_goal.m to allow hlds_out_goal.m to use them also. compiler/coverage_profiling.m: The part of deep_profiling.m that deals with coverage profiling. compiler/deep_profiling.m: Remove the code moved to coverage_profiling.m, and export the utility predicates needed by coverage_profiling.m. Remove the things moved to prog_data.m and hlds_goal.m. Put the predicates into a more logical order. compiler/hlds_goal.m: Move some predicates here from deep_profiling.m, since they belong here. compiler/prog_data.m: Move a type from deep_profiling.m here, since it belongs here. compiler/add_pragma.m: Add a predicate from llds_out.m that is used only here. compiler/*.m: Conform to the changes above. |
||
|
|
d4bbcda309 |
Move all the frequently occurring layout structures and components of layout
Estimated hours taken: 40 Branches: main Move all the frequently occurring layout structures and components of layout structures into arrays where possible. By replacing N global variables holding individual layout structures or layout structure components with one global variable holding an array of them, we reduce the sizes of the symbol tables stored in object files, which should speed up both the C compiler and the linker. Measured on the modules of the library, mdbcomp and compiler directories compiled in grade asm_fast.gc.debug, this diff reduces the size of the generated C source files by 7.8%, the size of the generated object files by 10.4%, and the number of symbols in the symbol tables of those object files by a whopping 42.8%. (These improvements include, and are not on top of, the improvements in my previous similar diff.) runtime/mercury_stack_layout.h: Each label layout structure has information about the type and location of every variable that is live at that label. We store this information in three arrays: an array of pseudo-typeinfos giving the types of all these variables, and two arrays MR_ShortLvals and MR_LongLvals respectively giving their locations. (Most of the time, the location's encoded form fits into one byte (the MR_ShortLval) but sometimes it needs more bits (this is when we use MR_LongLval)). We used to store these three arrays, whose elements are different types, in a single occurrence-specific common structure, one after the other, with a cumbersome mechanism being required to access them. We now store them as segments of three separate arrays, of pseudo-typeinfos, MR_ShortLvals and MR_LongLvals respectively. This makes access simpler and faster (which will matter more to any accurate garbage collector than it does to the debugger). It also allows more scope for compression, since reusing an existing segment of one of the three arrays is easier than reusing an entire common structure, which would require the equivalent of exact matches on all three arrays. Since most label layout structures that have information about variables can encode the variables' locations using only MR_ShortLvals, create a version of the label layout structure type that omits the field used to record the whereabouts of the long location descriptors. Add macros now generated by the compiler to initialize layout structures. Simplify a one-field struct. runtime/mercury_grade.h: Increment the binary compatibility version number for debuggable executables, since .c and .o files from before and after the change to label layout structures are NOT compatible. runtime/mercury_type_info.h: Fix some binary-compatibility-related bit rot. runtime/mercury_misc.h: Move here the existing macros used by the compiler when generating references to layout arrays, and add new ones. runtime/mercury_goto.h: Delete the macros moved to mercury_misc.h. Conform to the changes in mercury_stack_layout.h. runtime/Mmakefile: Prevent the unnecessary rebuilding of mercury_conf.h. runtime/mercury_accurate_gc.c: runtime/mercury_agc_debug.c: runtime/mercury_layout_util.c: runtime/mercury_stack_trace.c: runtime/mercury_types.h: trace/mercury_trace.c: trace/mercury_trace_vars.c: Conform to the changes in mercury_stack_layout.h. runtime/mercury_wrapper.c: Improve the debug support a bit. runtime/mercury_engine.h: Fix style. compiler/layout.m: Make the change described at the top. Almost all layout structures are now in arrays. The only exceptions are those that occur rarely, and proc layouts, whose names need to be derivable from the name of the procedure itself. Instead of having a single type "layout_data" that can represent different kinds of single global variables (not array slots), have different kinds for different purposes. This makes the code clearer and allows traversals that do not have to skip over inapplicable kinds of layout structures. compiler/layout_out.m: Output the new arrays. compiler/stack_layout.m: Generate the new arrays. Previously, an individual term generated by stack_layout.m could represent several components of a layout structure, with the components separated by layout_out.m. We now do the separation in stack_layout.m itself, adding each component to the array to which it belongs. Instead of passing around a single stack_layout_info structure, pass around several smaller one. This is preferable, since I found out the hard way that including everything in one structure would give the structure 51 fields. Most parts of the module work with only one or two of these structures, which makes their role clearer. Cluster related predicates together. compiler/options.m: doc/user_guide.texi: Add an option that control whether stack_layout.m will attempt to compress the layout arrays that can meaningfully be comressed. compiler/llds.m: Remove the old distinction between a data_addr and a data_name, replacing both types with a single new one: data_id. Since different kinds of data_names were treated differently in many places, the distinction in types (which was intended to allow us to process data_addrs that wrapped data_names differently from other kinds of data_addrs) wasn't buying us anything anymore. The new data_id type allows for the possibility that the code generator wants to generate a reference to an address it does not know yet, because it is a slot in a layout array, and the slot has not been allocated yet. Add the information from which the new layout array structures will be generated to the LLDS. compiler/llds_out.m: Call layout_out.m to output the new layout arrays. Adapt the decl_id type to the replacement of data_addrs by data_ids. Don't both keeping track of the have-vs-have-not-declared status of structures that are always declared at the start. When writing out a data_addr, for some kinds of data_addr, llds_out.m would write out the name of the relevant variable, while for some other kinds, it would write out its address. This diff separates out those those things into separate predicates, each of which behaves consistently. compiler/mercury_compile_llds_back_end.m: Convey the intended contents of the new layout arrays from stack_layout.m to llds_out.m. compiler/continuation_info.m: Add a type required by the way we now generate proc_static structures for deep profiling. compiler/hlds_rtti.m: Add distinguishing prefixes to the field names of the rtti_proc_label type. compiler/code_info.m: compiler/code_util.m: compiler/erl_rtti.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/ll_pseudo_type_info.m: compiler/ml_code_util.m: compiler/opt_debug.m: compiler/proc_gen.m: compiler/prog_rep.m: compiler/rtti_out.m: compiler/unify_gen.m: Conform to the changes above. tests/debugger/declarative/track_through_catch.exp: Expect procedures to be listed in the proper order. tests/EXPECT_FAIL_TESTS.asm_fast.gc.debug: tests/EXPECT_FAIL_TESTS.asm_fast.gc.profdeep: Add these files to ignore expected failues in these grades. |
||
|
|
9bdc5db590 |
Try to work around the Snow Leopard linker's performance problem with
Estimated hours taken: 20
Branches: main
Try to work around the Snow Leopard linker's performance problem with
debug grade object files by greatly reducing the number of symbols needed
to represent the debugger's data structures.
Specifically, this diff groups all label layouts in a module, each of which
previously had its own named global variable, into only a few (one to four)
global variables, each of which is an array. References to the old global
variables are replaced by references to slots in these arrays.
This same treatment could also be applied to other layout structures. However,
most layouts are label layouts, so doing just label layouts gets most of the
available benefit.
When the library and compiler are compiled in grade asm_fast.gc.debug,
this diff leads to about a 1.5% increase in the size of their generated C
source files (from 338 to 343 Mb), but a more significant reduction (about 17%)
in the size of the corresponding object files (from 155 to 128 Mb). This leads
to an overall reduction in disk requirements from 493 to 471 Mb (about 4.5%).
Since we generate the same code and data as before, with the data just being
arranged differently, the decrease in object file sizes is coming from the
reduction in relocation information, the information processed by the linker.
This should speed up the linker.
compiler/layout.m:
Make the change described above. We now define up to four arrays:
one each for label layouts with and without information about
variables, one for the layout structures of user events,
and one for the variable number lists of user events.
compiler/layout_out.m:
Generate the new arrays that the module being compiled needs.
Use purpose-specific types instead of booleans.
compiler/trace_gen.m:
Use a new field in foreign_proc_code instructions to record the
identity of any labels whose layout structures we want to refer to,
even though layout structures have not been generated yet. The labels
will be looked up in a map (generated together with the layout
structures) by llds_out.m.
compiler/llds.m:
Add this extra field to foreign_proc_code instructions.
Add the map (which is actually in two parts) to the c_file type,
which is the data structure representing the entire LLDS.
Also add to the c_file type some other data structures that previously
we used to hand around alongside it. Some of these data structures
used to conmingle layout structures that we now separate.
compiler/stack_layout.m:
Generate array slots instead of separate structures for label layouts.
Return the different arrays separately.
compiler/llds_out.m:
Order the output of layout structures to require fewer forward
declarations. The forward declarations of the few arrays holding the
label layout structures replace a lot of the declarations previously
needed.
Include the information needed by layout_out.m in the llds_out_info,
and conform to the changes above.
As a side-effect of all these changes, we now generate proc layout
structures in the same order as the procedures' appearence in the HLDS,
which is the same as their order in the source code, modulo any
procedures added by the compiler itself (for lambdas, unification
predicates, etc).
compiler/code_info.m:
compiler/dupelim.m:
compiler/dup_proc.m:
compiler/exprn_aux.m:
compiler/frameopt.m:
compiler/global_data.m:
compiler/ite_gen.m:
compiler/jumpopt.m:
compiler/livemap.m:
compiler/llds_to_x86_64.m:
compiler/mercury_compile_llds_back_end.m:
compiler/middle_rec.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/pragma_c_gen.m:
compiler/proc_gen.m:
compiler/reassign.m:
compiler/use_local_vars.m:
Conform to the changes above.
runtime/mercury_goto.h:
Add the macros used by the new code in layout_out.m and llds_out.m.
We need new macros because the old ones assumed that the
C preprocessor can construct the address of a label's layout structure
from the name of the label, which is obviously no longer possible.
Make even existing families of macros handle in bulk up to 10 labels,
up from the previous 8.
runtime/mercury_stack_layout.h:
Add macros for use by the new code in layout.m.
tests/debugger/*.{inp,exp}:
tests/debugger/declarative/*.{inp,exp}:
Update these test cases to account for the new (and better) order
of proc layout structures. Where inputs changed, this was to ensure
that we still select the same procedures from lists of procedures,
e.g. to put a breakpoint on.
|
||
|
|
4ebe3d0d7e |
Stop storing globals in the I/O state, and divide mercury_compile.m
Estimated hours taken: 60 Branches: main Stop storing globals in the I/O state, and divide mercury_compile.m into smaller, more cohesive modules. (This diff started out as doing only the latter, but it became clear that this was effectively impossible without the former, and the former ended up accounting for the bulk of the changes.) Taking the globals out of the I/O state required figuring out how globals data flowed between pieces of code that were often widely separated. Such flows were invisible when globals could be hidden in the I/O state, but now they are visible, because the affected code now passes around globals structures explicitly. In some cases, the old flow looked buggy, as when one job invoked by mmc --make could affect the globals value of its parent or the globals value passed to the next job. I tried to fix such problems when I saw them. I am not 100% sure I succeeded in every case (I may have replaced old bugs with new ones), but at least now the flow is out in the open, and any bugs should be much easier to track down and fix. In most cases, changes the globals after the initial setup are intended to be in effect only during the invocation of a few calls. This used to be done by remembering the initial values of the to-be-changed options, changing their values in the globals in the I/O state, making the calls, and restoring the old values of the options. We now simply create a new version of the globals structure, pass it to the calls to be affected, and then discard it. In two cases, when discovering reasons why (1) smart recompilation should not be done or (2) item version numbers should not be generated, the record of the discovery needs to survive this discarding. This is why in those cases, we record the discovery by setting a mutable attached to the I/O state. We use pure code (with I/O states) both to read and to write the mutables, so this is no worse semantically than storing the information in the globals structure inside the I/O state. (Also, we were already using such a mutable for recording whether -E could add more information.) In many modules, the globals information had to be threaded through several predicates in the module. In some places, this was made more difficult by predicates being defined by many clauses. In those cases, this diff converts those predicates to using explicit disjunctions. compiler/globals.m: Stop storing the globals structure in the I/O state, and remove the predicates that accessed it there. Move a mutable and its access predicate here from handle_options.m, since here is when the mutables treated the same way are. In a couple of cases, the value of an option is available in a mutable for speed of access from inside performance-critical code. Set the values of those mutables from the option when the processing of option values is finished, not when it is starting, since otherwise the copies of each option could end up inconsistent. Validate the reuse strategy option here, since doing it during ctgc analysis (a) is too late, and (b) would require an update to the globals to be done at an otherwise inconvenient place in the code. Put the reuse strategy into the globals structure. Two fields in the globals structure were unused. One (have_printed_usage) was made redundant when the one predicate that used it itself became unused; the other (source_file_map) was effectively replaced by a mutable some time ago. Delete these fields from the globals. Give the fields of the globals structure a distinguishing prefix. Put the type declarations, predicate declarations and predicate definitions in a consistent order. compiler/source_file_map.m: Record this module's results only in the mutable (it serves as a cache), not in globals structure. Use explicitly passed globals structure for other purposes. compiler/handle_options.m: Rename handle_options as handle_given_options, since it does not process THE options to the program, but the options it is given, and even during the processing of a single module, it can be invoked up the three times in a row, each time being given different options. (It was up to four times in a row before this diff.) Make handle_given_options explicitly return the globals structure it creates. Since it does not take an old global structure as input and globals are not stored in the I/O state, it is now clear that the globals structure it returns is affected only by the default values of the options and the options it processes. Before this diff, in the presence of errors in the options, handle_options *could* return (implicitly, in the I/O state) the globals structure that happened to be in the I/O state when it was invoked. Provide a separate predicate for generating a dummy globals based only on the default values of options. This allows by mercury_compile.m to stop abusing a more general-purpose predicate from handle_options.m, which we no longer export. Remove the mutable and access predicate moved to globals.m. compiler/options.m: Document the fact that two options, smart_recompilation and generate_item_version_numbers, should not be used without seeing whether the functionalities they call for have been disabled. compiler/mercury_compile_front_end.m: compiler/mercury_compile_middle_passes.m: compiler/mercury_compile_llds_back_end.m: compiler/mercury_compile_mlds_back_end.m: compiler/mercury_compile_erl_back_end.m: New modules carved out of the old mercury_compile.m. They each cover exactly the areas suggested by their names. Each of the modules is more cohesive than the old mercury_compile.m. Their code is also arranged in a more logical order, with predicates representing compiler passes being defined in the order of their invocation. Some of these modules export predicates for use by their siblings, showing the dependencies between the groups of passes. compiler/top_level.m: compiler/notes/compiler_design.html: Add the new modules. compiler/mark_static_terms.m: Move this module from the ml_backend package to the hlds package, since (a) it does not depend on the MLDS in any way, and (b) it is also needed by a compiler pass (loop invariants) in the middle passes. compiler/hlds.m: compiler/ml_backend.m: compiler/notes/compiler_design.html: Reflect mark_static_terms.m's change of package. compiler/passes_aux.m: Move the predicates for dumping out the hLDS here from mercury_compile.m, since the new modules also need them. Look up globals in the HLDS, not the I/O state. compiler/hlds_module.m: Store the prefix (common part) of HLDS dump file names in the HLDS itself, so that the code moved to passes_aux.m can figure out the file name for a HLDS dump without doing system calls. Give the field names of some structures prefixes to avoid ambiguity. compiler/mercury_compile.m: Remove the code moved to the other modules. This module now looks after only option handling (such as deciding whether to generate .int3 files, .int files, .opt files etc), and the compilation passes up to and including the creation of the first version of the HLDS. Everything after that is subcontracted to the new modules. Simplify and make explicit the flow of globals information. When invoking predicates that could disable smart recompilation, check whether they have done so, and if yes, update the globals accordingly. When compiling via gcc, we need to link into the executable the object files of any separate C files we generate for C code foreign_procs, which we cannot translate into gcc's internal structures without becoming a C compiler as well as a Mercury compiler. Instead of adding such files to the accumulating option for extra object files in the globals structure, we return their names using the already existing mechanism we have always used to link the object files of fact tables into the executable. Give several predicates more descriptive names. Put predicates in a more logical order. compiler/make.m: compiler/make.dependencies.m: compiler/make.module_target.m: compiler/make.module_dep_file.m: compiler/make.program_target.m: compiler/make.util.m: Require callers to supply globals structures explicitly, not via the I/O state. Afterward pass them around explicitly, passing modified versions to mercury_compile.m when invoking it with module- and/or task-specific options. Due the extensive use of partial application for higher order code in these modules, passing around the globals structures explicitly is quite tricky here. There may be cases where a predicate uses an old globals structure it got from a closure instead of the updated module- and/or task-specific globals it should be using, or vice versa. However, it is just as likely that, this diff fixes old problems by preventing the implicit flow of updated-only-for-one-invocation globals structures back to the original invoking context. Although I have tried to be careful about this, it is also possible that in some places, the code is using an updated-for-an-invocation globals structure in some but not all of the places where it SHOULD be used. compiler/c_util.m: compiler/compile_target_code.m: compiler/compiler_util.m: compiler/error_util.m: compiler/file_names.m: compiler/file_util.m: compiler/ilasm.m: compiler/ml_optimize.m: compiler/mlds_to_managed.m: compiler/module_cmds.m: compiler/modules.m: compiler/options_file.m: compiler/pd_debug.m: compiler/prog_io.m: compiler/transform_llds.m: compiler/write_deps_file.m: Require callers to supply globals structures explicitly, not via the I/O state. In some cases, the explicit globals structure argument allows a predicate to dispense with the I/O states previously passed to it. In some modules, rename some predicates, types and/or function symbols to avoid ambiguity. compiler/read_modules.m: Require callers to supply globals structures explicitly, not via the I/O state. Record when smart recompilation and the generation of item version numbers should be disabled. compiler/opt_debug.m: compiler/process_util.m: Require callers to supply the needed options explicitly, not via the globals in the I/O state. compiler/analysis.m: compiler/analysis.file.m: compiler/mmc_analysis.m: Make the analysis framework's methods take their global structures as explicit arguments, not as implicit data stored in the I/O state. Stop using `with_type` and `with_inst` declarations unnecessarily. Rename some predicates to avoid ambiguity. compiler/hlds_out.m: compiler/llds_out.m: compiler/mercury_to_mercury.m: compiler/mlds_to_c.m: compiler/mlds_to_java.m: compiler/optimize.m: Make these modules stop accessing the globals from the I/O state. Do this by requiring the callers of their top predicates to explicitly supply a globals structure. To compensate for the cost of having to pass around a representation of the options, look up the values of the options of interest just once, to make further access much faster. (In the case of mlds_to_c.m, the code already did much of this, but it still had a few accesses to globals in the I/O state that this diff eliminates.) If the module exports a predicate that needs these pre-looked-up options, then export the type of this data structure and its initialization function. compiler/frameopt.m: Since this module needs only one option from the globals, pass that option instead of the globals. compiler/accumulator.m: compiler/add_clause.m: compiler/closure_analysis.m: compiler/complexity.m: compiler/deforest.m: compiler/delay_construct.m: compiler/elds_to_erlang.m: compiler/exception_analysis.m: compiler/fact_table.m: compiler/intermod.m: compiler/mode_constraints.m: compiler/mode_errors.m: compiler/pd_util.m: compiler/post_term_analysis.m: compiler/recompilation.usage.m: compiler/size_prof.usage.m: compiler/structure_reuse.analysis.m: compiler/structure_reuse.direct.choose_reuse.m: compiler/structure_reuse.direct.m: compiler/structure_sharing.analysis.m: compiler/tabling_analysis.m: compiler/term_constr_errors.m: compiler/term_constr_fixpoint.m: compiler/term_constr_initial.m: compiler/term_constr_main.m: compiler/term_constr_util.m: compiler/trailing_analysis.m: compiler/trans_opt.m: compiler/typecheck_info.m: Look up globals information from the HLDS, not the I/O state. Conform to the changes above. compiler/gcc.m: compiler/maybe_mlds_to_gcc.pp: compiler/mlds_to_gcc.m: Look up globals information from the HLDS, not the I/O state. Conform to the changes above. Convert these modules to our current programming style. compiler/termination.m: Look up globals information from the HLDS, not the I/O state. Conform to the changes above. Report some warnings with error_specs, instead of immediately printing them out. compiler/export.m: compiler/il_peephole.m: compiler/layout_out.m: compiler/rtti_out.m: compiler/liveness.m: compiler/make_hlds.m: compiler/make_hlds_passes.m: compiler/mlds_to_il.m: compiler/mlds_to_ilasm.m: compiler/recompilation.check.m: compiler/stack_opt.m: compiler/superhomogeneous.m: compiler/tupling..m: compiler/unneeded_code.m: compiler/unused_args.m: compiler/unused_import.m: compiler/xml_documentation.m: Conform to the changes above. compiler/equiv_type_hlds.m: Give the field names of a structure prefixes to avoid ambiguity. Stop using `with_type` and `with_inst` declarations unnecessarily. compiler/loop_inv.m: compiler/pd_info.m: compiler/stack_layout.m: Give the field names of some structures prefixes to avoid ambiguity. compiler/add_pragma.m: Add notes. compiler/string.m: NEWS: Add a det version of remove_suffix, for use by new code above. |
||
|
|
00ea415659 |
Implement scalar and vector global data for the MLDS backend, modelled on
Estimated hours taken: 32
Branches: main
Implement scalar and vector global data for the MLDS backend, modelled on
the implementation of global data for the LLDS backend. Use scalar global
data to eliminate redundant copies of static memory cells. Use vector global
data to implement lookup switches, and to implement string switches more
efficiently.
The diff reduces the compiler executable's size by 3.3% by eliminating
duplicate copies of static cells. The diff can reduce the sizes of object files
not only through this reduction in the size of read-only data, but also through
reductions in the size of the needed relocation info: even in the absence of
duplicated cells, using one global variable that holds an array of all the
cells of the same type requires less relocation info than a whole bunch of
separate global variables each holding one cell. If C debugging is enabled,
we can also expect a significant reduction in the size of the debug information
stored in object files AND in executables, for the same reason. (This was the
original motivation for scalar static data on the LLDS backend; the large
amount of relocation information in object files, especially if Mercury
debugging was enabled, led to long link times.)
compiler/ml_global_data.m:
Make the changes described above.
compiler/ml_lookup_switch.m:
This new module implements lookup switches for the MLDS backend.
For now, it implements only model_det and model_semi switches.
compiler/ml_switch_gen.m:
Call the new module when appropriate.
Do not require the switch generation methods that never generate
definitions to return an empty list of definitions.
compiler/ml_backend.m:
Add the new module.
compiler/notes/compiler_design.html:
Mention the new module, and fix some documentation rot.
compiler/mlds.m:
Extend the relevant types to allow the generated MLDS code to refer
to both scalar and vector global data.
Move a predicate that belongs here from ml_code_util.m.
Rename a predicate to avoid ambiguity with its own return type.
Give the functors of some types distinguishing prefixes.
compiler/ml_util.m:
Replace some semidet predicates with functions returning bool,
since the semidet predicates silently did the wrong thing on the new
additions to the MLDS.
compiler/ml_code_gen.m:
Ensure that we do not generate references to scalar and vector common
cells on platforms which do not (yet) support them. At the moment,
they are supported only when generating C.
Put some code into a predicate of its own.
compiler/builtin_ops.m:
Extend the type that represents array elements to allow them to be
structures, which they are for vector globals.
compiler/ml_code_util.m:
Add some new utility predicates and functions.
Move some predicates that are now needed in more than one module here.
Remove the predicates moved to mlds.m.
Conform to the changes above.
compiler/ml_string_switch.m:
compiler/string_switch.m:
Instead of two separate arrays, use one array of structures (a static
vector), since they way, the string and the next slot indicator,
which are accesses together, are next to each other and thus
in the same cache block.
compiler/lookup_switch.m:
compiler/switch_util.m:
Move several predicates from lookup_switch.m to switch_util.m,
since now ml_lookup_switch.m needs them too. Parameterize the moved
predicates as needed.
Conform to the changes above.
compiler/llds.m:
Add prefixes to some functor names to avoid ambiguities.
compiler/llds_out.m:
compiler/lookup_util.m:
compiler/mercury_compile.m:
Minor style improvements.
compiler/global_data.m:
Minor cleanups. Give names to some data types, and add prefixes to some
field names.
Conform to the changes above.
compiler/jumpopt.m:
Minor style improvements.
Conform to the changes above.
compiler/opt_debug.m:
Fix some misleading variable names.
compiler/reassign.m:
Factor out some duplicated code.
compiler/ll_pseudo_type_info.m:
compiler/ml_closure_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_optimize.m:
compiler/ml_tag_switch.m:
compiler/ml_tailcall.m:
compiler/ml_unify_gen.m:
compiler/mlds_to_c.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
compiler/mlds_to_managed.m:
compiler/rtti_to_mlds.m:
compiler/stack_layout.m:
compiler/unify_gen.m:
Conform to the changes above.
tests/hard_coded/lookup_switch_simple.{m,exp}:
tests/hard_coded/lookup_switch_simple_bitvec.{m,exp}:
tests/hard_coded/lookup_switch_simple_non.{m,exp}:
tests/hard_coded/lookup_switch_simple_opt.{m,exp}:
New test cases to exercise the new functionality.
tests/hard_coded/Mmakefile:
tests/hard_coded/Mercury.options:
Enable the new tests.
|
||
|
|
79147ecae6 |
Speed up the LLDS code generator's handling of code that constructs large
Estimated hours taken: 8 Branches: main Speed up the LLDS code generator's handling of code that constructs large ground terms by specializing it. This diff reduces the compilation time for training_cars_full.m by a further 38% or so, for an overall reduction by about a factor of five since I started. The time on tools/speedtest stays pretty much the same. compiler/unify_gen.m: Add a mechanism to construct code for from_ground_term_construct scopes directly. compiler/code_gen.m: Invoke the new mechanism in unify_gen.m for from_ground_term_construct scopes. Reorganize some trace goal by duplicating some common support code inside them. The compiler wasn't optimizing it away as it should have. compiler/code_info.m: Export a predicate for use by the new code in unify_gen.m. Add some debugging predicates. compiler/opt_debug.m: Rename a predicate to better reflect its function. |
||
|
|
f60a0ab285 |
Add a mechanism that can help debug the compiler itself, specifically
Estimated hours taken: 2 Branches: main Add a mechanism that can help debug the compiler itself, specifically (but not only) the code generator. I used this mechanism a while ago to locate a hard-to-find bug, and it may be useful again in the future. compiler/hlds_desc.m: A new module whose job is to write out goals in a form that is suitable for use in progress messages from compiler passes (the predicates in hlds_out.m write out far too much for this). compiler/hlds.m: compiler/notes/compiler_design.html: Add the new module. compiler/options.m: Add a developer-only option that selects the predicate whose code generation should be traced. compiler/code_gen.m: compiler/code_info.m: compiler/ite_gen.m: Print progress messages if the compiler is compiled with the right compile-time trace flag and the new option says we should. compiler/opt_debug.m: Change the interfaces of the procedures in this module slightly to make them usable in progress messages. compiler/frameopt.m: compiler/optimize.m: Conform to the changes to opt_debug. compiler/ll_backend.m: Fix comments. |
||
|
|
7593b61b70 |
Introduce coverage profiling. While regular profiling shows which procedures
Estimated Hours Taken: 100
Branches: main
Introduce coverage profiling. While regular profiling shows which procedures
are used the most, coverage profiling goes further, and also shows the most
common execution paths through each procedure. We intend coverage profiling
data to be used by a future automatic parallelization pass in the compiler.
However, it should also be useful for other purposes.
This diff adds a compiler option, --coverage-profiling, that, when specified,
adds program instrumentation to record execution counts at selected points
in procedure bodies. The implementation currently stores coverage profiling
information in ProcStatic structures. We will later investigate the impact
of storing this information in ProcDynamic structures instead.
For now coverage statistics can be viewed with the mdprof_dump tool.
compiler/deep_profiling.m:
Introduced coverage profiling transformation after deep profiling
transformation, it will be run if at least one coverage point type is
enabled.
compiler/goal_util.m:
Created create_conj_from_list/3, to create a conjunction from a list of
goals.
compiler/hlds_goal.m:
Added dp_goal_info structure that stores information relevant to the deep
profiling pass.
Added egi_maybe_dp field to extra_goal_info to store a dp_goal_info
structure when required by the deep profiler.
compiler/hlds_pred.m:
Added a list of coverage_point_info structures to the hlds_proc_static
structure.
compiler/hlds_out.m:
Added the ability to dump the dp_goal_info structure when the correct
option is given.
compiler/layout.m:
Added extra layout structures to store coverage point static and dynamic
data for each procedure.
compiler/layout_out.m:
Added code to write out new layout structures in the Low Level C
Backend.
Added code to write out references to coverage point data from the proc
static structures.
Conform to changes in layout.m.
compiler/opt_debug.m:
Conform to changes in layout.m,
compiler/options.m:
Added command line parameters to enable coverage profiling and different
coverage points, as well as options that can cause coverage points not
to be inserted in some circumstances.
compiler/handle_options.m:
Added hlds dump options to the 'all' aliases for dumping the new
dp_goal_info structure in the hlds_info.
deep_profiler/dump.m:
Modified to dump coverage profiling data read in from Deep.data file.
deep_profiler/profile.m:
Added coverage points to proc static structure.
deep_profiler/read_profile.m:
Incremented Deep.data format version number.
Read coverage points from Deep.data file.
Confirm to changes in profile.m.
library/profiling_builtin.m:
Added real declaration and dummy implementation of
increment_coverage_point_count/2. This represents the instrumentation
introduced by the coverage profiling transformation.
mdbcomp/program_representation.m:
Added types to support coverage profiling, including a foreign_enum for
cp_type.
Added coverage_point_type_c_value to convert a coverage point type to a
string representing it's C value.
runtime/mercury_deep_profiling.c:
Incremented Deep.data format version.
Implemented writing out of coverage points to the Deep.data file.
runtime/mercury_deep_profiling.h:
Modified runtime structures to support storing coverage point
information in proc static structures.
doc/user_guide.texi:
Documented coverage profiling options, however this is commented out since
it's experimental.
|
||
|
|
097b45acec |
Fix two problems that together caused bug Mantis bug #44.
Estimated hours taken: 12 Branches: main Fix two problems that together caused bug Mantis bug #44. The first bug was that unify_gen.m wasn't checking whether a variable it was adding to a closure was of dummy type or not. The second bug was that the code for recognizing whether a type is dummy or not recognized only two cases: builtin dummy types such as io.state, and types with one function symbol of arity zero. In this program, there is a notag wrapper around a dummy type. Since the representation of a notag type is always the same as the type it wraps, this notag type should be recognized as a dummy type too. compiler/unify_gen.m: Fix the first bug by adding the required checks. compiler/code_info.m: Add a utility predicate to factor out some now common code in unify_gen.m. (The modifications to all the following files were to fix the second bug.) compiler/hlds_data.m: compiler/prog_type.m: Change the type_category type (in prog_type.m) and the enum_or_dummy type (in hlds_data.m) to separate out the representation of notag types from other du types. This allows the fix for the second bug, and incidentally allows some parts of the compiler to avoid the same tests over and over. To ensure that all places in the compiler that could need special handling for notag types get them, rename those types to type_ctor_category (since it does *not* take argument types into account) and du_type_kind respectively. Since the type_ctor_category type needs to be modified anyway, change it to allow code that manipulates values of the type to factor out common code fragments. Rename some predicates, and turn some into functions where this helps to make code (either here or in clients) more robust. compiler/make_tags.m: When creating a HLDS representation for a du type, record whether it is a notag type (we already recorded whether it is enum or dummy). compiler/type_util.m: Fix the predicate that tests for dummy types by recognizing the third way a type can be a dummy type. Don't test for dummyness of the argument when deciding whether a type could be a notag types; just record it as a notag type, and let later lookup code use the new fixed algorithm to do the right thing. Add a type for recording the is_dummy_type/is_not_dummy_type distinction. Rename some predicates, and turn some into functions where this helps to make code (either here or in clients) more robust. Add an XXX about possible redundant code. compiler/llds.m: Use the new type instead of booleans in some places. compiler/add_pragma.m: compiler/add_special_pred.m: compiler/add_type.m: compiler/bytecode_gen.m: compiler/continuation_info.m: compiler/ctgc.selector.m: compiler/ctgc.util.m: compiler/equiv_type_hlds.m: compiler/erl_call_gen.m: compiler/erl_code_gen.m: compiler/erl_code_util.m: compiler/erl_unify_gen.m: compiler/exception_analysis.m: compiler/export.m: compiler/foreign.m: compiler/higher_order.m: compiler/hlds_data.m: compiler/hlds_out.m: compiler/hlds_pred.m: compiler/inst_match.m: compiler/intermod.m: compiler/llds_out.m: compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_code_gen.m: compiler/ml_code_util.m: compiler/ml_simplify_switch.m: compiler/ml_switch_gen.m: compiler/ml_type_gen.m: compiler/ml_unify_gen.m: compiler/mlds.m: compiler/mlds_to_c.m: compiler/mlds_to_gcc.m: compiler/mlds_to_il.m: compiler/mlds_to_java.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/polymorphism.m: compiler/pragma_c_gen.m: compiler/prog_type.m: compiler/rtti_to_mlds.m: compiler/simplify.m: compiler/special_pred.m: compiler/stack_layout.m: compiler/switch_gen.m: compiler/switch_util.m: compiler/table_gen.m: compiler/term_constr_util.m: compiler/term_norm.m: compiler/trace_gen.m: compiler/trailing_analysis.m: compiler/type_ctor_info.m: compiler/type_util.m: compiler/unify_proc.m: compiler/var_locn.m: Conform to the changes above. Make a few analyses more precise by using the new detail in the type_ctor_category type to make less conservative assumptions about du types that are either notag or dummy. In ctgc.selector.m, ctgc.util.m, make_tags.m, mlds_to_java.m and special_pred.m, add XXXs about possible bugs. tests/valid/fzn_debug_abort.m: Add the bug demo program from Mantis as a regression test. tests/valid/Mmakefile: tests/valid/Mercury.options: Enable the new test, and run it with the old bug-inducing option. |
||
|
|
e0ff2b1903 |
Implement conditional structure reuse for LLDS backends using Boehm GC.
Estimated hours taken: 15
Branches: main
Implement conditional structure reuse for LLDS backends using Boehm GC.
Verify at run time, just before reusing a dead cell, that the base address of
the cell was dynamically allocated. If not, fall back to allocating a new
object on the heap. This makes structure reuse safe without having to disable
static data.
In the simple case, the generated C code looks like this:
MR_tag_reuse_or_alloc_heap(dest, tag, addr_of_reuse_cell,
MR_tag_alloc_heap(dest, tag, count));
...assign fields...
If some of the fields are known to already have the correct values then we can
avoid assigning them. We need to handle both reuse and non-reuse cases:
MR_tag_reuse_or_alloc_heap_flag(dest, flag_reg, tag, addr_of_reuse_cell,
MR_tag_alloc_heap(dest, tag, count));
/* flag_reg is non-zero iff reuse is possible */
if (flag_reg) {
goto skip;
}
...assign fields which don't need to be assigned in reuse case...
skip:
...assign fields which must be assigned in both cases...
It may be that it is not worth the branch to avoid assigning known fields.
I haven't yet checked.
compiler/llds.m:
Extend the `incr_hp' instruction to hold information for structure
reuse.
compiler/code_info.m:
Generate a label and pass it to `var_locn_assign_cell_to_var'. The
label is only needed for the type of code shown above.
compiler/var_locn.m:
Change the code generated for cell reuse. Rather than assigning the
dead cell's address to the target lval unconditionally, generate an
`incr_hp' instruction with the reuse field filled in.
Generate code that avoids filling in known fields if possible.
Abort if we see `construct_statically(_)' in
`var_locn_assign_dynamic_cell_to_var'.
runtime/mercury_heap.h:
runtime/mercury_conf_param.h:
Add a macro to check if an address is between
`GC_least_plausible_heap_addr' and `GC_greatest_plausible_heap_addr',
which are therefore in the heap.
Add macros to conditionally reuse a cell or otherwise fall back to
allocating a new object.
Make it possible to revert to unconditional structure reuse by
defining the C macro `MR_UNCONDITIONAL_STRUCTURE_REUSE'.
compiler/llds_out.m:
Call the new macros in `mercury_heap.h' for `incr_hp' instructions
with reuse information filled in.
compiler/dupelim.m:
compiler/dupproc.m:
compiler/exprn_aux.m:
compiler/global_data.m:
compiler/jumpopt.m:
compiler/livemap.m:
compiler/llds_to_x86_64.m:
compiler/middle_rec.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/reassign.m:
compiler/unify_gen.m:
compiler/use_local_vars.m:
Conform to the changed `incr_hp' instruction.
|
||
|
|
4e1561f97f |
1) Implement the saving of protected regions at disj frames ONLY for semidet
Estimated hours taken: 8 Branch: main 1) Implement the saving of protected regions at disj frames ONLY for semidet disjunctions. We need this information so as to be able to commit the regions' removals when a nonlast disjunct of a semidet disjunction succeeds. Now the fixed part of a disj frame will have 4 slots instead of 3. The extra slot is to store the number of protected regions. It will be zero for nondet disjunctions. For semidet disjunctions we can use the information in their rbmm_goal_info for protected regions and allocated-into regions. Also implement the profiling and debugging messages for the saving. 2) Fix a bug relates to the saving of region size records at ite frame. 3) Do not protect regions that are explicitly removed at the end of the condition of an if-then-else. It is because when the program reaches that point, we know for sure that the condition will succeed. 4) Improving the profiling of RBMM compiler/disj_gen.m: Implement 1). Remove the old code that do the saving for nondet disjunctions. compiler/llds.m compiler/llds_out.m compiler/opt_debug.m Make them conform to the changes for 1). compiler/options.m Make sure the size of the fixed part of disj frames is 4 (slots). Change to conform to the changes for 1). runtime/mercury_region.h runtime/mercury_region.c Implement the saving and its profiling and debugging messages. Implement the improvement of the profiling of RBMM: more information, more structured when displaying the profiling information, correct some bugs. runtime/mercury_types.h: To conform to the changes at 1). |
||
|
|
cc88711d63 |
Implement true multi-cons_id arm switches, i.e. switches in which we associate
Estimated hours taken: 40
Branches: main
Implement true multi-cons_id arm switches, i.e. switches in which we associate
more than one cons_id with a switch arm. Previously, for switches like this:
(
X = a,
goal1
;
( X = b
; X = c
),
goal2
)
we duplicated goal2. With this diff, goal2 won't be duplicated. We still
duplicate goals when that is necessary, i.e. in cases which the inner
disjunction contains code other than a functor test on the switched-on var,
like this:
(
X = a,
goal1
;
(
X = b,
goalb
;
X = c
goalc
),
goal2
)
For now, true multi-cons_id arm switches are supported only by the LLDS
backend. Supporting them on the MLDS backend is trickier, because some MLDS
target languages (e.g. Java) don't support the concept at all. So when
compiling to MLDS, we still duplicate the goal in switch detection (although
we could delay the duplication to just before code generation, if we wanted.)
compiler/options.m:
Add an internal option that tells switch detection whether to look for
multi-cons_id switch arms.
compiler/handle_options.m:
Set this option based on the back end.
Add a version of the "trans" dump level that doesn't print unification
details.
compiler/hlds_goal.m:
Extend the representation of switch cases to allow more than one
cons_id for a switch arm.
Add a type for representing switches that also includes tag information
(for use by the backends).
compiler/hlds_data.m:
For du types, record whether it is possible to speed up tests for one
cons_id (e.g. cons) by testing for the other (nil) and negating the
result. Recording this information once is faster than having
unify_gen.m trying to compute it from scratch for every single
tag test.
Add a type for representing a cons_id together with its tag.
compiler/hlds_out.m:
Print out the cheaper_tag_test information for types, and possibly
several cons_ids for each switch arm.
Add some utility predicates for describing switch arms in terms of
which cons_ids they are for.
Replace some booleans with purpose-specific types.
Make hlds_out honor is documentation, and not print out detailed
information about unifications (e.g. uniqueness and static allocation)
unless the right character ('u') is present in the control string.
compiler/add_type.m:
Fill in the information about cheaper tag tests when adding a du type.
compiler/switch_detection.m:
Extend the switch detection algorithm to detect multi-cons_id switch
arms.
When entering a switch arm, update the instmap to reflect that the
switched-on variable can now be bound only to the cons_ids that this
switch arm is for. We now need to do this, because if the arm contains
another switch on the same variable, computing the can_fail field of
that switch correctly requires us to know this information.
(Obviously, an arm for a single cons_id is unlikely to have switch on
the same variable, and for arms for several cons_ids, we previously
duplicated the arm and left the unification with the cons_id in each
copy, and this unification allowed the correct handling of any later
switches. However, the code of a multi-cons_id switch arm obviously
cannot have a unification with each cons_id in it, which is why
we now need to get the binding information from the switch itself.)
Replace some booleans with purpose-specific types, and give some
predicates better names.
compiler/instmap.m:
Provide predicates for recording that a switched-on variable has
one of several given cons_ids, for use at the starts of switch arms.
Give some predicates better names.
compiler/modes.m:
Provide predicates for updating the mode_info at the start of a
multi-cons_id switch arm.
compiler/det_report.m:
Handle multi-cons_id switch arms.
Update the instmap when entering each switch arm, since this is needed
to provide good (i.e. non-misleading) error messages when one switch on
a variable exists inside another switch on the same variable.
Since updating the instmap requires updating the module_info (since
the new inst may require a new entry in an inst table), thread the
det_info through as updateable state.
Replace some multi-clause predicate definitions with single clauses,
to make it easier to print the arguments in mdb.
Fix some misleading variable names.
compiler/det_analysis.m:
Update the instmap when entering each switch arm and thread the
det_info through as updateable state, since the predicates we call
in det_report.m require this.
compiler/det_util.m:
Handle multi-cons_id switch arms.
Rationalize the argument order of some access predicates.
compiler/switch_util.m:
Change the parts of this module that deal with string and tag switches
to optionally convert each arm to an arbitrary representation of the
arm. In the LLDS backend, the conversion process generated code for
the arm, and the arm's representation is the label at the start of
this code. This way, we can duplicate the label without duplicating
the code.
Add a new part of this module that associates each cons_id with its
tag, and (during the same pass) checks whether all the cons_ids are
integers, and if so what are min and max of these integers (needed
for dense switches). This scan is needed because the old way of making
this test had single-cons_id switch arms as one of its basic
assumptions, and doing it while adding tags to each case reduces
the number of traversals required.
Give better names to some predicates.
compiler/switch_case.m:
New module to handle the tasks associated with managing multi-cons_id
switch arms, including representing them for switch_util.m.
compiler/ll_backend.m:
Include the new module.
compiler/notes/compiler_design.html:
Note the new module.
compiler/llds.m:
Change the computed goto instruction to take a list of maybe labels
instead of a list of labels, with any missing labels meaning "not
reached".
compiler/string_switch.m:
compiler/tag_switch.m:
Reorganize the way these modules work. We can't generate the code of
each arm in place anymore, since it is now possible for more than one
cons_id to call for the execution of the same code. Instead, in
string_switch.m, we generate the codes of all the arms all at once,
and construct the hash index afterwards. (This approach simplifies
the code significantly.)
In tag switches (unlike string switches), we can get locality benefits
if the code testing for a cons_id is close to the code for that
cons_id, so we still try to put them next to each other when such
a locality benefit is available.
In both modules, the new approach uses a utility predicate in
switch_case.m to actually generate the code of each switch arm,
eliminating several copies the same code in the old versions of these
modules.
In tag_switch.m, don't create a local label that simply jumps to the
code address do_not_reached. Previously, we had to do this for
positions in jump tables that corresponded to cons_ids that the switch
variable could not be bound to. With the change to llds.m, we now
simply generate a "no" instead.
compiler/lookup_switch.m:
Get the info about int switch limits from our caller; don't compute it
here.
Give some variables better names.
compiler/dense_switch.m:
Generate the codes of the cases all at once, then assemble the table,
duplicate the labels as needed. This separation of concerns allows
significant simplifications.
Pack up all the information shared between the predicate that detects
whether a dense switch is appropriate and the predicate that actually
generates the dense switch.
Move some utility predicates to switch_util.
compiler/switch_gen.m:
Delete the code for tagging cons_ids, since that functionality is now
in switch_util.m.
The old version of this module could call the code generator to produce
(i.e. materialize) the switched-on variable repeatedly. We now produce
the variable once, and do the switch on the resulting rval.
compiler/unify_gen.m:
Use the information about cheaper tag tests in the type constructor's
entry in the HLDS type table, instead of trying to recompute it
every time.
Provide the predicates switch_gen.m now needs to perform tag tests
on rvals, as opposed to variables, and against possible more than one
cons_id.
Allow the caller to provide the tag corresponding to the cons_id(s)
in tag tests, since when we are generating code for switches, the
required computations have already been done.
Factor out some code to make all this possible.
Give better names to some predicates.
compiler/code_info.m:
Provide some utility predicates for the new code in other modules.
Give better names to some existing predicates.
compiler/hlds_code_util.m:
Rationalize the argument order of some predicates.
Replace some multi-clause predicate definitions with single clauses,
to make it easier to print the arguments in mdb.
compiler/accumulator.m:
compiler/add_heap_ops.m:
compiler/add_pragma.m:
compiler/add_trail_ops.m:
compiler/assertion.m:
compiler/build_mode_constraints.m:
compiler/check_typeclass.m:
compiler/closure_analysis.m:
compiler/code_util.m:
compiler/constraint.m:
compiler/cse_detection.m:
compiler/dead_proc_elim.m:
compiler/deep_profiling.m:
compiler/deforest.m:
compiler/delay_construct.m:
compiler/delay_partial_inst.m:
compiler/dep_par_conj.m:
compiler/distance_granularity.m:
compiler/dupproc.m:
compiler/equiv_type_hlds.m:
compiler/erl_code_gen.m:
compiler/exception_analysis.m:
compiler/export.m:
compiler/follow_code.m:
compiler/follow_vars.m:
compiler/foreign.m:
compiler/format_call.m:
compiler/frameopt.m:
compiler/goal_form.m:
compiler/goal_path.m:
compiler/goal_util.m:
compiler/granularity.m:
compiler/hhf.m:
compiler/higher_order.m:
compiler/implicit_parallelism.m:
compiler/inlining.m:
compiler/inst_check.m:
compiler/intermod.m:
compiler/interval.m:
compiler/lambda.m:
compiler/lambda.m:
compiler/lambda.m:
compiler/lco.m:
compiler/live_vars.m:
compiler/livemap.m:
compiler/liveness.m:
compiler/llds_out.m:
compiler/llds_to_x86_64.m:
compiler/loop_inv.m:
compiler/make_hlds_warn.m:
compiler/mark_static_terms.m:
compiler/middle_rec.m:
compiler/ml_tag_switch.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/mode_constraints.m:
compiler/mode_errors.m:
compiler/mode_util.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/pd_cost.m:
compiler/pd_into.m:
compiler/pd_util.m:
compiler/peephole.m:
compiler/polymorphism.m:
compiler/post_term_analysis.m:
compiler/post_typecheck.m:
compiler/purity.m:
compiler/quantification.m:
compiler/rbmm.actual_region_arguments.m:
compiler/rbmm.add_rbmm_goal_infos.m:
compiler/rbmm.condition_renaming.m:
compiler/rbmm.execution_paths.m:
compiler/rbmm.points_to_analysis.m:
compiler/rbmm.region_transformation.m:
compiler/recompilation.usage.m:
compiler/saved_vars.m:
compiler/simplify.m:
compiler/size_prof.m:
compiler/ssdebug.m:
compiler/store_alloc.m:
compiler/stratify.m:
compiler/structure_reuse.direct.choose_reuse.m:
compiler/structure_reuse.indirect.m:
compiler/structure_reuse.lbu.m:
compiler/structure_reuse.lfu.m:
compiler/structure_reuse.versions.m:
compiler/structure_sharing.analysis.m:
compiler/table_gen.m:
compiler/tabling_analysis.m:
compiler/term_constr_build.m:
compiler/term_norm.m:
compiler/term_pass1.m:
compiler/term_traversal.m:
compiler/trailing_analysis.m:
compiler/transform_llds.m:
compiler/tupling.m:
compiler/type_ctor_info.m:
compiler/type_util.m:
compiler/unify_proc.m:
compiler/unique_modes.m:
compiler/unneeded_code.m:
compiler/untupling.m:
compiler/unused_args.m:
compiler/unused_imports.m:
compiler/xml_documentation.m:
Make the changes necessary to conform to the changes above, principally
to handle multi-cons_id arm switches.
compiler/ml_string_switch.m:
Make the changes necessary to conform to the changes above, principally
to handle multi-cons_id arm switches.
Give some predicates better names.
compiler/dependency_graph.m:
Make the changes necessary to conform to the changes above, principally
to handle multi-cons_id arm switches. Change the order of arguments
of some predicates to make this easier.
compiler/bytecode.m:
compiler/bytecode_data.m:
compiler/bytecode_gen.m:
Make the changes necessary to conform to the changes above, principally
to handle multi-cons_id arm switches. (The bytecode interpreter
has not been updated.)
compiler/prog_rep.m:
mdbcomp/program_representation.m:
Change the byte sequence representation of goals to allow switch arms
with more than one cons_id. compiler/prog_rep.m now writes out the
updated representation, while mdbcomp/program_representation.m reads in
the updated representation.
deep_profiler/mdbprof_procrep.m:
Conform to the updated program representation.
tools/binary:
Fix a bug: if the -D option was given, the stage 2 directory wasn't
being initialized.
Abort if users try to give that option more than once.
compiler/Mercury.options:
Work around bug #32 in Mantis.
|
||
|
|
1156b8b0de |
Fix a problem in region handling spotted by Quan Phan: when a model_semi
Estimated hours taken: 4
Branches: main
Fix a problem in region handling spotted by Quan Phan: when a model_semi
disjunction commits to a non-last disjunct, we need to clean up the disj frame
it allocated.
This diff should have no effect in non-rbmm grades.
compiler/disj_gen.m:
In model_semi disjunctions, invoke a cleanup instruction on exit from
non-last disjuncts.
compiler/llds.m:
Add a representation for this cleanup instruction as a new region op.
compiler/llds_out.m:
compiler/opt_debug.m:
Handle the new region op.
runtime/mercury_region.h:
Add a draft implementation of this cleanup instruction; the remainder
is up to Quan.
tests/hard_coded/semi_disj.{m,exp}:
New test case to test the fix (for now, only by examining the generated
C code).
tests/hard_coded/Mmakefile:
Enable the new test case.
|
||
|
|
a99aaec791 |
Dump instructions in blocks the same way we dump instructions outside
Estimated hours taken: 0.2 Branches: main compiler/opt_debug.m: Dump instructions in blocks the same way we dump instructions outside blocks, to make it easier to understand the diff generated by the optimization phase that wraps instructions in blocks. |
||
|
|
22cead47a7 |
Fix bug #28 in Mantis. The only substantive change is to code_info.m; the
Estimated hours taken: 4 Branches: main Fix bug #28 in Mantis. The only substantive change is to code_info.m; the changes to the other compiler modules are cosmetic only. compiler/code_info.m: Fix bug #28 in Mantis. The problem was with the code that generated the annotation giving the set of live lvalues at calls: it didn't delete from the set the registers used for passing dummy arguments, such as I/O states. A recursive call for an I/O predicate would thus compute the correct set of live lvalues at the start of the predicate body (in the case of the test case, {r1}), but the wrong set at the recursive call ((in the case of the test case, {r1,r2}, with r2 being the register assigned to hold the I/O state argument). The bug was an abort caused by a sanity check looking for this kind of mismatch. compiler/c_util.m: Make two predicates into functions to make them easier to use. compiler/opt_debug.m: Use those functions. compiler/ml_code_gen.m: compiler/pragma_c_gen.m: Conform to the change to c_util. compiler/jumpopt.m: Delete unnecessary module qualifications. tests/valid/testxmlreader.m: tests/valid/xmlreader.m: A regression test for this bug. It is in valid rather than hard_coded because it cannot be made executable without libraries that not all machines have, and which it would be inappropriate to add to the test suite itself. tests/valid/Mmakefile: Enable the new test case. |
||
|
|
fa80b9a01a |
Make the parallel conjunction execution mechanism more efficient.
Branches: main Make the parallel conjunction execution mechanism more efficient. 1. Don't allocate sync terms on the heap. Sync terms are now allocated in the stack frame of the procedure call which originates a parallel conjunction. 2. Don't allocate individual sparks on the heap. Sparks are now stored in preallocated, growing arrays using an algorithm that doesn't use locks. 3. Don't have one mutex per sync term. Just use one mutex to protect concurrent accesses to all sync terms (it's is rarely needed anyway). This makes sync terms smaller and saves initialising a mutex for each parallel conjunction encountered. 4. We don't bother to acquire the global sync term lock if we know a parallel conjunction couldn't be executing in parallel. In a highly parallel program, the majority of parallel conjunctions will be executed sequentially so protecting the sync terms from concurrent accesses is unnecessary. par_fib(39) is ~8.4 times faster (user time) on my laptop (Linux 2.6, x86_64), which is ~3.5 as slow as sequential execution. configure.in: Update the configuration for a changed MR_SyncTerm structure. compiler/llds.m: Make the fork instruction take a second argument, which is the base stack slot of the sync term. Rename it to fork_new_child to match the macro name in the runtime. compiler/par_conj_gen.m: Change the generated code for parallel conjunctions to allocate sync terms on the stack and to pass the sync term to fork_new_child. compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_out.m: compiler/llds_to_x86_64.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/reassign.m: compiler/use_local_vars.m: Conform to the change in the fork instruction. compiler/liveness.m: compiler/proc_gen.m: Disable use of the parallel conjunction operator in the compiler as older versions of the compiler will generate code incompatible with the new runtime. runtime/mercury_context.c: runtime/mercury_context.h: Remove the next pointer field from MR_Spark as it's no longer needed. Remove the mutex from MR_SyncTerm. Add a field to record if a spark belonging to the sync term was scheduled globally, i.e. if the parallel conjunction might be executed in parallel. Define MR_SparkDeque and MR_SparkArray. Use MR_SparkDeques to hold per-context sparks and global sparks. Change the abstract machine instructions MR_init_sync_term, MR_fork_new_child, MR_join_and_continue as per the main change log. Use a preprocessor macro MR_LL_PARALLEL_CONJ as a shorthand for !MR_HIGHLEVEL_CODE && MR_THREAD_SAFE. Take the opportunity to clean things up a bit. runtime/mercury_wsdeque.c: runtime/mercury_wsdeque.h: New files containing an implementation of work-stealing deques. We don't do work stealing yet but we use the underlying data structure. runtime/mercury_atomic.c: runtime/mercury_atomic.h: New files to contain atomic operations. Currently it just contains compare-and-swap for gcc/x86_64, gcc/x86 and gcc-4.1. runtime/Mmakefile: Add the new files. runtime/mercury_engine.h: runtime/mercury_mm_own_stacks.c: runtime/mercury_wrapper.c: Conform to runtime changes. runtime/mercury_conf_param.h: Update an outdated comment. |
||
|
|
2dc982cfe4 |
Make a representation of the program available to the deep profiler.
Estimated hours taken: 50 Branches: main Make a representation of the program available to the deep profiler. We do this by letting the user request, via the option "--deep-procrep-file" in MERCURY_OPTIONS, that when the Deep.data file is written, a Deep.procrep file should be written alongside it. The intended use of this information is the discovery of profitable parallelism. When a conjunction contains two expensive calls, e.g. p(...) and q(...) connected by some shared variables, the potential gain from executing them in parallel is limited by how early p produces those variables and how late q consumes them, and knowing this requires access to the code of p and q. Since the debugger and the deep profiler both need access to program representations, put the relevant data structures and the operations on them in mdbcomp. The data structures are significantly expanded, since the deep profiler deals with the whole program, while the debugger was interested only in one procedure at a time. The layout structures have to change as well. In a previous change, I changed proc layout structures to make room for the procedure representation even in non-debugging grades, but this isn't enough, since the procedure representation refers to the module's string table. This diff therefore makes some parts of the module layout structure, including of course the string table, also available in non-debugging grades. configure.in: Check whether the installed compiler can process switches on foreign enums correctly, since this diff depends on that. runtime/mercury_stack_layout.[ch]: runtime/mercury_types.h: Add a new structure, MR_ModuleCommonLayout, that holds the part of the module layout that is common to deep profiling and debugging. runtime/mercury_deep_profiling.[ch]: The old "deep profiling token" enum type was error prone, since at each point in the data file, only a subset was applicable. This diff breaks up the this enum into several enums, each consisting of the choice applicable at a given point. This also allows some of the resulting enums to be used in procrep files. Rename some enums and functions to avoid ambiguities, and in one case to conform to our naming scheme. Make write_out_proc_statics take a second argument. This is a FILE * that (if not NULL) asks write_out_proc_statics to write the representation of the current module to specified stream. These module representations go into the middle part of the program representation file. Add functions to write out the prologue and epilogue of this file. Write out procedure representations if this is requested. Factor out some code that is now used in more than one place. runtime/mercury_deep_profiling_hand.h: Conform to the changes to mercury_deep_profiling.h. runtime/mercury_builtin_types.c: Pass the extra argument in the argument lists of invocations of write_out_proc_statics. runtime/mercury_trace_base.[ch]: Conform to the name change from proc_rep to proc_defn_rep in mdbcomp. runtime/mercury_grade.h: Due to the change to layout structures, increment the binary compatibility version numbers for both debug and deep profiling grades. runtime/mercury_wrapper.[ch]: Provide two new MERCURY_OPTION options. The first --deep-procrep-file, allows the user to ask for the program representation to be generated. The second, --deep-random-write, allows tools/bootcheck to request that only a fraction of all program invocations should generate any deep profiling output. The first option will be documented once it is tested much more fully. The second option is deliberately not documented. Update the type of the variable that holds the address of the (mkinit-generated) write_out_proc_statics function to accept the second argument. util/mkinit.c: Pass the extra argument in the argument list of write_out_proc_statics. mdbcomp/program_representation.m: Extend the existing data structures for representing a procedure body to represent a procedure (complete with name), a module and a program. The name is implemented as string_proc_label, a form of proc_label that can be written out to files. This replaces the old proc_id type the deep profiler. Extend the representation of switches to record the identity of the variable being switched on, and the cons_ids of the arms. Without the former, we cannot be sure when a variable is first used, and the latter is needed for meaningful prettyprinting of procedure bodies. Add code for reading in files of bytecodes, and for making sense of the bytecodes themselves. (It is this code that uses foreign enums.) mdbcomp/prim_data.m: Note the relationship of proc_label with string_proc_label. mdbcomp/rtti_access.m: Add the access operations needed to find module string tables with the new organization of layout structures. Provide operations on bytecodes and string tables generally. trace/mercury_trace_cmd_browsing.c: Conform to the change to mdbcomp/program_representation.m. compiler/layout.m: Add support for a MR_ModuleCommonLayout. Rename some function symbols to avoid ambiguities. compiler/layout_out.m: Handle the new structure. compiler/stack_layout.m: Generate the new structure and the procedure representation bytecode in deep profiling grades. compiler/llds_out.m: Generate the code required to write out the prologue and epilogue of program representation files. Pass the extra argument in the argument lists of invocations of write_out_proc_statics that tells those invocations to write out the module representations between the prologue and the epilogue. compiler/prog_rep.m: When generating bytecodes, include the new information for switches. compiler/continuation_info.m: Replace a bool with a more expressive type. compiler/proc_rep.m: Conform to the change to continuation_info.m. compiler/opt_debug.m: Conform to the change to layout.m. deep_profiler/mdprof_procrep.m: A new test program to test the reading of program representations. deep_profiler/DEEP_FLAGS.in: deep_profiler/Mmakefile: Copy the contents of the mdbcomp module to this directory on demand, instead of linking to it. This is necessary now that the deep profiler depends directly on mdbcomp even if it is compiled in a non-debugging grade. The arrangements for doing this were copied from the slice directory, which has long done the same. Avoid a duplicate include of Mmake.deep.params. Add the new test program to the list of programs in this directory. Mmakefile: Go through deep_profiler/Mmakefile when deciding whether to do "mmake depend" in the deep_profiler directory. The old actions won't work correctly now that we need to copy some files from mdbcomp before we can run "mmake depend". deep_profiler/profile.m: Remove the code that was moved (in cleaned-up form) to mdbcomp. deep_profiler/dump.m: deep_profiler/profile.m: Conform to the changes above. browser/declarative_execution.m: browser/declarative_tree.m: Conform to the changes in mdbcomp. doc/user_guide.texi: Add commented out documentation of the two new options. slice/Mmakefile: Fix formatting, and a bug. library/exception.m: library/par_builtin.m: library/thread.m: library/thread.semaphore.m: Update all the handwritten modules to pass the extra argument now required by write_out_proc_statics. tests/debugger/declarative/dependency.exp: Conform to the change from proc_rep to proc_defn_rep. tools/bootcheck: Write out deep profiling data only from every 25th invocation, since otherwise the time for a bootcheck takes six times as long in deep profiling grades than in asm_fast.gc. However, do test the ability to write out program representations. Use the mkinit from the workspace, not the installed one. Don't disable line wrapping. |
||
|
|
1fac629e6d |
Add support for foreign enumerations to Mercury.
Estimated hours taken: 50
Branches: main
Add support for foreign enumerations to Mercury. These allow the
programmer to assign foreign language values as the representation of
enumeration constructors.
e.g.
:- type status
---> optimal
; infeasible
; unbounded
; unknown.
:- pragma foreign_enum("C", status/0, [
optimal - "STATUS_OPTIMAL",
infeasible - "STATUS_INFEASIBLE",
unbounded - "STATUS_UNBOUNDED",
unknown - "STATUS_UNKNOWN"
]).
The advantage of this is that when values of type status/0 are passed to
foreign code (C in this case) no translation is necessary. This should
simplify the task of writing bindings to foreign language libraries.
Unification and comparison for foreign enumerations are the usual
unification and comparison for enumeration types, except that the default
ordering on them is determined by the foreign representation of the
constructors. User-defined equality and comparison also work for foreign
enumeration types.
In order to implement foreign enumerations we have to introduce two
new type_ctor representations. The existing ones for enum type do not
work since they use the value of an enumeration constructor to perform
table lookups in the RTTI data structures. For foreign enumerations
we need to perform a linear search at the corresponding points. This
means that some RTTI operations related to deconstruction are more
expensive.
The dummy type optimisation is not applied to foreign enumerations as
the code generators currently initialise the arguments of non-builtin
dummy type foreign_proc arguments to zero. For unit foreign enumerations
they should be initialised to the correct foreign value. (This is could be
implemented but in practice it's probably not going to be worth it.)
Currently, foreign enumerations are only supported by the C backends.
compiler/prog_io_pragma.m:
Parse foreign_enum pragmas.
Generalise the code used to parse association lists of sym_names
and strings since this is now used by the code to parse foreign_enum
pragmas as well as that for foreign_export_enum pragmas.
Fix a typo: s/foreign_expor_enum/foreign_export_enum/
compiler/prog_item.m:
Represent foreign_enum pragmas in the parse tree.
compiler/prog_type.m:
Add a new type category for foreign enumerations.
compiler/modules.m:
Add any foreign_enum pragmas for enumeration types defined in the
interface of a module to the interface files.
Output foreign_import_module pragmas in the interface file
if any foreign_enum pragmas are included in it. This ensures that
the contents that any foreign declarations that are needed by the
foreign_enum pragmas are visible.
compiler/make_hlds_passes.m:
compiler/add_pragma.m:
Add pragma foreign_enum items to the HLDS after all the types
have been added. As they are added, error check them.
Change the constructor tag values of foreign enum types to their
foreign values.
compiler/module_qual.m:
Module qualify pragma foreign_enum items.
compiler/mercury_to_mercury.m:
Output foreign_enum pragmas.
Generalise some of the existing code for writing out association
lists in foreign_export_enum pragmas for use with foreign_enum
pragmas as well.
compiler/hlds_data.m:
Add the alternative `is_foreign_type' to the type enum_or_dummy/0.
Add new type of cons_tag, foreign_tag, whose values are directly
embedded in the target language.
compiler/intermod.m:
Write out any foreign_enum pragmas for opt_exported types.
(The XXX concerning attaching language information to foreign tags
will be addressed in a subsequent change.)
compiler/llds.m:
compiler/mlds.m:
Support new kinds of rval constants: llconst_foreign and
mlconst_foreign respectively. Both of these represent tag values
as strings that are intended to be directly embedded in the target
language.
compiler/llds_out.m:
Add code to write out the new kind of rval_const.
s/Integer/MR_Integer/ in a spot.
s/Float/MR_Float/ in a spot.
compiler/rtti.m:
compiler/rtti_out.m:
compiler/rtti_to_mlds.m:
compiler/type_ctor_info.m:
Add support the RTTI required by foreign enums.
compiler/switch_util.m:
Handle switches on foreign_enums as-per normal enumerations.
compiler/table_gen.m:
Tabling of foreign_enums is also like normal enumerations.
compiler/type_util.m:
Add a predicate that tests whether a type is a foreign enumeration.
compiler/unify_gen.m:
compiler/unify_proc.m:
compiler/ml_unify_gen.m:
Handle unification and comparison of foreign enumeration values.
They are treated like normal enumerations for the purposes of
implementing these operations.
compiler/ml_type_gen.m:
Handle foreign enumerations when generating the MLDS representation
of enumerations.
compiler/ml_util.m:
Add a function to create an initializer for an object with a
foreign tag.
compiler/mlds_to_c.m:
Handle mlconst_foreign/1 rval constants.
compiler/bytecode_gen.m:
compiler/dupproc.m:
compiler/erl_rtti.m:
compiler/exception_analysis.m:
compiler/export.m:
compiler/exprn_aux.m:
compiler/global_data.m:
compiler/hlds_out.m:
compiler/higher_order.m:
compiler/inst_match.m:
compiler/jumpopt.m:
compiler/llds_to_x86_64.m:
compiler/ml_code_util.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
compiler/mlds_to_managed.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/polymorphism.m:
compiler/recompilation.version.m:
compiler/term_norm.m:
compiler/trailing_analysis.m:
Conform to the above changes.
doc/reference_manual.texi:
Document the new pragma.
Fix some typos: s/pramga/pragma/, s/behavior/behaviour/
library/construct.m:
Handle the two new type_ctor reps.
Break an over-long line.
library/rtti_implementation.m:
Support the two new type_ctor reps.
(XXX The Java versions of some of this cannot be implemented until
support for foreign enumerations is added to mlds_to_java.m.)
Reformat the inst usereq/0 and extend it to include foreign enums.
runtime/mercury_type_info.h:
Add two new type_ctor reps. One for foreign enumerations and
another for foreign enumerations with user equality.
Define new types (and extend existing ones) in order to support
RTTI for foreign enumerations.
runtime/mercury_unify_compare_body.h:
Implement generic unify and compare for foreign enumerations.
(It is the same as that for regular enumerations.)
runtime/mercury_construct.[ch]:
runtime/mercury_deconstruct.h:
Handle (de)construction of foreign enumeration values.
runtime/mercury_deep_copy_body.h:
Implement deep copy for foreign enumerations.
runtime/mercury_table_type_body.h:
runtime/mercury_term_size.c:
Handle the new type_ctor representations.
java/runtime/ForeignEnumFunctorDesc.java:
Add a Java version of the MR_ForeignEnumFuntorDesc structure.
(Note: this is untested, as the java grade runtime doesn't work
anyway.)
java/runtime/TypeFunctors.java:
Add a constructor method for foreign enumerations.
(Likewise, untested.)
NEWS:
Announce pragma foreign_enum.
vim/syntax/mercury.vim:
Highlight the new pragma appropriately.
tests/hard_coded/.cvsignore:
Ignore executables generated by the new tests.
Ignore a bunch of other files create by the Mercury compiler.
tests/hard_coded/Mmakefile:
tests/hard_coded/foreign_enum_rtti.{m,exp}:
Test RTTI for foreign enumerations.
tests/hard_coded/foreign_enum_dummy.{m,exp}:
Check that dummy type optimisation is disabled for foreign
enumerations.
tests/hard_coded/Mercury.options:
tests/hard_coded/foreign_enum_mod1.{m,exp}:
tests/hard_coded/foreign_enum_mod2.m:
Test that foreign_enum pragmas are hoisted into interface files
and that they are handled correctly in optimization interfaces.
tests/invalid/Mercury.options:
tests/invalid/Mmakefile:
tests/invalid/foreign_enum_import.{m,err_exp}:
tests/invalid/foreign_enum_invalid.{m,err_exp}:
Test that errors in foreign_enum pragmas are reported.
tests/tabling/Mmakefile:
tests/hard_coded/table_foreign_enum.{m,exp}:
Test case for tabling of foreign enumerations.
|
||
|
|
b48eaf8073 |
Add a first draft of the code generator support for region based memory
Estimated hours taken: 30 Branches: main Add a first draft of the code generator support for region based memory management. It is known to be incomplete; the missing parts are marked by XXXs. It may also be buggy; it will be tested after Quan adds the runtime support, i.e. the C macros invoked by the new LLDS instructions. However, the changes in this diff shouldn't affect non-RBMM operations. compiler/llds.m: Add five new LLDS instructions. Four are specific to RBMM operations. RBMM embeds three new stacks in compiler-reserved temp slots in procedure's usual Mercury stack frames, and the new LLDS instructions respectively (i) push those stack frames onto their respective stacks, (ii) fill some variable parts of those stack frames, (iii) fill fixed slots of those stack frames, and (iv) use the contents of and/or pop those stack frames. (The pushing and popping affect only the new embedded stacks, not the usual Mercury stacks.) The last instruction is a new variant of the old assign instruction. It has identical semantics, but restricts optimization. An assign (a) can be deleted if its target lval is not used, and (b) its target lval can be changed (e.g. to a temp register) as long as all the later instructions referring to that lval are changed to use the new lval instead. Neither is permitted for the new keep_assign instruction. This is required because in an earlier draft we used it to assign to stack variables (parts of the embedded stack frames) that aren't explicitly referred to in later LLDS code, but are nevertheless implicitly referred to by some instructions (specifically iv above). We now use a specialized instruction (iii above) for this (since the macro it invokes can refer to C structure names, this makes it easier to keep the compiler in sync with the runtime system), but given that keep_assign is already implemented, may be useful later and shouldn't cause appreciable slowdown of the compiler, this diff keeps it. Extend the type that describe the contents of lvals to allow it to describe the new kinds of things we can now store in them. Add types to manage and describe the new embedded stack frames, and some utility functions. Change some existing utility functions to make all this more conceptually consistent. compiler/ite_gen.m: Surround the code we generate for the condition of if-then-elses with the code required to ensure that regions that are logically removed in the condition aren't physically destroyed until we know that the condition succeeds (since the region may still be needed in the else branch), and to make sure that if the condition fails, all the memory allocated since the entry into the condition is reclaimed instantly. compiler/disj_gen.m: Surround the code we generate for disjunctions with the code required to ensure that regions that are logically removed in a disjunct aren't physically destroyed if a later disjunct needs them, and to make sure that at entry into a non-first disjunct, all the memory allocated since the entry into the disjunction is reclaimed instantly. compiler/commit_gen.m: compiler/code_info.m: The protection against destruction offered by a disjunction disappears when a commit cuts away all later alternatives in that disjunct, so we must undo that protection. We therefore surround the scope of a commit goal with goal that achieves that objective. Add some new utility predicates to code_info. Remove some old utility functions that are now in llds.m. compiler/continuation_info.m: Extend the type that describe the contents of stack slots to allow it to describe the new kinds of things we can now store in them. Rename the function symbols of that type to eliminate some ambiguities. compiler/code_gen.m: Remember the set of variables live at the start of the goal (before the pre_goal_update updates it), since the region operations need to know this. Leave the lookup of AddTrailOps (and now AddRegionOps) to the specific kinds of goals that need it (the most frequent goals, unify and call, do not). Make both AddTrailOps and AddRegionOps use a self-explanatory type instead of a boolean. compiler/lookup_switch.m: Conform to the change to AddTrailOps. Fix some misleading variable names. compiler/options.m: Add some options to control the number of stack slots needed for various purposes. These have to correspond to the sizes of some C structures in the runtime system. Eventually these will be constants, but it is handy to keep them easily changeable while the C data structures are still being worked on. Add an option for optimizing away region ops whereever possible. The intention is that these should be on all the time, but we will want to turn them off for benchmarking. compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/frameopt.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_out.m: compiler/llds_to_x86_64.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/par_conj_gen.m: compiler/reassign.m: compiler/stack_layout.m: compiler/stdlabel.m: compiler/trace_gen.m: compiler/use_local_vars.m: Conform to the changes above, which mostly means handling the new LLDS instructions. In some cases, factor out existing common code, turn if-then-elses into switches, group common cases in switches, rationalize argument orders or variable names, and/or put code in execution order. In reassign.m, fix some old oversights that could (in some unlikely cases) cause bugs in the generated code. compiler/pragma_c_gen.m: Exploit the capabilities of code_info.m. compiler/prog_type.m: Add a utility predicate. |
||
|
|
1adaecaf35 |
Fix a problem I noticed while fixing bootstrap at -O6: determinism warnings for
Estimated hours taken: 4
Branches: main
Fix a problem I noticed while fixing bootstrap at -O6: determinism warnings for
private_builtin.m.
compiler/add_pred.m:
Mark builtin predicates (e.g. the recently added compound_{lt,eq})
for which we generate never-to-be-invoked dummy bodies as just stubs,
to avoid determinism warnings from the compiler.
compiler/c_util.m:
Reorganize the way we categorize binary operators. Instead of a set of
predicates for recognizing each class of binops, have just one
predicate that returns the category, along with the C operator (since
the two are usually needed together.
The reason for the change is that some binary operators, including
compound_{lt,eq} but also others, fell through the cracks. The new
predicate is a det switch on binary_op, so new operators won't be
able to fall through the cracks in future.
Unary operators were already being handled this way, though in a
simpler fashion, since they all fall into one category.
compiler/llds_out.m:
compiler/mlds_to_c.m:
compiler/mlds_to_managed.m:
Conform to the change to c_util.m. This involves switching from
if-then-else chains to switches, and deleting now unneeded predicates.
compiler/opt_debug.m:
Don't go through llds_out to get at the printable representations
of binary operators.
compiler/simplify.m:
Fix the formatting of the simplification of compound_{lt,eq}.
compiler/options.m:
Add a name for the compiler_sufficiently_recent option that will allow
configure scripts to test for the fix.
|
||
|
|
d4818a3ca4 |
Modify the code generator so that it recognizes construct_in_region and
Estimated hours taken: 35.
Branch: main.
Modify the code generator so that it recognizes construct_in_region and
generates suitable code when RBMM is used. The main
changes are in unify_gen.m. incr_hp is also changed to receive one more
(maybe) argument for region.
compiler/unify_gen.m:
Make it aware of HowToConstruct. This is the starting point of the
changes in the code generator so that it can generate code which
constructs terms in regions.
compiler/code_info.m:
compiler/var_locn.m:
Change in accordance with the introduction of how_to_construct in
unify_gen.m.
compiler/llds.m:
Add one extra argument to incr_hp for the region to construct terms
in.
compiler/dupelim.m:
compiler/dupproc.m:
compiler/exprn_aux.m:
compiler/global_data.m:
compiler/jumpopt.m:
compiler/livemap.m:
compiler/llds_to_x86_64.m:
compiler/middle_rec.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/par_conj_gen.m:
compiler/reassign.m:
compiler/use_local_vars.m:
Change to deal with the extra maybe region argument in incr_hp.
compiler/llds_out.m:
Modify so that when RBMM is used it generates suitable call to
the region runtime for allocating terms in regions. The region
runtime (in C code) will be posted in anothe email.
compiler/hlds_data.m:
Fix a typo.
compiler/rbmm.interproc_region_lifetime.m:
Change to comply with coding standard.
|
||
|
|
0d6c3c9fdf |
Satisfy a request by Peter Ross: give mdb users the ability to specify
Estimated hours taken: 8
Branches: main
Satisfy a request by Peter Ross: give mdb users the ability to specify
exactly which events inside a procedure they want to put breakpoints on.
Amongst other things, this can make checking postconditions easier:
you can now put a conditional breakpoint on only the exit event.
runtime/mercury_stack_layout.h:
Add to the exec trace part of proc layouts fields describing an array
of the label layouts describing the events of the procedure. This is
the most direct way to implement the new functionality. (In theory,
we could search the data structures that map file names and line
numbers to label layouts, but that would be complex, inefficient,
and *error prone*.)
To make room for the new fields, and to prepare for making the
procedure body available to the deep profiler as well (which could
use it to map execution frequencies of calls back to their location
in the procedure body, and thus to things like the frequencies with
which various switch arms are selected), move the procedure body
of the execution-trace-specific part of the proc layout, and to
where the deep profiler could in the future also get at it.
runtime/mercury_goto.h:
Add macros for declaring more than one label layout structure at a
time.
compiler/layout.m:
Implement the changes in mercury_stack_layout.h in the compiler's
data structures as well.
compiler/stack_layout.m:
Conform to the changes in mercury_stack_layout.h.
Turn some tuples into named types.
compiler/layout_out.m:
Conform to the changes in mercury_stack_layout.h.
Add a mechanism to declare the label layouts in the new array before
referring to them, by generalizing some existing code. Make this code
require that the label layouts we refer to via the macros in
mercury_goto.h match the declarations generated by those macros,
i.e. that they have information about variables (if debugging is
enabled, they will).
compiler/opt_debug.m:
Conform to the change to layout.m.
compiler/prog_rep.m:
Make a predicate name more expressive.
trace/mercury_trace_cmd_breakpoint.c:
Implement the new way of using the "break" command, which is
to add a port name after the procedure specification.
Register procedures at the start of the function implementing
the "break" command, instead of "on demand", since all alternatives
eventually do demand it.
Write ambiguity reports wholly to mdb's stdout, instead of partially to
stderr and partially to stdout.
trace/mercury_trace_spy.c:
Print the port and the goal path for breakpoints on specific events.
Make the invocation conditions left-aligned, not right-aligned.
doc/user_guide.texi:
Document the new way of using the "break" command.
NEWS:
Announce the new capability.
tests/queens.{inp,exp,exp2}:
Add tests of the new capability.
tests/breakpoints.{exp,exp}:
tests/lval_desc_array.{exp,exp2}:
Expect the new alignment of invocation conditions.
|
||
|
|
5647714667 |
Make all functions which create strings from characters throw an exception
Estimated hours taken: 15 Branches: main Make all functions which create strings from characters throw an exception or fail if the list of characters contains a null character. This removes a potential source of security vulnerabilities where one part of the program performs checks against the whole of a string passed in by an attacker (processing the string as a list of characters or using `unsafe_index' to look past the null character), but then passes the string to another part of the program or an operating system call that only sees up to the first null character. Even if Mercury stored the length with the string, allowing the creation of strings containing nulls would be a bad idea because it would be too easy to pass a string to foreign code without checking. For examples see: <http://insecure.org/news/P55-07.txt> <http://www.securiteam.com/securitynews/5WP0B1FKKQ.html> <http://www.securityfocus.com/archive/1/445788> <http://www.securityfocus.com/archive/82/368750> <http://secunia.com/advisories/16420/> NEWS: Document the change. library/string.m: Throw an exception if null characters are found in string.from_char_list and string.from_rev_char_list. Add string.from_char_list_semidet and string.from_rev_char_list_semidet which fail rather throwing an exception. This doesn't match the normal naming convention, but string.from_{,rev_}char_list are widely used, so changing their determinism would be a bit too disruptive. Don't allocate an unnecessary extra word for each string created by from_char_list and from_rev_char_list. Explain that to_upper and to_lower only work on un-accented Latin letters. library/lexer.m: Check for invalid characters when reading Mercury strings and quoted names. Improve error messages by skipping to the end of any string or quoted name containing an error. Previously we just stopped processing at the error leaving an unmatched quote. library/io.m: Make io.read_line_as_string and io.read_file_as_string return an error code if the input file contains a null character. Fix an XXX: '\0\' is not recognised as a character constant, but char.det_from_int can be used to make a null character. library/char.m: Explain the workaround for '\0\' not being accepted as a char constant. Explain that to_upper and to_lower only work on un-accented Latin letters. compiler/layout.m: compiler/layout_out.m: compiler/c_util.m: compiler/stack_layout.m: compiler/llds.m: compiler/mlds.m: compiler/ll_backend.*.m: compiler/ml_backend.*.m: Don't pass around strings containing null characters (the string tables for the debugger). This doesn't cause any problems now, but won't work with the accurate garbage collector. Use lists of strings instead, and add the null characters when writing the strings out. tests/hard_coded/null_char.{m,exp}: Change an existing test case to test that creation of a string containing a null throws an exception. tests/hard_coded/null_char.exp2: Deleted because alternative output is no longer needed. tests/invalid/Mmakefile: tests/invalid/null_char.m: tests/invalid/null_char.err_exp: Test error messages for construction of strings containing null characters by the lexer. tests/invalid/unicode{1,2}.err_exp: Update the expected output after the change to the handling of invalid quoted names and strings. |
||
|
|
7b7dabb89a |
Extend this optimization to handle temporaries being both defined in
Estimated hours taken: 12
Branches: main
compiler/use_local_vars.m:
Extend this optimization to handle temporaries being both defined in
and used by foreign_proc_code instructions. This should eliminate
unnecessary accesses to the MR_fake_reg array, and thus speed up
programs that use foreign code a lot, including typeclass- and
tabling-intensive programs, since those features are implemented using
inline foreign code. I/O intensive should also benefit, but not much,
since the cost of the I/O itself overwhelms the cost of the
MR_fake_reg accesses.
Group together the LLDS instructions that are handled similarly.
Factor out some common code.
compiler/opt_util.m:
Allow for the fact that foreign_proc_codes can now refer to
temporaries.
compiler/opt_debug.m:
Print more useful information about foreign_proc_code components.
compiler/prog_data.m:
Rename the types and function symbols of the recently added
foreign_proc attributes to avoid clashing with the keywords
representing them in source code.
Add a new foreign_proc attribute, proc_may_duplicate that governs
whether the body of foreign code is allowed to be duplicated.
compiler/table_gen.m:
Include does_not_affect_liveness among the annotations for the
foreign_proc calls generated by this module. Some of these procedures
affect memory beyond their arguments, but that memory is in tables,
not in unlisted registers.
Allow some of the smaller code fragments generated by this module
to be duplicated.
compiler/inlining.m:
Respect the may_not_duplicate foreign_proc attribute.
compiler/pragma_c_gen.m:
Transmit any annotations about liveness from the HLDS to the LLDS,
since without does_not_affect_liveness annotations use_local_vars.m
cannot optimize foreign_proc_codes.
Transmit any annotations about may_duplicate from the HLDS to the LLDS,
since with them jumpopt can do a better job.
compiler/llds.m:
Use the new foreign_proc attribute instead of a boolean to represent
whether a foreign code fragment may be duplicated.
compiler/simplify.m:
Generate an error message if a may_duplicate or may_not_duplicate
attribute on a foreign_proc conflicts with a no_inline or inline pragma
(respectively) on the predicate it belongs to.
compiler/hlds_pred.m:
Fix some comment rot.
compiler/jumpopt.m:
compiler/livemap.m:
compiler/proc_gen.m:
compiler/trace_gen.m:
Conform to the changes above.
doc/reference_manual.texi:
Document the new foreign_proc attribute.
library/array.m:
library/builtin.m:
library/char.m:
library/dir.m:
library/float.m:
library/int.m:
library/io.m:
library/lexer.m:
library/math.m:
library/private_builtin.m:
library/string.m:
library/version_array.m:
Add does_not_affect_liveness annotations to the C foreign_procs that
deserve them.
configure.in:
Require the installed compiler to support does_not_affect_liveness.
tests/invalid/test_may_duplicate.{m,err_exp}:
Add a new test case to test the error checking code in simplify.m.
tests/invalid/Mmakefile:
Enable the new test case.
|
||
|
|
e53d899ba0 |
Add some foreign_proc attributes that Ben needs for his work on the native
Estimated hours taken: 1 Branches: main Add some foreign_proc attributes that Ben needs for his work on the native garbage collector. One group of attributes (allocates_memory) is intended for optimization: they let the compiler figure out whether it needs to emit code to check the amount of available heap space before the foreign_proc. The second group (registers_roots) is intended for catching errors: foreign procs that do not register (or at least do not *assert* that they register) the roots they may hold. compiler/prog_data.m: Define the attributes. Standardize on spelling "does_not" instead of "doesnt". compiler/prog_io_pragma.m: Parse the attributes. compiler/*.m: Conform to the new spelling of attribute names. doc/reference_manual.texi: Add documentation for the attributes. Since they are for developers only at the moment, leave the documentation commented out. |
||
|
|
ba93a52fe7 |
This diff changes a few types from being defined as equivalent to a pair
Estimated hours taken: 10 Branches: main This diff changes a few types from being defined as equivalent to a pair to being discriminated union types with their own function symbol. This was motivated by an error message (one of many, but the one that broke the camel's back) about "-" being used in an ambiguous manner. It will reduce the number of such messages in the future, and will make compiler data structures easier to inspect in the debugger. The most important type changed by far is hlds_goal, whose function symbol is now "hlds_goal". Second and third in importance are llds.instruction (function symbol "llds_instr") and prog_item.m's item_and_context (function symbol "item_and_context"). There are some others as well. In several places, I rearranged predicates to factor the deconstruction of goals into hlds_goal_expr and hlds_goal_into out of each clause into a single point. In many places, I changed variable names that used "Goal" to refer to just hlds_goal_exprs to use "GoalExpr" instead. I also changed variable names that used "Item" to refer to item_and_contexts to use "ItemAndContext" instead. This should make reading such code less confusing. I renamed some function symbols and predicates to avoid ambiguities. I only made one algorithmic change (at least intentionally). In assertion.m, comparing two goals for equality now ignores goal_infos for all kinds of goals, whereas previously it ignored them for most kinds of goals, but for shorthand goals it was insisting on them being equal. This seemed to me to be a bug. Pete, can you confirm this? |
||
|
|
6de3b102ba |
Add support for deconstructing by functor number rather than name,
Estimated hours taken: 20 Branches: main Add support for deconstructing by functor number rather than name, for use by write_binary. library/deconstruct.m: runtime/mercury_deconstruct.h: runtime/mercury_deconstruct.c: runtime/mercury_ml_expand_body.h: runtime/mercury_ml_deconstruct_body.h: Add predicates deconstruct.functor_number and deconstruct.deconstruct.du, which returns a functor number suitable for use by construct.construct rather than a functor name. library/construct.m: library/term.m: browser/term_rep.m: extras/quickcheck/qcheck.m: tests/valid/agc_unbound_typevars.m: tests/valid/agc_unbound_typevars2.m: Add a function get_functor_lex, which returns the lexicographic functor number given an ordinal functor number. Add equivalence types to make it clearer which ordering is being used by which functor numbers. Remove a C-ism: num_functors now fails rather than returning -1 for types without functors. NEWS: Document the new predicates and functions. runtime/mercury_type_info.h: runtime/mercury_builtin_types.c: runtime/mercury_mcpp.h: compiler/rtti.m: compiler/rtti_out.m: compiler/type_ctor_info.m: compiler/rtti_to_mlds.m: compiler/opt_debug.m: Add a field to MR_TypeCtorInfo which contains a mapping from an ordinal functor number to a lexicographic functor number which can be passed to construct.construct. Bump MR_RTTI_VERSION. tests/hard_coded/expand.m: tests/hard_coded/expand.exp: tests/hard_coded/expand.exp2: tests/hard_coded/construct_test.m: tests/hard_coded/construct_test.exp: tests/hard_coded/construct_test_exist.m: tests/hard_coded/construct_test_exist.exp: Test cases. |
||
|
|
d66ed699a1 |
Add fields to structures representing the C code itself that says whether
Estimated hours taken: 4 Branches: main Add fields to structures representing the C code itself that says whether or not the C code affects the liveness of lvals. This is intended as the basis for future improvements in the optimization of such code. Implement a new foreign_proc attribute that allows programmers to set the value of this field. Eliminate names referring to `pragma c_code' in the LLDS backend in favor of names referring to foreign_procs. compiler/llds.m: Make the changes described above. Consistently put the field containing C code last in the function symbols that contain them. compiler/prog_data.m: Make the changes described above. Rename some other function symbols to avoid ambiguity. compiler/prog_io_pragma.m: Parse the new foreign_proc attribute. doc/reference_manual.texi: Document the new attribute. compiler/pragma_c_gen.m: Rename the main predicates. compiler/opt_util.m: Change some predicates into functions, for more convenient invocation. compiler/livemap.m: Rename the predicates in this module to avoid ambiguity and the need for module qualification. compiler/*.m: Conform to the changes above. |
||
|
|
aa25c53cf6 |
For each attribute of each user event, record the order in which the attribute
Estimated hours taken: 6 Branches: main For each attribute of each user event, record the order in which the attribute it depends on, indirectly as well as directly, should be synthesized. This information allows each attribute to materialized independently of other attributes, a capability we don't yet need, but may need someday. compiler/prog_event.m: Generate this information. compiler/prog_data.m: Make room for the information generated by prog_event. compiler/layout.m: compiler/layout_out.m: Record this information in the layout structures we generate. compiler/opt_debug.m: Conform to the change to layout.m. runtime/mercury_stack_layout.h: Allow the -1 sentinel at the end of the attribute number vector. trace/mercury_trace_vars.c: Make the var_details mdb command (which is for developers only) to print the newly collected information. |
||
|
|
544aae2355 |
Implement synthesized attributes at user events.
Estimated hours taken: 32
Branches: main
Implement synthesized attributes at user events.
doc/user_guide.texi:
Document synthesized attributes.
runtime/mercury_stack_layout.h:
Include information about synthesized attributes in layout structures.
Move the the information about user events that is not specific to a
given event occurrence to a central table of user event specifications
in the module layout structure, to reduce the space overhead, and allow
a future Ducasse style execution monitor to look up information about
each event without having to search all label layout structures (e.g.
when compiling a set of user commands about what to do at each event
into a state machine).
Update the current layout structure version number.
Introduce a named type to represent HLDS variable numbers in layout
structures.
runtime/mercury_types.h:
Add typedefs for the new types in mercury_stack_layout.h.
compiler/prog_data.m:
compiler/prog_event.m:
Record the call that synthesizes synthesized attributes in terms of
attribute numbers (needed by the layout structures) as well as in terms
of attribute names (the user-friendly notation). Record some other
information now needed by the layout structures, e.g. the evaluation
order of synthesized attributes. (Previously, we computed this order
-in reverse- but then threw it away.)
Do not separate out non-synthesized attributes, now that we can handle
even synthesized ones. Return *all* attributes to call_gen.m.
Pass the filename of the event set specification file to the event
spec parser, for use in error messages.
Generate better error messages for semantic errors in event set
specifications.
compiler/continuation_info.m:
compiler/layout.m:
compiler/layout_out.m:
compiler/stack_layout.m:
Generate the new layout structures, the main changes being the
inclusion of synthesized attributes, and generating information about
event attribute names, types and maybe synthesis calls for all possible
events at once (for the module layout) instead of separately at each
user event occurrence.
In stack_layout.m, rename a couple of predicates to avoid ambiguities.
In layout_out.m, eliminate some repetitions of terms.
compiler/call_gen.m:
When processing event calls, match the supplied attributes against the
list of all attributes, and include information about the synthesized
attributes in the generated layout structures.
compiler/equiv_type.m:
compiler/module_qual.m:
Conform to the change in prog_data.m.
compiler/opt_debug.m:
Conform to the change in layout.m.
compiler/hlds_out.m:
Generate valid Mercury for event calls.
trace/mercury_event_spec.[ch]:
Record the name of the file being parsed, for use in syntax error
messages when the parser is invoked by the compiler.
Add a field for recording the line numbers of attributes, for use in
better error messages.
trace/mercury_event_parser.y:
trace/mercury_event_scanner.l:
Record the line numbers of attributes.
Modify the grammar for event set specifications to allow C-style
comments.
trace/mercury_trace_internal.c
Pass a dummy filename to mercury_event_spec. (The event set
specifications recorded in layout structures should be free of errors,
since the compiler generates them and it checked the original version
for errors.)
trace/mercury_trace_tables.[ch]:
Conform to the new design of layout structures, and record the extra
information now available.
trace/mercury_trace_vars.[ch]:
Record the values of attributes in two passes: first the
non-synthesized attributes, then the synthesized attributes
in the evaluation order recorded in the user event specification.
tests/debugger/synth_attr.{m,inp,exp}:
tests/debugger/synth_attr_spec:
New test case to test the debugger's handling of synthesized
attributes.
tests/debugger/Mmakefile:
tests/debugger/Mercury.options:
Enable the new test case.
tests/invalid/syntax_error_event.{m,err_exp}:
tests/invalid/syntax_error_event_spec:
New test case to test the handling of syntax errors in event set
specifications.
tests/invalid/synth_attr_error.{m,err_exp}:
tests/invalid/synth_attr_error_spec:
New test case to test the handling of semantic errors in event set
specifications.
tests/invalid/Mmakefile:
Enable the new test cases.
|
||
|
|
b3d4ab1760 |
For disjunctions in which all disjuncts are just facts, generate code
Estimated hours taken: 12
Branches: main
For disjunctions in which all disjuncts are just facts, generate code
that does table lookup.
compiler/disj_gen.m:
Implement lookup disjunctions. The new code based on the code in
lookup_switch.m for generating switches in which some some switch arms
are disjunctions. The code we generate is a specialized version
we would generate for such switch arms.
compiler/lookup_switch.m:
compiler/lookup_util.m:
Move some predicates from lookup_switch.m to lookup_util.m, to allow
disj_gen.m to call them also.
Reorder some argument lists to group related arguments together.
Give some predicates more descriptive names.
In lookup_switch.m, improve some comments. In lookup_util.m, avoid
computing ExprnOpts many times.
compiler/switch_gen.m:
Fix some obsolete some comments.
compiler/continuation_info.m:
Provide for a description of the stack slot used by the new code in
disj_gen.m.
compiler/opt_debug.m:
Pass a maybe(proc_label) argument to the predicates that dump rvals
and things that may contain rvals, so that rvals that are code
constants may be printed in the short, user-friendly format
(e.g. local_1, instead of the full, usually very long label name).
This was helpful in debugging the change to disj_gen.m.
Remove the old predicates whose sole purpose was to supply a
proc_label for this purpose (e.g. dump_label_for_proc), since now
even the predicates they are based on (e.g. dump_label) can now
be given a proc_label.
compiler/frameopt.m:
compiler/use_local_vars.m:
compiler/switch_gen.m:
Conform to the change to opt_debug.m.
tests/hard_coded/lookup_disj.{m,exp}:
New test case to test the handling of the lookup tables.
Existing test cases, specifically do_while, nasty_nondet, and
nondet_ite* in general, and pretty_printing in hard_coded,
are already tough tests of the handling of the code generator's
internal data structures (such as the failure stack and the set of
reserved stack slots).
tests/hard_coded/Mmakefile:
Enable the new test case.
|
||
|
|
45fd0daf5a |
Implement some of Mark's wish list for making user events more useful.
Estimated hours taken: 20
Branches: main
Implement some of Mark's wish list for making user events more useful.
1. When executing "print *" in mdb, we used to print both the values of all
attributes and the values of all live variables. Since some attributes'
values were given directly by live variables, this lead to some things being
printed twice. This diff eliminates this duplication.
2. At user events, we now print the name of the event. Whether we print the
other stuff we also print at events (the predicate containing the event,
and its source location) is now controlled by a new mdb command,
"user_event_context".
3. We would like different solvers to be compilable independently of one
another. This means that neither solver's event set should depend on the
existence of the events needed by the other solvers. This diff therefore
eliminates the requirement that all modules of the program be compiled with
the same event set specification. Instead, a program may contain modules
that were compiled with different event sets. Each event set is named;
the new requirement is that different named event sets may coexist in the
program (each being used to compile some modules), but two event sets with
the same name must be identical in all other respects as well (we need this
requirement to prevent inconsistencies arising between different versions of
the same event set).
4. We now generate user events even from modules compiled with --trace shallow.
The problem here is that user events can occur in procedures that do not
get caller events and whose ancestors may not get caller events either.
Yet these procedures must still pass on debugger information such as call
sequence numbers and the call depth to the predicate with the user event.
This diff therefore decouples the generation of code for this basic debugger
infrastructure information from the generation of call events by inventing
two new trace levels, settable by the compiler only (i.e. not from the
command line). The trace level "basic_user" is for procedures containing a
user event whose trace level (in a shallow traced module) would otherwise be
"none". The trace level "basic" is for procedures not containing a user
event but which nevertheless may need to transmit information (e.g. depth)
to a user event. For the foreseeable future, this means that shallow traced
modules containing user events will have some debugging overhead compiled
into *all* their procedures.
runtime/mercury_stack_layout.h:
Add a new field to MR_UserEvent structures, giving the HLDS number of
the variable representing each attribute.
Add a new field to module layout structures, giving the name of the
event set (if any) the module was compiled with.
Add the new trace levels to the MR_TraceLevel type.
Update the current layout structure version number.
runtime/mercury_stack_trace.[ch]:
Allow the printing of the containing predicate's name and/or the
filename/linenumber context to be turned off when printing contexts.
Factor out some of the code involved in this printing.
Give a bunch of variables better names.
Rename a type to get rid of unnecessary underscores.
compiler/prog_data.m:
compiler/prog_event.m:
Include the event set name in the information we have about the event
set.
compiler/simplify.m:
Mark each procedure and each module that contains user events
as containing user events.
Use the same technique to mark each procedure that contains parallel
conjunctions as containing parallel conjunctions, instead of marking
the predicate containing the procedure. (Switch detection may eliminate
arbitrary goals, including parallel conjunctions, from switch arms
that are unreachable due to initial insts, and in any case we want to
handle the procedures of a predicate independently from each other
after mode analysis.)
Also, change the code handling generic calls to switch on the generic
call kind, and factor out some common code.
compiler/hlds_module.m:
compiler/hlds_pred.m:
Provide slots in the proc_info and the module_info for the information
gathered by simplify.
compiler/trace_params.m:
Implement the new trace levels described above. This required changing
the signature of some of the predicates of this module.
compiler/code_info.m:
Record whether the compiler generated any trace events. We need to know
this, because if it did, then we must generate a proc layout structure
for it.
compiler/proc_gen.m:
Act on the information recorded by code_info.m.
Factor out the code for generating the call event and its layout
structure, since the conditions for generating this event have changed.
compiler/continuation_info.m:
compiler/call_gen.m:
For each user event, record the id of the variables corresponding to
each argument of a user event.
compiler/layout.m:
compiler/layout_out.m:
compiler/stack_layout.m:
Generate the new field (giving the HLDS variable number of each
attribute) in user event structures, and the new field (event set name)
in module layout structures.
Allow the call event's layout structure to be missing. This is needed
for user events in shallow traced modules.
compiler/options.m:
compiler/handle_options.m:
compiler/mercury_compiler.m:
Rename the option for specifying event sets from --event-spec-file-name
to --event-set-file-name, since it specifies only one event set, not
all events.
compiler/jumpopt.m:
Give some predicates better names.
compiler/dep_par_conj.m:
compiler/deforest.m:
compiler/granularity.m:
compiler/hlds_out.m:
compiler/inlining.m:
compiler/intermod.m:
compiler/lambda.m:
compiler/liveness.m:
compiler/modes.m:
compiler/opt_debug.m:
compiler/optimize.m:
compiler/size_proc.m:
compiler/stack_alloc.m:
compiler/store_alloc.m:
compiler/table_gen.m:
compiler/trace_gen.m:
compiler/typecheck.m:
Conform to the changes above.
doc/mdb_categories:
Mention the new mdb command.
doc/user_guide.texi:
Update the documentation of user events to account for the changes
above.
trace/mercury_event_parser.y:
trace/mercury_event_scanner.l:
Modify the grammar for event set specifications to a name for the
event set.
trace/mercury_event_spec.[ch]:
Instead of recording information about event sets internally
in this module, return a representation of each event set read in
to the callers, for them to do with as they please.
Include the event set name when we print the Mercury term for
compiler/prog_event.m.
trace/mercury_trace.c:
Do not assume that every procedure that contains an event contains a
call event (and hence a call event layout structure), since that
is not true anymore.
trace/mercury_trace_cmd_parameter.[ch]:
Implement the new mdb command "user_event_context".
trace/mercury_trace_cmd_internal.[ch]:
Include "user_event_context" in the list of mdb commands.
Print the user event name at user events. Let the current setting
of the user_event_context mdb command determine what else to print
at such events.
Instead of reading in one event set on initialization, read in
all event sets that occur in the program.
trace/mercury_trace_tables.[ch]:
Allow the gathering of information for more than one event set
from the modules of the program.
trace/mercury_trace_vars.[ch]:
For each attribute value of a user event, record what the HLDS variable
number of the attribute is. When printing all variables at an event,
mark the variable numbers of printed attributes as being printed
already, causing the variable with the same number not to be printed.
Include the name of the variable (if it has one) in the description
of an attribute. Without this, users may wonder why the value of the
variable wasn't printed.
trace/mercury_trace_cmd_browsing.[ch]:
Pass the current setting of the user_event_context mdb command to
runtime/mercury_stack_trace.c when printing the context of an event.
tests/debugger/user_event_shallow.{m,inp,exp}:
New test case to test the new functionality. This test case is the same
as the user_event test case, but it is compiled with shallow tracing,
and its mdb input exercises the user_event_context mdb command.
tests/debugger/user_event_spec:
tests/invalid/invalid_event_spec:
Update these event set spec files by adding an event set name.
tests/debugger/Mmakefile:
tests/debugger/Mercury.options:
Enable the new test case.
tests/debugger/user_event.exp:
Update the expected output of the old user event test case, which now
prints event names, but doesn't print attribute values twice.
tests/debugger/completion.exp:
Expect the new "user_event_context" mdb command in the command list.
tests/debugger/mdb_command_test.inp:
Test the existence of the documentation for the new mdb command.
tests/invalid/Mercury.options:
Conform to the name change of the --event-spec-file-name option.
|
||
|
|
b4c3bb1387 |
Clean up in unused module imports in the Mercury system detected
Estimated hours taken: 3 Branches: main Clean up in unused module imports in the Mercury system detected by --warn-unused-imports. analysis/*.m: browser/*.m: deep_profiler/*.m: compiler/*.m: library/*.m: mdbcomp/*.m: profiler/*.m: slice/*.m: Remove unused module imports. Fix some minor departures from our coding standards. analysis/Mercury.options: browser/Mercury.options: deep_profiler/Mercury.options: compiler/Mercury.options: library/Mercury.options: mdbcomp/Mercury.options: profiler/Mercury.options: slice/Mercury.options: Set --no-warn-unused-imports for those modules that are used as packages or otherwise break --warn-unused-imports, e.g. because they contain predicates with both foreign and Mercury clauses and some of the imports only depend on the latter. |
||
|
|
9ec86d6a6d |
The objective of this diff is to switch from a table of solver events built
Estimated hours taken: 32
Branches: main
The objective of this diff is to switch from a table of solver events built
into the compiler (and eventually the debugger) into a table of events
defined by a file provided by the user to the compiler, which the compiler
then records in the executable for use by the debugger.
The current design, for speed of implementation, uses temporary files parsed
by a bison-generated parser. Since the compiler needs to be able to invoke
the parser even if it is compiled in a non-debug grade, the parser is in
a new library, the eventspec library, that is always linked into the Mercury
compiler and is always linked into any Mercury program with debugging enabled
(but is of course linked only once into a Mercury compiler which has debugging
enabled).
Modify the debugger to give it the ability to print the attributes of
user-defined events (for now, only the non-synthesized attributes).
Implement a new debugger command, "user", which goes to the next user-defined
event.
configure.in:
Require flex and and bison to be available.
doc/user_guide.texi:
Document user defined events and the new debugger capabilities.
doc/mdb_categories:
Include "user" in the list of forward movement commands.
Fix some earlier omissions in that list.
runtime/mercury_stack_layout.h:
Include an event number in the user-defined event structure.
Include a string representing an event set specification in module
layout structures.
runtime/mercury_stack_layout.h:
runtime/mercury_trace_base.[ch]:
runtime/mercury_types.h
Switch from solver events to user events in names.
runtime/mercury_trace_term.[ch]:
Provide a representation of flat terms, for use in representing
the calls that generate synthesized attributes.
Ensure that exported field names have an MR_ prefix.
browser/cterm.m:
Conform to the change to runtime/mercury_trace_term.h.
scripts/c2init.in:
scripts/ml.in:
Include the eventspec library in programs compiled with debugging
enabled.
compiler/Mmakefile:
Include the eventspec library in the compiler.
compiler/options.m:
Add a new option, --event-spec-file-name, that allows the user to
specify the set of user-defined events the program may use.
compiler/handle_options.m:
Set this optimization from an environment variable (which may be
set by the mmc script) if the new option is not explicitly given.
compiler/prog_data.m:
Define the data structures for the compiler's representation of the
event set specification.
Move some definitions around to group them more logically.
compiler/hlds_module.m:
Include the event set specification as a new field in the module_info.
compiler/prog_event.m:
Add the code for invoking the parser in the eventspec library,
and for converting the simple term output by the parser to the
compiler own representation, which contains more information
(to wit, the types of the function attributes) and which has had
a whole bunch of semantic checks done on it (e.g. whether synthesized
attributes depend on themselves or on nonexistent attributes).
Provide a function to generate a canonicalized version of the event
specification file.
compiler/module_qual.m:
compiler/equiv_type.m:
Process event spec specifications as well as items, to module qualify
the names of the types of event arguments, and expanding out
equivalence types.
In equiv_type.m, rename some variables to make clear what kind of info
they represent.
compiler/mercury_compile.m:
Process the event set specification file if one has been selected:
read it in, module qualify it, expand its equivalence types, and add
to the module_info.
compiler/compile_target_code.m:
Include the event_spec library when linking debuggable executables.
compiler/call_gen.m:
compiler/continuation_info.m:
compiler/trace_gen.m:
compiler/trace_params.m:
mdbcomp/prim_data.m:
mdbcomp/trace_counts.m:
runtime/mercury_goto.h:
Generate user-defined events instead of solver events.
compiler/layout.m:
compiler/layout_out.m:
compiler/stack_layout.m:
Include a canonicalized version of the event specification file
in the module layout if the module has any user-defined events.
compiler/code_info.m:
compiler/llds_out.m:
compiler/modes.m:
compiler/modules.m:
compiler/opt_debug.m:
compiler/typecheck.m:
Conform to the changes above.
compiler/passes_aux.m:
Rename a predicate to avoid an ambiguity.
trace/Mmakefile:
Add the definition and rules required to build the eventspec library.
trace/mercury_event_scanner.l:
trace/mercury_event_parser.y:
A scanner and a parser for reading in event spec specifications.
trace/mercury_event_spec_missing.h:
Provide the declarations that should be (but aren't) provided by
flex and bison.
trace/mercury_event_spec.[ch]:
The main module of the eventspec library. Provides functions to read
in event set specifications from a file, and to write them out as a
Mercury term in the form needed by the compiler.
trace/mercury_trace_tables.c:
If the module layouts being registered include event set
specifications, then check their consistency. Make the specification
and the consistency indication available to other modules.
trace/mercury_trace_internal.c:
During initialization, if the modules contain a consistent set of event
set specifications, then read that specification into the debugger.
(We don't yet make use of this information.)
Add an extra mdb command, "user", which goes forward to the next
user-defined event.
trace/mercury_trace.[ch]:
trace/mercury_trace_cmd_forward.[ch]:
Implement the new mdb command.
trace/mercury_trace_vars.[ch]:
For user-defined events, include the attributes' values among the
values that can be printed or browsed.
trace/mercury_trace_cmd_browsing.c:
trace/mercury_trace_declarative.c:
Minor changes.
scripts/scripts/prepare_tmp_dir_grade_part:
Copy the .y and .l files to the tmp dir we use for installs.
tools/bootcheck:
Copy the .y and .l files of the trace directory to stage 2.
tools/lmc.in:
Include the eventspec library when linking debuggable executables.
tests/debugger/user_event.{m,inp,exp}:
tests/debugger/user_event_spec:
New test case to test the new functionality.
tests/debugger/Mercury.options:
tests/debugger/Mmakefile:
Enable the new test case.
tests/debugger/completion.exp:
Expect the new "user" mdb command in the completion output.
|
||
|
|
ecf1ee3117 |
Add a mechanism for growing the stacks on demand by adding new segments
Estimated hours taken: 20 Branches: main Add a mechanism for growing the stacks on demand by adding new segments to them. You can ask for the new mechanism via a new grade component, stseg (short for "stack segments"). The mechanism works by adding a test to each increment of a stack pointer (sp or maxfr). If the test indicates that we are about to run out of stack, we allocate a new stack segment, allocate a placeholder frame on the new segment, and then allocate the frame we wanted in the first place on top of the placeholder. We also override succip to make it point code that will (1) release the new segment when the newly created stack frame returns, and then (2) go to the place indicated by the original, overridden succip. For leaf procedures on the det stack, we optimize away the check of the stack pointer. We can do this because we reserve some space on each stack for the use of such stack frames. My intention is that doc/user_guide.texi and NEWS will be updated once we have used the feature ourselves for a while and it seems to be stable. runtime/mercury_grade.h: Add the new grade component. runtime/mercury_conf_param.h: Document the new grade component, and the option used to debug stack segments. runtime/mercury_context.[ch]: Add new fields to contexts to hold the list of previous segments of the det and nondet stacks. runtime/mercury_memory_zones.[ch]: Include a threshold in all zones, for use in stack segments. Set it when a zone is allocated. Restore the previous #ifdef'd out function MR_unget_zone, for use when freeing stack segments execution has fallen out of. runtime/mercury_debug.[ch]: When printing the offsets of pointers into the det and nondet stacks, print the number of the segment the pointer points into (unless it is the first, in which case we suppress this in the interest of brevity and simplicity). Make all the functions in this module take a FILE * as an input argument; don't print to stdout by default. runtime/mercury_stacks.[ch]: Modify the macros that allocate stack frames to invoke the code for adding new stack segments when we are about to run out of stack. Standardize on "nondet" over "nond" as the abbreviation referring to the nondet stack. Conform to the changes in mercury_debug.c. runtime/mercury_stack_trace.c: When traversing the stack, step over the placeholder stack frames at the bottoms of stack segments. Conform to the changes in mercury_debug.c. runtime/mercury_wrapper.[ch]: Make the default stack size small in grades that support stack segments. Standardize on "nondet" over "nond" as the abbreviation referring to the nondet stack. Conform to the changes in mercury_debug.c. runtime/mercury_memory.c: Standardize on "nondet" over "nond" as the abbreviation referring to the nondet stack. runtime/mercury_engine.[ch]: runtime/mercury_overflow.h: Standardize on "nondet" over "nond" as the abbreviation referring to the nondet stack. Convert these files to four-space indentation. runtime/mercury_minimal_model.c: trace/mercury_trace.c: trace/mercury_trace_util.c: Conform to the changes in mercury_debug.c. compiler/options.m: Add the new grade option for stack segments. compiler/compile_target_code.m: compiler/handle_options.m: Add the new grade component, and handle its exclusions with other grade components and optimizations. compiler/llds.m: Extend the incr_sp instruction to record whether the stack frame is for a leaf procedure. compiler/llds_out.m: Output the extended incr_sp instruction. compiler/proc_gen.m: Fill in the extra slot in incr_sp instructions. compiler/goal_util.m: Provide a predicate for testing whether a procedure body is a leaf. compiler/delay_slot.m: compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/frameopt.m: compiler/global_data.m: compiler/jumpopt.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/peephole.m: compiler/reassign.m: compiler/use_local_vars.m: Conform to the change in llds.m. scripts/canonicate_grade.sh-subr: scripts/init_grade_options.sh-subr: scripts/parse_grade_options.sh-subr: scripts/final_grade_options.sh-subr: scripts/mgnuc.in: Handle the new grade component. Convert parse_grade_options.sh-subr to four-space indentation. Mmake.workspace: Fix an old bug that prevented bootcheck from working in the new grade: when computing the gc grade, use the workspace's version of ml (which in this case understands the new grade components), rather than the installed ml (which does not). (This was a devil to track down, because neither make --debug nor strace on make revealed how the installed ml was being invoked, and there was no explicit invocation in the Makefile either; the error message appeared to come out of thin air just before the completion of the stage 2 library. It turned out the invocation happened implicitly, as a result of expanding a make variable.) |
||
|
|
e21193c283 |
Rename a bunch of predicates and function symbols to eliminate
Estimated hours taken: 6 Branches: main browser/*.m: compiler/*.m: Rename a bunch of predicates and function symbols to eliminate ambiguities. The only real change is factoring out some common code in the mlds and llds code generators, replacing them with single definitions in switch_util.m. |
||
|
|
9eed887d33 |
This diff is the second step in implementing trace events.
Estimated hours taken: 24 Branches: main This diff is the second step in implementing trace events. It modifies label layouts to include room for solver-event-specific information, and modifies the compiler to generate this information. Modifications to the debugger to use this information, user-level documentation and test cases will come later. runtime/mercury_stack_layout.h: Modify label layouts to allow them to hold information about solver events. Modify the macros creating label layouts to include a null pointer as the value of the label layout's solver event field. For each such macro, add a version that puts a pointer to the label's solver event structure in that field. Modify the definition of the MR_Long_Lval type to allow the representation of constants, since without this capability solver events would often need to be preceded by code to put constant values (e.g. solver ids) into lvals. To make it easier to locate all the places where MR_Long_Lvals are used (which need to be updated), change the type MR_Long_Lval from a synonym for an int to a structure containing an int. runtime/mercury_types.h: Add the typedefs now required for mercury_stack_layout.h. runtime/mercury_goto.h: Add a macro needed by mercury_stack_layout.h. runtime/mercury_grade.h: Change the binary compatibility version number for debug grades, due to the change to label layouts. runtime/mercury_layout_util.c: Update the functions interpreting MR_Long_Lvals to conform to the change in mercury_stack_layout.h. runtime/mercury_deep_profiling_hand.h: Fix a bug, possibly the bug preventing us from bootchecking in deep profiling grades: stack slot numbers of ProcStatic structures are supposed to be plain integers, not MR_Long_Lvals. runtime/mercury_stack_trace.c: library/exception.m: trace/mercury_trace.c: Conform the change in MR_Long_Lval. runtime/mercury_trace_base.[ch]: Add a new port: the solver event port. Add a macro and a function for solver events: MR_SOLVER_EVENT and MR_solver_trace. For now, they do the same thing as MR_EVENT and MR_trace, but in future, they can do something else (e.g. generate information for a visualization tool). mdbcomp/prim_data.m: Add a new port: the solver event port. Rename all ports to eliminate clashes with language keywords such as "call". mdbcmp/trace_counts.m: browser/declarative_execution.m: slice/mcov.m: compiler/tupling.m: Conform to the change in port names, and to the addition of the new port. compiler/trace_params.m: Conform to the change in port names, and to the addition of the new port. Rename some function symbols to avoid some ambiguities. trace/mercury_trace_declarative.c: Ignore the solver port when building the annotated trace, since it doesn't fit into it. compiler/prog_event.m: Extend the representation of events to include names for the event attributes. compiler/call_gen.m: Implement event goals. The implementation consists of a call to MR_SOLVER_EVENT with a layout structure whose solver event field points to a solver event structure giving the event goal's arguments. Rename some function symbols to avoid some ambiguities. compiler/trace_gen.m: Add a predicate for generating solver events. Conform to the change in port names. Rename some function symbols to avoid some ambiguities. compiler/code_info.m: When recording trace layout information for a label, take an extra argument describing the label layout's associated solver event, if any. compiler/continuation_info.m: Extend the first representation of label layouts to include room for solver events. compiler/stack_layout.m: Convert the representation of solver events in continuation_info.m's data structure to the data structure required by layout_out.m. Conform to the changes in MR_Long_Lvals. compiler/layout.m: Extend the compiler's internal representation of the contents of label layout structures to accommodate the optional solver event field. compiler/layout_out.m: Generate the extended label layout structures, using the new macros in mercury_stack_layout.h if necessary. Conform to the change in the MR_Long_Lval type. Conform to the change in port names. Rename some function symbols to avoid some ambiguities. compiler/global_data.m: Modify rval_type_as_arg to require only the value of the relevant option, not a package of such options. This is for the new code in stack_layout.m. compiler/var_locn.m: Conform to the change in global_data.m. compiler/llds_out.m: Conform to the change in continuation_info.m. Delete this module's unused definition of rval_type_as_arg. compiler/opt_debug.m: Conform to the change in continuation_info.m. |
||
|
|
712027f307 |
This patch changes the parallel execution mechanism in the low level backend.
Estimated hours taken: 100 Branches: main This patch changes the parallel execution mechanism in the low level backend. The main idea is that, even in programs with only moderate parallelism, we won't have enough processors to exploit it all. We should try to reduce the cost in the common case, i.e. when a parallel conjunction gets executed sequentially. This patch does two things along those lines: (1) Instead of unconditionally executing all parallel conjuncts (but the last) in separate Mercury contexts, we allow a context to continue execution of the next conjunct of a parallel conjunction if it has just finished executing the previous conjunct. This saves on allocating unnecessary contexts, which can be a big reduction in memory usage. We also try to execute conjuncts left-to-right so as to minimise the need to suspend contexts when there are dependencies between conjuncts. (2) Conjuncts that *are* executed in parallel still need separate contexts. We used to pass variable bindings to those conjuncts by flushing input variable values to stack slots and copying the procedure's stack frame to the new context. When the conjunct finished, we would copy new variable bindings back to stack slots in the original context. What happens now is that we don't do any copying back and forth. We introduce a new abstract machine register `parent_sp' which points to the location of the stack pointer at the time that a parallel conjunction began. In parallel conjuncts we refer to all stack slots via the `parent_sp' pointer, since we could be running on a different context altogether and `sp' would be pointing into a new detstack. Since parallel conjuncts now share the procedure's stack frame, we have to allocate stack slots such that all parallel conjuncts in a procedure that could be executing simultaneously have distinct sets of stack slots. We currently use the simplest possible strategy, i.e. don't allow variables in parallel conjuncts to reuse stack slots. Note: in effect parent_sp is a frame pointer which is only set for and used by the code of parallel conjuncts. We don't call it a frame pointer as it can be confused with "frame variables" which have to do with the nondet stack. compiler/code_info.m: Add functionality to keep track of how deep inside of nested parallel conjunctions the code generator is. Add functionality to acquire and release "persistent" temporary stack slots. Unlike normal temporary stack slots, these don't get implicitly released when the code generator's location-dependent state is reset. Conform to additions of `parent_sp' and parent stack variables. compiler/exprn_aux.m: Generalise the `substitute_lval_in_*' predicates by `transform_lval_in_*' predicates. Instead of performing a fixed substitution, these take a higher order predicate which performs some operation on each lval. Redefine the substitution predicates in terms of the transformation predicates. Conform to changes in `fork', `join_and_terminate' and `join_and_continue' instructions. Conform to additions of `parent_sp' and parent stack variables. Remove `substitute_rval_in_args' and `substitute_rval_in_arg' which were unused. compiler/live_vars.m: Introduce a new type `parallel_stackvars' which is threaded through `build_live_sets_in_goal'. We accumulate the sets of variables which are assigned stack slots in each parallel conjunct. At the end of processing a parallel conjunction, use this information to force variables which are assigned stack slots to have distinct slots. compiler/llds.m: Change the semantics of the `fork' instruction. It now takes a single argument: the label of the next conjunct after the current one. The instruction now "sparks" the next conjunct to be run, either in a different context (possibly in parallel, on another Mercury engine) or is queued to be executed in the current context after the current conjunct is finished. Change the semantics of the `join_and_continue' instruction. This instruction now serves to end all parallel conjuncts, not just the last one in a parallel conjunction. Remove the `join_and_terminate' instruction (no longer used). Add the new abstract machine register `parent_sp'. Introduce "parent stack slots", which are similar to normal stack slots but relative to the `parent_sp' register. compiler/par_conj_gen.m: Change the code generated for parallel conjunctions. That is: - use the new `fork' instruction at the beginning of a parallel conjunct; - use the `join_and_continue' instruction at the end of all parallel conjuncts; - keep track of how deep the code generator currently is in parallel conjunctions; - set and restore the `parent_sp' register when entering a non-nested parallel conjunction; - after generating the code of a parallel conjunct, replace all references to stack slots by parent stack slots; - remove code to copy back output variables when a parallel conjunct finishes. Update some comments. runtime/mercury_context.c: runtime/mercury_context.h: Add the type `MR_Spark'. Sparks are allocated on the heap and contain enough information to begin execution of a single parallel conjunct. Add globals `MR_spark_queue_head' and `MR_spark_queue_tail'. These are pointers to the start and end of a global queue of sparks. Idle engines can pick up work from this queue in the same way that they can pick up work from the global context queue (the "run queue"). Add new fields to the MR_Context structure. `MR_ctxt_parent_sp' is a saved copy of the `parent_sp' register for when the context is suspended. `MR_ctxt_spark_stack' is a stack of sparks that we decided not to put on the global spark queue. Update `MR_load_context' and `MR_save_context' to save and restore `MR_ctxt_parent_sp'. Add the counters `MR_num_idle_engines' and `MR_num_outstanding_contexts_and_sparks'. These are used to decide, when a `fork' instruction is reached, whether a spark should be put on the global spark queue (with potential for parallelism but also more overhead) or on the calling context's spark stack (no parallelism and less overhead). Rename `MR_init_context' to `MR_init_context_maybe_generator'. When initialising contexts, don't reset redzones of already allocated stacks. It seems to be unnecessary (and the reset implementation is buggy anyway, though it's fine on Linux). Rename `MR_schedule' to `MR_schedule_context'. Add new functions `MR_schedule_spark_globally' and `MR_schedule_spark_locally'. In `MR_do_runnext', add code for idle engines to get work from the global spark queue. Resuming contexts are prioritised over sparks. Rename `MR_fork_new_context' to `MR_fork_new_child'. Change the definitions of `MR_fork_new_child' and `MR_join_and_continue' as per the new behaviour of the `fork' and `join_and_continue' instructions. Delete `MR_join_and_terminate'. Add a new field `MR_st_orig_context' to the MR_SyncTerm structure to record which context originated the parallel conjunction instance represented by a MR_SyncTerm instance, and update `MR_init_sync_term'. This is needed by the new behaviour of `MR_join_and_continue'. Update some comments. runtime/mercury_engine.h: runtime/mercury_regs.c: runtime/mercury_regs.h: runtime/mercury_stacks.h: Add the abstract machine register `parent_sp' and code to copy it to and from the fake_reg array. Add a macro `MR_parent_sv' to access stack slots via `parent_sp'. Add `MR_eng_parent_sp' to the MercuryEngine structure. runtime/mercury_wrapper.c: runtime/mercury_wrapper.h: Add Mercury runtime option `--max-contexts-per-thread' which is saved in the global variable `MR_max_contexts_per_thread'. The number `MR_max_outstanding_contexts' is derived from this. It sets a soft limit on the number of sparks we put in the global spark queue, relative to the number of threads we are running. We don't want to put too many sparks on the global queue if there are plenty of ready contexts or sparks already on the global queues, as they are likely to result in new contexts being allocated. When initially creating worker engines, wait until all the worker engines have acknowledged that they are idle before continuing. This is mainly so programs (especially benchmarks and test cases) with only a few fork instructions near the beginning of the program don't execute the forks before any worker engines are ready, resulting in no parallelism. runtime/mercury_engine.c: runtime/mercury_thread.c: Don't allocate a context at the time a Mercury engine is created. An engine only needs a new context when it is about to pick up a spark. configure.in: compiler/options.m: scripts/Mercury.config.in: Update to reflect the extra field in MR_SyncTerm. Add the option `--sync-term-size' and actually make use the result of the sync term size calculated during configuration. compiler/code_util.m: compiler/continuation_info.m: compiler/dupelim.m: compiler/dupproc.m: compiler/global_data.m: compiler/hlds_llds.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_out.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/reassign.m: compiler/stack_layout.m: compiler/use_local_vars.m: compiler/var_locn.m: Conform to changes in `fork', `join_and_terminate' and `join_and_continue' instructions. Conform to additions of `parent_sp' and parent stack variables. XXX not sure about the changes in stack_layout.m library/par_builtin.m: Conform to changes in the runtime system. |
||
|
|
f9cac21e3e |
Get rid of a bunch more ambiguities by renaming predicates, mostly
Estimated hours taken: 8
Branches: main
Get rid of a bunch more ambiguities by renaming predicates, mostly
in polymorphism.m, {abstract,build,ordering}_mode_constraints.m, prog_type.m,
and opt_debug.m in the compiler directory and term_io.m, term.m, parser.m,
and string.m in the library.
In some cases, when the library and the compiler defined the same predicate
with the same code, delete the compiler's copy and give it access to the
library's definition by exporting the relevant predicate (in the undocumented
part of the library module's interface).
NEWS:
Mention that the names of some library functions have changed.
library/*.m:
compiler/*.m:
mdbcomp/*.m:
browser/*.m:
Make the changes mentioned above, and conform to them.
test/general/string_test.m:
test/hard_coded/string_strip.m:
test/hard_coded/string_strip.exp:
Conform to the above changes.
|