Commit Graph

2464 Commits

Author SHA1 Message Date
Paul Bone
a3f37dba22 glibc provides a type called cpu_set_t, see CPU_SET(3). These types are
bitfields with one bit per cpu to represent a set of cpus.  These types
have an arbitrary width.  However, the man page is misleading about how to
specify the size of a cpuset (bits or bytes) to macros and functions that
manipulate it.  This patch corrects this problem.

runtime/mercury_context.c:
    Fix how the size of a CPU_SET is specified.

    Fix MR_pin_thread_no_locking() so that it returns a valid result if
    the thread could not be pinned or the loop (in this function) was never
    executed.

    Users can never set MR_num_processors, so remove the code that presumes
    they can.
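
As an illustration of the bits-versus-bytes distinction described above, here
is a minimal sketch using the glibc dynamic CPU set macros from CPU_SET(3).
It is not the code from mercury_context.c; the function name mark_cpus is
hypothetical.  CPU_ALLOC() takes a number of CPUs (bits), while the *_S
macros take the size of the set in bytes, as returned by CPU_ALLOC_SIZE().

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper: allocate a CPU set able to hold max_cpus CPUs,
 * mark the n CPUs listed in cpus[], and return how many bits are set.
 * Returns -1 if the set could not be allocated.
 */
int mark_cpus(int max_cpus, const int *cpus, int n)
{
    /* CPU_ALLOC is given a count of CPUs, i.e. a number of bits. */
    cpu_set_t *set = CPU_ALLOC(max_cpus);
    if (set == NULL) {
        return -1;
    }

    /* The _S macros want the size of the set in BYTES, not bits. */
    size_t set_bytes = CPU_ALLOC_SIZE(max_cpus);

    CPU_ZERO_S(set_bytes, set);
    for (int i = 0; i < n; i++) {
        CPU_SET_S(cpus[i], set_bytes, set);
    }

    int count = CPU_COUNT_S(set_bytes, set);
    CPU_FREE(set);
    return count;
}
```

The same byte size is what sched_setaffinity() and sched_getaffinity()
expect as their cpusetsize argument.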
2011-10-19 05:53:58 +00:00
Peter Wang
2ccac171dd Add float registers to the Mercury abstract machine, implemented as an
Branches: main

Add float registers to the Mercury abstract machine, implemented as an
array of MR_Float in the Mercury engine structure.

Float registers are only useful if a Mercury `float' is wider than a word
(i.e. when using double precision floats on 32-bit platforms) so we let them
exist only then.  In other cases floats may simply be passed via the regular
registers, as before.

Currently, higher order calls still require the use of the regular registers
for all arguments.  As all exported procedures are potentially the target of
higher order calls, exported procedures must use only the regular registers for
argument passing.  This can lead to more (un)boxing than if floats were simply
always boxed.  Until this is solved, float registers must be enabled explicitly
with the developer only option `--use-float-registers'.

The other aspect of this change is using two consecutive stack slots to hold a
single double variable.  Without that, the benefit of passing unboxed floats
via dedicated float registers would be largely eroded.
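
The idea of holding one double in two consecutive stack slots can be sketched
in C as follows.  This is only an illustration under the assumption of a
32-bit word; word_t, store_double and load_double are hypothetical names, not
the runtime's own macros.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit machine word, so that a double spans two words. */
typedef uint32_t word_t;

_Static_assert(sizeof(double) == 2 * sizeof(word_t),
    "this sketch assumes a double is exactly two words wide");

/* Store value unboxed across slots[i] and slots[i + 1]. */
void store_double(word_t *slots, size_t i, double value)
{
    memcpy(&slots[i], &value, sizeof value);
}

/* Reassemble the double held in slots[i] and slots[i + 1]. */
double load_double(const word_t *slots, size_t i)
{
    double value;
    memcpy(&value, &slots[i], sizeof value);
    return value;
}
```

Using memcpy avoids alignment and aliasing problems when the pair of slots
is not aligned as a double would be.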


compiler/options.m:
	Add developer option `--use-float-registers'.

compiler/handle_options.m:
	Disable `--use-float-registers' if floats are not wider than words.

compiler/make_hlds_passes.m:
	If `--use-float-registers' is in effect, enable a previous change that
	allows float constructor arguments to be stored unboxed in structures.

compiler/hlds_llds.m:
	Move `reg_type' here from llds.m and add `reg_f' option.

	Add stack slot width to `stack_slot' type.

	Add register type and stack slot width to `abs_locn' type.

	Remember next available float register in `abs_follow_vars'.

compiler/hlds_pred.m:
	Add register type to `arg_loc' type.

compiler/llds.m:
	Add a new kind of lval: double-width stack slots.
	These are used to hold double-precision floating point values only.

	Record setting of `--use-float-registers' in exprn_opts.

	Conform to addition of float registers and double stack slots.

compiler/code_info.m:
	Make predicates take the register type as an argument,
	where it can no longer be assumed.

	Remember whether float registers are being used.

	Remember max float register for calls to MR_trace.

	Count double width stack slots as two slots.

compiler/arg_info.m:
	Allocate float registers for procedure arguments when appropriate.

	Delete unused predicates.

compiler/var_locn.m:
	Make predicates working with registers either take the register type as
	an argument, or handle both register types at once.

	Select float registers for variables when appropriate.

compiler/call_gen.m:
	Explicitly use regular registers for all higher-order calls,
	which was implicit before.

compiler/pragma_c_gen.m:
	Use float registers, when available, at the interface between Mercury
	code and C foreign_procs.

compiler/export.m:
	Whether a float rval needs to be boxed/unboxed when assigned to/from a
	register depends on the register type.

compiler/fact_table.m:
	Use float registers for arguments to predicates defined by fact tables.

compiler/stack_alloc.m:
	Allocate two consecutive stack slots for float variables when
	appropriate.

compiler/stack_layout.m:
	Represent double-width stack slots in procedure layout structures.

	Conform to changes.

compiler/store_alloc.m:
	Allocate float registers (if they exist) for float variables.

compiler/use_local_vars.m:
	Substitute float abstract machine registers with MR_Float local
	variables.

compiler/llds_out_data.m:
compiler/llds_out_instr.m:
	Output float registers and double stack slots.

compiler/code_util.m:
compiler/follow_vars.m:
	Count float registers separately from regular registers.

compiler/layout.m:
compiler/layout_out.m:
compiler/trace_gen.m:
	Remember the max used float register for calls to MR_trace().

compiler/builtin_lib_types.m:
	Fix incorrect definition of float_type_ctor.

compiler/bytecode_gen.m:
compiler/continuation_info.m:
compiler/disj_gen.m:
compiler/dupelim.m:
compiler/exprn_aux.m:
compiler/global_data.m:
compiler/hlds_out_goal.m:
compiler/jumpopt.m:
compiler/llds_to_x86_64.m:
compiler/lookup_switch.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/par_conj_gen.m:
compiler/proc_gen.m:
compiler/string_switch.m:
compiler/tag_switch.m:
compiler/tupling.m:
compiler/x86_64_regs.m:
	Conform to changes.

runtime/mercury_engine.h:
	Add an array of fake float "registers" to the Mercury engine structure,
	when MR_Float is wider than MR_Word.

runtime/mercury_regs.h:
	Document float registers in the Mercury abstract machine.

	Add macros to access float registers in the Mercury engine.

runtime/mercury_stack_layout.h:
	Add new MR_LongLval cases to represent double-width stack slots.

	MR_LONG_LVAL_TAGBITS had to be increased to accommodate the new cases,
	which increases the number of integers in [0, 2^MR_LONG_LVAL_TAGBITS)
	equal to 0 modulo 4.  These are the new MR_LONG_LVAL_TYPE_CONS_n cases.

	Add max float register field to MR_ExecTrace.

runtime/mercury_layout_util.c:
runtime/mercury_layout_util.h:
	Extend MR_copy_regs_to_saved_regs and MR_copy_saved_regs_to_regs
	for float registers.

	Understand how to look up new kinds of MR_LongLval: MR_LONG_LVAL_TYPE_F
	(previously unused), MR_LONG_LVAL_TYPE_DOUBLE_STACKVAR,
	MR_LONG_LVAL_TYPE_DOUBLE_FRAMEVAR.

	Conform to the new MR_LONG_LVAL_TYPE_CONS_n cases.

runtime/mercury_float.h:
	Delete redundant #ifdef.

runtime/mercury_accurate_gc.c:
runtime/mercury_agc_debug.c:
	Conform to changes (untested).

trace/mercury_trace.c:
trace/mercury_trace.h:
trace/mercury_trace_declarative.c:
trace/mercury_trace_external.c:
trace/mercury_trace_internal.c:
trace/mercury_trace_spy.c:
trace/mercury_trace_vars.c:
trace/mercury_trace_vars.h:
	Handle float registers in the trace subsystem.  This is mostly a matter
	of saving/restoring them as with regular registers.
2011-10-17 04:31:33 +00:00
Paul Bone
75f961dedf Fix two bugs in the parallel runtime code.
One bug occurred when the master context, in MR_lc_finish(), would release the
contexts used by each of the slots.  The release code attempts to save state
from the engine back into the context, which is necessary most of the time.
However, in this case it saved state from the engine running the master
context into other contexts, so that when they were re-used they used an
invalid stack pointer.

Another bug was found in code with recursive parallel conjunctions.  Each
context structure contains a pointer to a code location, which is used as the
value for the instruction pointer when a context is resumed.  The
MR_join_and_continue operation for parallel conjunctions uses this resume point
to ensure that the master context for a parallel conjunction is only resumed if
it has become blocked and is ready to be resumed.  However, the field was never
cleared before, and it will always contain the same value when parallel
conjunctions are nested, as they will all have the same resume point.  This caused the master
context to be resumed before it had fully blocked, causing it to be resumed
with an invalid state.
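
The stale resume pointer problem can be sketched as below.  This is a
simplified, single-threaded model with hypothetical names (context,
block_master, should_resume_master); the real runtime code also involves
locking and engine state that are omitted here.

```c
#include <stddef.h>

typedef void (*code_ptr)(void);

/* Hypothetical, much-reduced context: only the resume point is modelled. */
typedef struct {
    code_ptr resume;    /* where to continue when the context is resumed */
} context;

/* Stand-in for the label that follows a parallel conjunction's barrier. */
void demo_join_label(void) { }

/* The master records its resume point when it blocks at the barrier. */
void block_master(context *master, code_ptr join_label)
{
    master->resume = join_label;
}

/*
 * Called by the last conjunct to finish.  The master may be resumed only
 * if it has blocked at this barrier.  Clearing the field on the way out
 * stops a nested conjunction with the same join label from mistaking the
 * old value for a freshly blocked master.
 */
int should_resume_master(context *master, code_ptr join_label)
{
    if (master->resume == join_label) {
        master->resume = NULL;
        return 1;
    }
    return 0;
}
```

Without the reset, the second query in a sequence "block, query, query" would
also report that the master is ready, even though it has not blocked again.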

A potential bug was found where a field should have been volatile to prevent
the compiler from caching its value when doing so would not be safe.

Widen a couple of critical sections as they didn't quite protect against some
race conditions.  This is another potential cause of bugs.

runtime/mercury_par_builtin.[ch]:
    Make the master_context field of the loop control structure volatile so
    that the compiler doesn't cache its value.

    Make the last worker to finish take a lock earlier, to ensure that the
    master context won't be left waiting forever.

    Add a comment explaining why a context must not be saved before calling
    MR_destroy_context().

    Improve debugging code to print out the value of the stack or parent stack
    pointer, depending on the code in question.

    Make the lock in MR_lc_finish() wider, so that the lock is held when the
    code checks to see if it should block.

runtime/mercury_context.c:
    MR_destroy_context() no longer saves the context before releasing it.

    MR_destroy_context() no longer sets the MR_ctxt_resume_owner_engine field
    of the context since it's not currently used.

    MR_join_and_continue(), the barrier for parallel conjunctions, now resets
    the resume code pointer of the master context when it switches to it.

runtime/mercury_context.h:
    Described the reason why the context must be saved before it is
    destroyed/released.

runtime/mercury_context.c:
runtime/mercury_engine.c:
    Call MR_save_context() before calling MR_destroy_context()
2011-10-16 03:34:40 +00:00
Paul Bone
1b8f22c0a2 Fix a crash in loop-controlled code.
Loop controlled code would sometimes find, while signaling a future,
that the future had already been signaled and had a different value.

The problem was that the MR_lc_join_and_terminate operation would only
save the context after marking the loop control slot (that owned the
context) as free.  The save context code and the MR_lc_spawn_off code
both write the parent stack pointer field in the context structure.  If
these (or other cases where the context is saved - such as when it
blocks on a future) race, then they can cause problems when the spawned
off computation uses its parent stack pointer.

runtime/mercury_par_builtin.h:
    MR_lc_join_and_terminate now saves the context before calling
    MR_lc_join.
2011-10-14 00:53:27 +00:00
Paul Bone
a071eaba53 Improve thread pinning:
    + Pins threads intelligently on SMT systems by balancing threads among
      cores.
    + Performs fewer migrations when pinning threads (if a thread's current
      CPU is a valid CPU for pinning, then it is not migrated).
    + Handles cases where the user requests more threads than available CPUs.
    + Handles cases where the process is restricted to a subset of CPUs by its
      environment (for instance, Linux cpuset(7)).

This is largely made possible by the hwloc library
(http://www.open-mpi.org/projects/hwloc/).  However, hwloc is not required: the
runtime system will fall back to sched_setaffinity(), though it will be less
intelligent with respect to SMT.
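
Balancing threads among cores on an SMT system can be sketched as follows.
The sketch assumes a hypothetical CPU numbering in which the hardware threads
of one core are adjacent (cpu = core * threads_per_core + sibling); real
topologies vary, which is one reason to query a library such as hwloc.  The
function pick_cpu is illustrative and is not the runtime's algorithm.

```c
/*
 * Choose a CPU for the thread_index'th thread so that each core receives
 * one thread before any core receives a second one.  Thread indices
 * beyond the number of CPUs wrap around.
 */
int pick_cpu(int thread_index, int num_cores, int threads_per_core)
{
    int core = thread_index % num_cores;
    int sibling = (thread_index / num_cores) % threads_per_core;

    /* Assumed numbering: SMT siblings of a core have adjacent CPU ids. */
    return core * threads_per_core + sibling;
}
```

With 4 cores and 2 hardware threads per core, the first four threads land on
four different cores, and only the fifth shares a core with the first.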

runtime/mercury_context.h:
runtime/mercury_context.c:
    Do thread pinning either via hwloc or sched_setaffinity.  Previously only
    sched_setaffinity was used.

    Update thread-pinning algorithm, this:

    Include the general thread pinning code only if MR_HAVE_THREAD_PINNING is
    defined.

    Use a combination of sysconf and sched_getaffinity to detect the number of
    processors when hwloc isn't available.  This makes the runtime compatible
    with Linux cpuset(7) when hwloc isn't available.

configure.in:
Mmake.common.in:
    Detect presence of the hwloc library.

configure.in:
    Detect sched_getaffinity()

aclocal.m4:
acinclude.m4:
    Move aclocal.m4 to acinclude.m4, the aclocal program will build aclocal.m4
    and retrieve macros from the system and the contents of acinclude.m4.

Mmakefile:
    Create a make target for aclocal.m4.

runtime/Mmakefile:
    Link the runtime with libhwloc in low-level C parallel grades.

    Include CFLAGS for libhwloc.

scripts/ml.in:
    Link programs and libraries with libhwloc in low-level C parallel grades.

runtime/mercury_conf.h.in:
    Define MR_HAVE_HWLOC when it is available.

    Define MR_HAVE_SCHED_GETAFFINITY when it is available.

runtime/mercury_conf_param.h:
    Define MR_HAVE_THREAD_PINNING if either hwloc or [sched_setaffinity and
    sched_getaffinity] are available.

runtime/mercury_thread.c:
runtime/mercury_wrapper.c:
    Only call MR_pin_thread and MR_pin_primordial_thread if
    MR_HAVE_THREAD_PINNING is defined.

runtime/mercury_thread.h:
runtime/mercury_context.h:
    Move the declaration of MR_pin_primordial_thread to mercury_context.h from
    mercury_thread.h since its definition is in mercury_context.c.

    Require MR_HAVE_THREAD_PINNING for the declaration of
    MR_pin_primordial_thread.

runtime/mercury_wrapper.c:
    Conform to changes in mercury_context.h

INSTALL_CVS:
tools/test_mercury:
    Run aclocal at the right times while testing Mercury.
2011-10-13 02:42:21 +00:00
Paul Bone
2efb78955e The loop control transformation now works.
This patch commits the code-generator parts of the loop control transformation.
It also makes corrections and changes to the source-to-source, runtime and
library parts of the transformation.

Preliminary results look good: loop-controlled right-recursive dependent code
performs as fast as independent right-recursive code, and it does so using the
minimum number of contexts (8 on apollo (an i7)).  Previously, when
transforming code by hand, we needed 32 contexts on a 4 core system (taura).
The reason for this is that we changed our design so that the master context
would become blocked if there was no free slot.  This ensures that once a
worker finishes its current job new work is either already available or can be
made available promptly.

compiler/par_conj_gen.m:
compiler/code_gen.m:
    Generate code for the new loop_control scope.

compiler/llds_out_instr.m:
    Write out the lc_spawn_off instruction correctly.

compiler/code_info.m:
    Add support for storing out-of-line code in the code_info structure.

compiler/proc_gen.m:
    After generating a procedure's body add any out-of-line code stored in the
    code_info structure onto the end of the procedure (after the exit code).

compiler/par_loop_control.m:
    Add missing parts to the loop control transformation:
        + Add the barrier in the base case.
        + Transform non-parallel recursive calls.
        + Add a join_and_terminate call to the end of the forked-off code.

    Make minor corrections to comments.

runtime/mercury_par_builtin.h:
runtime/mercury_par_builtin.c:
    MR_lc_wait_free_slot and MR_lc_spawn_off no longer mangle the labels they
    are passed.

    Fix a typo that caused a bug.

    Add debugging code.

library/par_builtin.m:
    Store the value of LC in a stack slot during lc_wait_for_slot.  This makes
    sure it is available in the case that lc_wait_for_slot suspends the
    context.

    Remove the loop_control_slot type, we now use integers to represent the
    position of a slot within a loop control structure.
2011-10-09 09:56:20 +00:00
Paul Bone
a79339b0f3 Addressed Zoltan's review comments on the loop control primitives.
runtime/mercury_par_builtin.[ch]:
    Corrected some comments, cleaning up unclear prose and also correcting
    content.

    Use a hint for the next free slot in the loop control structure.  This will
    ensure that free slots are found more quickly.

    Corrected a case where MR_fatal_error would have been called when there
    was no error.
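
A free-slot hint of the kind described above might look like the following
sketch.  The names (slot_table, claim_slot, release_slot) and the fixed slot
count are hypothetical, and the locking used by the real loop control
structure is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_SLOTS 8

typedef struct {
    bool   in_use[NUM_SLOTS];
    size_t free_hint;       /* index at which the next search starts */
} slot_table;

/* Claim a free slot, searching from the hint.  Returns -1 if all are busy. */
int claim_slot(slot_table *t)
{
    for (size_t n = 0; n < NUM_SLOTS; n++) {
        size_t i = (t->free_hint + n) % NUM_SLOTS;
        if (!t->in_use[i]) {
            t->in_use[i] = true;
            t->free_hint = (i + 1) % NUM_SLOTS;
            return (int) i;
        }
    }
    return -1;
}

/* Free a slot and remember it, so the next search finds it at once. */
void release_slot(slot_table *t, size_t i)
{
    t->in_use[i] = false;
    t->free_hint = i;
}
```

The hint is only an optimisation: the search still visits every slot in the
worst case, so a stale hint costs time but not correctness.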
2011-10-04 03:20:07 +00:00
Zoltan Somogyi
eccc863e7d Add comments about my recent design decision about the representation
Estimated hours taken: 0.1
Branches: main

runtime/mercury_stack_layout.h:
trace/mercury_trace_declarative.c:
	Add comments about my recent design decision about the representation
	of goal paths.
2011-09-30 05:24:28 +00:00
Zoltan Somogyi
10d6f4c2e2 Remove a bunch of long-obsolete macros. Their job was to define
Estimated hours taken: 0.1
Branches: main

runtime/mercury_stack_layout.h:
	Remove a bunch of long-obsolete macros. Their job was to define
	or to declare individual label layouts, but for a long time now
	we have put all label layouts into arrays.
2011-09-28 07:06:23 +00:00
Zoltan Somogyi
41d6836024 The first part of my post-commit review of Paul's loop control diff,
Estimated hours taken: 0.5
Branches: main

The first part of my post-commit review of Paul's loop control diff,
covering everything except the transformation.

compiler/goal_util.m:
	Remove the new expand_plain_conj predicate Paul just added,
	since it exactly duplicates the existing goal_to_conj_list.

compiler/par_loop_control.m:
	Conform to the above.

runtime/mercury_par_builtin.h:
	Fix a bug introduced by Paul's diff: the extendable array MUST be
	the last slot in the MR_LoopControl structure.

	Fix some of the documentation and the formatting.

runtime/mercury_par_builtin.c:
	Fix some of the documentation and the formatting.

	Add some XXXs.
2011-09-27 06:22:49 +00:00
Paul Bone
58e305e4c0 Implement the source-to-source part of the loop control transformation. The
remaining part is the code generation for code that is to be spawned off.  It
must be handled in the code generator since it uses the parent stack pointer in
many cases.

I'm committing this now so that Zoltan can begin to review it while I work on
the code generator component.

compiler/par_loop_control.m:
    This new file contains the source-to-source part of the parallel loop
    control transformation.

compiler/transform_hlds.m:
    Include the par_loop_control module within the transform_hlds module.

compiler/mercury_compile_middle_passes.m:
    Call the loop control transformation at stage 206 - after the dependent
    parallel conjunction transformation.

    Move the last call optimisation pass from stage 175 to 206 since it will
    most likely prevent loop control from working.  Where both transformations
    are applicable, the loop control transformation is preferred.

compiler/options.m:
    Add new options for loop control.

compiler/handle_options.m:
    Disable loop control if we're not in a grade that supports parallel
    conjunctions.

    Other tests that should have been testing for parallel conjunction support
    but only tested parallel support have been fixed.

compiler/hlds_goal.m:
    Add the feature_do_not_tailcall feature.

compiler/call_gen.m:
    Mark LLCS call goals that may not have last call optimisation applied to
    them if they have the feature_do_not_tailcall feature set in their HLDS
    info.

compiler/goal_util.m:
    Create a new predicate expand_plain_conj, this returns a list of the sub
    goals of a plain conjunction, or returns the goal in a singleton list.
    XXX: Could someone review the name of this predicate.

compiler/hlds_pred.m:
    Add a symbol for the new transformation in the pred_transformation type.

    Corrected a comment to match the arguments in the predicate it refers to.

compiler/prog_util.m:
    Add support to make_pred_name for creating names for loop control
    predicates.

compiler/dep_par_conj.m:
    Fix grammar in a comment.

compiler/saved_vars.m:
    Conform to the change in hlds_goal.m

compiler/layout_out.m:
    Conform to the change in hlds_pred.m

runtime/mercury_par_builtin.[ch]:
    Add support for lc_wait_free_slot/2, the blocking version of
    lc_get_free_slot/2.  This means that other loop control builtins have
    changed, for instance, lc_join_and_terminate/2 must wake up a context
    blocked in lc_wait_free_slot/2 after making the slot it was using free.

    Use a spin lock in the loop control structure rather than a POSIX mutex.

runtime/mercury_wrapper.[ch]:
    Add support for a runtime variable, the number of contexts per loop control.
    This can be controlled with a MERCURY_OPTIONS option.

mdbcomp/program_representation.m:
    Include lc_wait_free_slot/2 in the list of external predicates.

mdbcomp/mdbcomp.goal_path.m:
    Add two new predicates goal_path_remove_first/3 and goal_path_get_first/2.

library/par_builtin.m:
    Add new builtins to support the loop control transformation:

        lc_wait_free_slot/2 will block the context until a new slot is
        available.

        lc_default_num_contexts/1 will return the number of contexts to use, by
        default, for a loop-controlled loop.

    Add myself as an author of this module.

doc/user_guide.texi:
    Document the runtime --num-contexts-per-lc-per-thread option.  It is
    currently commented out since it is not intended for users, at least for
    now.

    Document the loop control options for the compiler.

---

The change below was written by Zoltan; I reviewed it when I applied his diff
to my workspace.

Allow the compiler to mark calls in the LLDS as calls that cannot have last
call optimization applied to them. Paul will soon need this capability
in order to implement parallel conjunctions in which earlier conjuncts
are spawned off, and later conjuncts contain recursive calls, but the
earlier conjuncts need the stack frame.

compiler/llds.m:
        Add a flag to det and semi calls. (Model_non calls have had a similar
        flag for a long time, for a totally different reason.)

compiler/call_gen.m:
        By default, say that det and semi calls may have LCO applied to them.

compiler/jumpopt.m:
        Apply LCO to det and semi calls only if this flag allows it.

compiler/opt_debug.m:
        Include the flag in debugging dumps.
2011-09-27 00:49:27 +00:00
Zoltan Somogyi
a83aad6681 Remove references to nondet foreign_proc from the definition of the data
Estimated hours taken: 2
Branches: main

Remove references to nondet foreign_proc from the definition of the data
structures that define stack layouts.

runtime/mercury_stack_layout.h:
	Remove the trace ports that could occur in nondet foreign_procs
	from the definition of the trace port type used in C code.

mdbcomp/prim_data.m:
	Remove the trace ports that could occur in nondet foreign_procs
	from the definition of the trace port type used in Mercury code.

compiler/layout_out.m:
compiler/stack_layout.m:
compiler/trace_params.m:
mdbcomp/trace_counts.m:
runtime/mercury_trace_base.h:
trace/mercury_trace_declarative.h:
	Delete references to those ports.

runtime/mercury_stack_layout.h:
	Update the binary compatibility version number for debuggable
	executables, since the port number of user events has changed.
2011-09-26 04:30:48 +00:00
Zoltan Somogyi
8e3ead5903 Reduce the size of the string tables in debuggable executables by encoding
Estimated hours taken: 6
Branches: main

Reduce the size of the string tables in debuggable executables by encoding
variable names that fit a few standard templates, the most important of which
is STATE_VARIABLE_name_number.

The effect on the compiler is to reduce the string table size from about
3.1Mb to about 2.1Mb, which is about a 30% reduction.
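
To illustrate the kind of saving involved, here is a sketch of one possible
template encoding.  The scheme shown (a single marker byte standing in for
the STATE_VARIABLE_ prefix) is an assumption made for this example only; the
commit does not spell out the actual encoding, and encode_name and
decode_name are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

#define SV_PREFIX "STATE_VARIABLE_"
#define SV_MARKER '\001'    /* assumed marker byte, not the real format */

/* Write the stored form of name into out; return its length. */
size_t encode_name(const char *name, char *out, size_t out_size)
{
    size_t prefix_len = strlen(SV_PREFIX);

    if (strncmp(name, SV_PREFIX, prefix_len) == 0) {
        /* Replace the common prefix with one marker byte. */
        return (size_t) snprintf(out, out_size, "%c%s",
            SV_MARKER, name + prefix_len);
    }
    return (size_t) snprintf(out, out_size, "%s", name);
}

/* Expand a stored name back to the full variable name. */
size_t decode_name(const char *stored, char *out, size_t out_size)
{
    if (stored[0] == SV_MARKER) {
        return (size_t) snprintf(out, out_size, "%s%s",
            SV_PREFIX, stored + 1);
    }
    return (size_t) snprintf(out, out_size, "%s", stored);
}
```

Under this assumed scheme a 20-character name such as STATE_VARIABLE_IO_12
is stored in 6 bytes, while names that fit no template are stored unchanged.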

compiler/stack_layout.m:
	Look for the names fitting the patterns in variable names, and encode
	them.

runtime/mercury_stack_layout.[ch]:
	Add a function for looking up variable names, decoding them if needed.

	Since goal paths cannot fit any of the patterns, access them without
	using that function.

mdbcomp/rtti_access.m:
	Use the new function to retrieve variable names.

runtime/mercury_grade.h:
	Increment the debugging compatibility version number, since debuggable
	executables in which some modules were produced by a compiler without
	this diff and some were produced by a compiler with this diff won't
	work together.
2011-09-26 04:29:37 +00:00
Zoltan Somogyi
605b11598f Fix layout.
Estimated hours taken: 0.1
Branches: main

runtime/mercury_trace_base.c:
	Fix layout.
2011-09-21 09:31:53 +00:00
Zoltan Somogyi
9f55ffa28a Fix typos in comments.
Estimated hours taken: 0.1
Branches: main

runtime/mercury_threadscope.c:
	Fix typos in comments.
2011-09-21 07:59:39 +00:00
Julien Fischer
afaea7c1ba Fix the condition protecting the definition of MR_GC_MALLOC_INLINE,
Branches: 11.07, main

runtime/mercury.h:
	Fix the condition protecting the definition of MR_GC_MALLOC_INLINE,
	since we are calling the Boehm collector directly we require
 	MR_BOEHM_GC to be defined, not just MR_CONSERVATIVE_GC.

	MR_GC_MALLOC_ATOMIC does not exist; use GC_MALLOC_ATOMIC instead.

	Add a couple of XXXs regarding the definition of MR_new_object_atomic
	in the case where inline allocation is enabled.
2011-09-19 08:46:23 +00:00
Julien Fischer
5b1105b6a3 Avoid failures in the namespace cleanliness check in .par grade on MinGW.
Branches: main, 11.07

Avoid failures in the namespace cleanliness check in .par grade on MinGW.

*/RESERVED_MACRO_NAMES:
	Add some macros automatically defined by GCC on MinGW.
2011-09-14 07:00:44 +00:00
Julien Fischer
47a7aee96b Avoid warnings about functions that don't return in the runtime
Branches: main, 11.07

Avoid warnings about functions that don't return in the runtime
with MSVC.

Avoid a warning in the configure script with MSVC.

configure.in:
	The cygpath tool is only required with MSVC when using
	Cygwin as the build environment; don't emit an error message
	about this on other systems, e.g. MinGW.

runtime/mercury_std.h:
	Redefine MR_NO_RETURN so that it works with both GCC/Clang
	and Visual C.

runtime/mercury_misc.h:
runtime/mercury_engine.c:
	Conform to the above change to MR_NO_RETURN.

runtime/mercury_bootstrap.h:
	Delete the redefinition of NO_RETURN; any code that still
	uses it is not going to work for a variety of other reasons.
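
One common way to write a no-return annotation that works with GCC, Clang and
Visual C is to wrap the whole declaration in a macro, since the attribute
goes after the declaration for GCC/Clang and before it for Visual C.  The
sketch below uses a hypothetical macro MY_NO_RETURN; it is not a copy of the
definition in mercury_std.h.

```c
#include <stdio.h>
#include <stdlib.h>

#if defined(_MSC_VER)
  /* Visual C: the specifier precedes the declaration. */
  #define MY_NO_RETURN(decl) __declspec(noreturn) decl
#elif defined(__GNUC__) || defined(__clang__)
  /* GCC and Clang: the attribute may follow the declarator. */
  #define MY_NO_RETURN(decl) decl __attribute__((noreturn))
#else
  /* Unknown compiler: fall back to a plain declaration. */
  #define MY_NO_RETURN(decl) decl
#endif

MY_NO_RETURN(void fatal(const char *msg));

void fatal(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* No "missing return" warning: the compiler knows fatal() never returns. */
int checked_div(int a, int b)
{
    if (b == 0) {
        fatal("division by zero");
    }
    return a / b;
}
```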
2011-09-12 16:29:55 +00:00
Zoltan Somogyi
742800c5da Post-commit review of Paul's change introducing the loop control primitives.
Estimated hours taken: 1
Branches: main

Post-commit review of Paul's change introducing the loop control primitives.
It also updates some documentation Paul's update did not touch.

library/par_builtin.m:
runtime/mercury_atomic_ops.h:
runtime/mercury_context.h:
	Fix formatting and grammar.

runtime/mercury_par_builtin.[ch]:
	Use a variable length array in the loop control struct to store
	the loop control slots. This setup needs one load to access a slot,
	compared to two with the previous arrangement.

	Fix formatting and grammar.

	Add XXXs where relevant.
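
The one-load-versus-two point can be sketched with a C99 flexible array
member, which is one way to get a variable length array at the end of a
struct.  The type and function names here (lc_slot, loop_control, lc_create)
are hypothetical and much simpler than the runtime's MR_LoopControl.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int   in_use;
    void *context;
} lc_slot;

typedef struct {
    unsigned num_slots;
    /*
     * Flexible array member: the slots live in the same allocation as
     * the header, so slot i is at a fixed offset from the struct's own
     * address.  It must be the last member of the struct.
     */
    lc_slot  slots[];
} loop_control;

/* Allocate the header and num_slots slots in a single block. */
loop_control *lc_create(unsigned num_slots)
{
    loop_control *lc = malloc(sizeof *lc + num_slots * sizeof(lc_slot));
    if (lc == NULL) {
        return NULL;
    }
    lc->num_slots = num_slots;
    for (unsigned i = 0; i < num_slots; i++) {
        lc->slots[i].in_use = 0;
        lc->slots[i].context = NULL;
    }
    return lc;
}
```

Had the struct held a pointer to a separately allocated array instead,
reaching a slot would first require loading that pointer and then loading
the slot itself.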
2011-09-12 08:09:24 +00:00
Paul Bone
ea9eb7a654 Introduce loop control runtime code.
runtime/mercury_par_builtin.h:
runtime/mercury_par_builtin.c:
    Introduce loop control runtime code.

runtime/mercury_context.h:
    Introduce a new macro to tune the size of contexts that are used as
    workers by the loop control runtime.  This is set to the same context size
    as for sparks.

runtime/mercury_context.c:
    Fixed a typo in a comment.

library/par_builtin.m:
    Create predicate versions of the par builtin macros runtime code.  The only
    primitive without a predicate version is MR_lc_spawn_off which cannot be
    expressed in Mercury and needs support from the LLDS stage in the compiler.

mdbcomp/program_representation.m:
    Add par_builtin.lc_finish/1 as an externally defined predicate.  This tells
    the debugger not to expect any events for it.
2011-09-12 04:51:17 +00:00
Paul Bone
7c086e8dbe ThreadScope updates.
An event described in our ThreadScope paper had not been added to the runtime
system.  This event announces that an engine is attempting to find work in the
form of a local spark.

This change also introduces a hierarchy of events, where one event 'extends'
another existing event.  We use this for Mercury's spark events which contain
spark IDs in their payloads.  These extend GHC's spark events.

Other changes have been made to ensure that Mercury conforms with the
ghc-events library, which is used by the ThreadScope tool.

runtime/mercury_threadscope.h:
runtime/mercury_threadscope.c:
    Add support for the LOOKING_FOR_LOCAL_SPARK event.

    Re-number the CALLING_MAIN event to make a Mercury specific event.

    Re-number the STRING event.

    Re-name the STRING event, it is now INTERN_STRING.

    No longer use the deprecated SPARK_RUN and SPARK_STEAL events, instead use
    the new events and create Mercury specific events that extend these events.

    The Mercury-specific SPARKING event has been renamed to SPARK_CREATE and
    now extends the base SPARK_CREATE event.

    Made a correction to a comment.

runtime/mercury_context.c:
    Post the LOOKING_FOR_LOCAL_SPARK event.
2011-09-08 01:53:08 +00:00
Peter Wang
257efbd678 Store double-precision `float' constructor arguments in unboxed form,
Branches: main

Store double-precision `float' constructor arguments in unboxed form,
in high-level C grades on 32-bit platforms, i.e. `float' (and equivalent)
arguments may occupy two machine words.

As the C code generated by the MLDS back-end makes use of MR_Float variables
and parameters, float (un)boxing may be reduced substantially in many programs.
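
The effect of double-word arguments on field layout can be sketched as
follows.  The enum and function names are hypothetical, and the sketch
ignores other complications such as arguments packed into part of a word.

```c
#include <stddef.h>

/* Hypothetical argument widths, mirroring the idea of `double_word'. */
typedef enum { ARG_FULL_WORD, ARG_DOUBLE_WORD } arg_width;

/*
 * Fill offsets[i] with the word offset of argument i within the cell,
 * and return the total number of words the cell needs.
 */
size_t compute_offsets(const arg_width *widths, size_t n, size_t *offsets)
{
    size_t next = 0;

    for (size_t i = 0; i < n; i++) {
        offsets[i] = next;
        /* An unboxed double-precision float takes two consecutive words. */
        next += (widths[i] == ARG_DOUBLE_WORD) ? 2 : 1;
    }
    return next;
}
```

For a constructor with arguments (int, float, int) on a 32-bit platform this
gives offsets 0, 1 and 3, and a cell of four words.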

compiler/prog_data.m:
	Add `double_word' as a new option for constructor argument widths,
	only used for float arguments as yet.

compiler/make_hlds_passes.m:
	Set constructor arguments to have `double_word' width if required,
	and possible.

compiler/type_util.m:
	Add helper predicate.

compiler/builtin_ops.m:
compiler/c_util.m:
compiler/llds.m:
	Add two new binary operators used by the MLDS back-end.

compiler/arg_pack.m:
	Handle `double_word' arguments.

compiler/ml_code_util.m:
	Deciding whether or not a float constructor argument requires boxing
	now depends on the width of the field.

compiler/ml_global_data.m:
	When a float constant appears as an initialiser of a generic array
	element, it is now always unboxed, irrespective of --unboxed-float.

compiler/ml_type_gen.m:
	Take double-word arguments into account when generating structure
	fields.

compiler/ml_unify_gen.m:
	Handle double-word float constructor arguments in (de)constructions.
	In some cases we break a float argument into its two words, so
	generating two assignment statements or two separate rvals.

	Take double-word arguments into account when calculating field offsets.

compiler/mlds_to_c.m:
	The new binary operators require no changes here.

	As a special case, write `MR_float_from_dword_ptr(&X)' instead of
	`MR_float_from_dword(X, Y)' when X, Y are consecutive words within a
	field. The definition of `MR_float_from_dword_ptr' is more
	straightforward, and gcc produces better code than if we use the more
	general `MR_float_from_dword'.

compiler/rtti_out.m:
	For double-word arguments, generate MR_DuArgLocn structures with
	MR_arg_bits set to -1.

compiler/rtti_to_mlds.m:
	Handle double-word arguments in field offset calculation.

compiler/unify_gen.m:
	Partially handle double_word arguments in LLDS back-end.

compiler/handle_options.m:
	Set --unboxed-float when targeting Java, C# and Erlang.

compiler/structure_reuse.direct.choose_reuse.m:
	Rename a predicate.

compiler/bytecode.m:
compiler/equiv_type.m:
compiler/equiv_type_hlds.m:
compiler/llds_to_x86_64.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/opt_debug.m:
	Conform to changes.

library/construct.m:
library/store.m:
	Handle double-word constructor arguments.

runtime/mercury_conf.h.in:
	Clarify what `MR_BOXED_FLOAT' now means.

runtime/mercury_float.h:
	Add helper macros for converting between doubles and word/dwords.

runtime/mercury_deconstruct.c:
runtime/mercury_deconstruct.h:
	Add a macro `MR_arg_value' and a helper function to extract a
	constructor argument value.  This replaces `MR_unpack_arg'.

runtime/mercury_type_info.h:
	Remove `MR_unpack_arg'.

	Document that MR_DuArgLocn.MR_arg_bits may be -1.

runtime/mercury_deconstruct_macros.h:
runtime/mercury_deep_copy_body.h:
runtime/mercury_ml_arg_body.h:
runtime/mercury_table_type_body.h:
runtime/mercury_tabling.c:
runtime/mercury_type_info.c:
	Handle double-word constructor arguments.

tests/hard_coded/Mercury.options:
tests/hard_coded/Mmakefile:
tests/hard_coded/lco_double.exp:
tests/hard_coded/lco_double.m:
tests/hard_coded/pack_args_float.exp:
tests/hard_coded/pack_args_float.m:
	Add test cases.

trace/mercury_trace_vars.c:
	Conform to changes.
2011-09-06 05:20:45 +00:00
Julien Fischer
e2aede6f47 Define MR_NO_RETURN for clang.
Branches: main, 11.07

runtime/mercury_std.h:
	Define MR_NO_RETURN for clang.
2011-08-29 05:45:52 +00:00
Julien Fischer
b31ff32593 Support more of the handwritten atomic ops with clang.
Branches: main, 11.07

runtime/mercury_atomic_ops.h:
	Support more of the handwritten atomic ops with clang.
2011-08-26 14:10:04 +00:00
Julien Fischer
9033da0777 Fix some dodgy spacing.
Branches: main, 11.07

runtime/mercury_atomic_ops.h:
	Fix some dodgy spacing.
2011-08-26 11:09:33 +00:00
Julien Fischer
4486c39aeb Make none.par.gc bootstrap with clang (2.8.0) on Linux.
Branches: main, 11.07

Make none.par.gc bootstrap with clang (2.8.0) on Linux.

runtime/mercury_atomic_ops.h:
	Define MR_ATOMIC_DEC_INT_BODY and MR_ATOMIC_DEC_AND_IS_ZERO_WORD_BODY
	for clang - we use the same inline assembler definitions that are used
	for GCC.
2011-08-26 08:02:05 +00:00
Julien Fischer
870f70a1c3 Make hlc.par.gc bootstrap with clang on Linux.
Branches: main, 11.07

Make hlc.par.gc bootstrap with clang on Linux.

runtime/mercury_atomic_ops.h:
	Use the GCC definitions of MR_COMPARE_AND_SWAP_WORD_BODY and
	MR_CPU_SFENCE with clang.
2011-08-26 06:19:34 +00:00
Peter Wang
573e6f2f00 Support unboxed float fields in high-level C grades.
Branches: main

Support unboxed float fields in high-level C grades.

When the representation of `float' is no wider than a machine word, d.u.
functor arguments of type `float' (or equivalent) will be stored directly
within cells constructed for that functor, instead of a pointer to the box
containing the value.  This was already so for low-level C grades.

compiler/mlds.m:
	Add an option to mlds_type, equivalent to
	`mlds_array_type(mlds_generic_type)' except that some elements are
	known to be floats.

	Update some comments.

compiler/ml_global_data.m:
	Remember the `--unboxed-float' option in `ml_global_data'.

	Special case generic arrays in `ml_gen_static_scalar_const_addr' and
	`ml_gen_static_scalar_const_value'.  Float literals cannot be used to
	initialize an element of a generic array in C.  If any appear, replace
	the generic array type by an instance of
	`mlds_mostly_generic_array_type' with float fields in the positions
	which have float initializers.

compiler/ml_code_util.m:
	Make `ml_must_box_field_type' and `ml_gen_box_const_rval' depend on the
	`--unboxed-float' option.

	Delete some now-misleading comments.

	Delete an unused predicate.

compiler/mlds_to_c.m:
	Update code that writes out scalar static data to handle
	`mlds_mostly_generic_array_type'.

	In one case, for `--high-level-data' only, output float constants by
	their integer representation, so that they may be cast to pointer
	types.

compiler/ml_unify_gen.m:
	Rename some predicates for clarity.

compiler/ml_accurate_gc.m:
compiler/ml_lookup_switch.m:
compiler/ml_proc_gen.m:
compiler/ml_simplify_switch.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
	Conform to changes.

library/float.m:
	Add hidden functions to return the integer representation of the bit
	layout of floating point values.

library/exception.m:
	Delete mention of MR_AVOID_MACROS.

runtime/mercury.c:
runtime/mercury.h:
	Make MR_box_float/MR_unbox_float act like "casts" when MR_BOXED_FLOAT
	is undefined, and only define them in high-level grades.  I think they
	should be replaced by MR_float_to_word/MR_word_to_float (which have
	less confusing names when there is no boxing) but that would require
	some header file reshuffling which I don't want to undertake yet.

	Delete references to MR_AVOID_MACROS.  Apparently it existed to support
	the defunct gcc back-end but I cannot see it ever being defined.

runtime/mercury_conf_param.h:
	MR_HIGHLEVEL_CODE no longer implies MR_BOXED_FLOAT.

	Delete mention of MR_AVOID_MACROS.

runtime/mercury_float.h:
	Fix a comment.

tests/hard_coded/Mmakefile:
tests/hard_coded/float_ground_term.exp:
tests/hard_coded/float_ground_term.m:
	Add a test case.
2011-08-22 07:56:10 +00:00
Julien Fischer
ecbd4773d6 Respond to review comments from Paul.
Branches: main, 11.07

Respond to review comments from Paul.

runtime/mercury_conf_param.h:
	Fix some spacing.

runtime/mercury_std.h:
	Fix s/MR_GNUC/__GNUC__/ in a comment.
2011-08-11 05:27:18 +00:00
Julien Fischer
d871f74da9 Don't use MR_GNUC here since we don't #include the usual
Branches: main, 11.07

runtime/mercury_getopt.c:
	Don't use MR_GNUC here since we don't #include the usual
	Mercury headers here and it will always be undefined.
2011-08-09 11:07:37 +00:00
Julien Fischer
6db56dea42 Use MR_GNUC in place of __GNUC__ in some spots.
Branches: main, 11.07

runtime/mercury_string.h:
runtime/mercury_types.h:
	Use MR_GNUC in place of __GNUC__ in some spots.
2011-08-02 08:28:19 +00:00
Zoltan Somogyi
b4092d2e4e Further improvements in the implementation of string switches, along with
Estimated hours taken: 12
Branches: main

Further improvements in the implementation of string switches, along with
some bug fixes.

If the chosen hash function does not yield any collisions for the strings
in the switch arms, then we can optimize away the table column that we would
otherwise need for open addressing. This was implemented in a previous diff.

For an ordinary (non-lookup) string switch, the hash table has two columns
in the presence of collisions and one column in their absence. Therefore if
doubling the size of the table allows us to eliminate collisions, the table
size is unaffected, though the corresponding array of labels we have to put
into the computed_goto instruction we generate has to double as well.
Thus the only cost of such doubling is an increase in "code" size, and
for small tables, the elimination of the open addressing loop may compensate
for this, at least partially.

For lookup string switches, doubling the table size this way has a bigger
space cost, but the elimination of the open addressing loop still brings
a useful speed boost.

We therefore now DO double the table size if this eliminates collisions.
In the library, compiler etc directories, this eliminates collisions in
19 out of 47 string switches that had collisions with the standard table size.
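
The size decision described above can be sketched in C roughly as follows. The hash function is a toy one chosen for illustration, not one of Mercury's string hash functions, and `has_collision`, `choose_table_size` and `MAX_SLOTS` are hypothetical names; table sizes are assumed to be powers of two.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 256   /* upper bound on table size in this sketch */

/* Toy string hash, for illustration only. */
static unsigned long toy_hash(const char *s)
{
    unsigned long h = 5381;

    while (*s != '\0') {
        h = h * 33 + (unsigned char) *s++;
    }
    return h;
}

/* Return 1 if two keys land in the same slot of a table with `size`
 * slots, and 0 otherwise. `size` must be a power of two. */
static int has_collision(const char *const *keys, size_t num_keys,
    size_t size)
{
    unsigned char used[MAX_SLOTS];
    size_t i;

    assert(size <= MAX_SLOTS && (size & (size - 1)) == 0);
    memset(used, 0, sizeof(used));
    for (i = 0; i < num_keys; i++) {
        size_t slot = toy_hash(keys[i]) & (size - 1);
        if (used[slot]) {
            return 1;
        }
        used[slot] = 1;
    }
    return 0;
}

/* Keep the standard size unless doubling it removes every collision. */
static size_t choose_table_size(const char *const *keys, size_t num_keys,
    size_t standard_size)
{
    if (has_collision(keys, num_keys, standard_size)
        && !has_collision(keys, num_keys, 2 * standard_size))
    {
        return 2 * standard_size;
    }
    return standard_size;
}
```

When doubling does not remove the collisions, the sketch falls back to the standard size, since a larger table would then cost space without removing the open addressing loop.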

compiler/switch_util.m:
	Replace the separate sets of predicates we used to have for computing
	hash maps (one for lookup switches and one for non-lookup switches)
	with a single set that works for both.

	Change this set to double the table size if this eliminates collisions.
	This requires it to decide the table size, a task previously done
	separately by each of its callers.

	One version of this set had an old bug, which caused it to effectively
	ignore the second and third string hash functions. This diff fixes it.

	There were two bugs in my previous diff: the unneeded table column
	was not being optimized away from several_soln lookup switches, and the
	lookup code for one_soln lookup switches used the wrong column offset.
	This diff fixes these too.

	Since doubling the table size requires recalculating all the hash
	values, decouple the computation of the hash values from generating
	code for each switch arm, since the latter shouldn't be done more than
	once.

	Add a note on an old problem.

compiler/ml_string_switch.m:
compiler/string_switch.m:
	Bring the code for generating code for the arms of string switches
	here from switch_util.m.

tests/hard_coded/Mmakefile:
	Fix the reason why the bugs mentioned above were not detected:
	the relevant test cases weren't enabled.

tests/hard_coded/string_hash.m:
	Update this test case to test the correspondence of the compiler's
	and the runtime's versions of not just the first hash function,
	but also the second and third.

runtime/mercury_string.h:
	Fix a typo in a comment.
2011-08-02 00:05:44 +00:00
Julien Fischer
8af00f7a2a Avoid using the __GNUC__ macro in the runtime as a test for the presence of
Branches: main, 11.07

Avoid using the __GNUC__ macro in the runtime as a test for the presence of
gcc, since clang also defines that macro.  Since clang doesn't support all
of the GNU C extensions, we can't actually use __GNUC__ without also checking
whether we are actually using clang.

runtime/mercury_conf_param.h:
	Add three new macros, MR_CLANG, MR_GNUC and MR_MSVC that are defined
	only when the C compiler is clang, gcc, or Visual C respectively.
	(In particular, MR_GNUC will _not_ be defined when the C compiler
	is clang.)
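
A minimal sketch of this kind of compiler detection, in C. The `SKETCH_*` macros and `compiler_name` are hypothetical; the actual definitions of MR_CLANG, MR_GNUC and MR_MSVC are not shown in this log and may differ. The point is the ordering: clang is tested first, so the gcc macro is never defined for it.

```c
/* Test for clang first, because clang also defines __GNUC__. */
#if defined(__clang__)
  #define SKETCH_CLANG 1
#elif defined(__GNUC__)
  #define SKETCH_GNUC 1
#elif defined(_MSC_VER)
  #define SKETCH_MSVC 1
#endif

/* Report which of the mutually exclusive macros was selected. */
static const char *compiler_name(void)
{
#if defined(SKETCH_CLANG)
    return "clang";
#elif defined(SKETCH_GNUC)
    return "gcc";
#elif defined(SKETCH_MSVC)
    return "msvc";
#else
    return "unknown";
#endif
}
```

Code that relies on a GNU C extension clang lacks would then test `SKETCH_GNUC` instead of `__GNUC__`.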

runtime/mercury.c:
runtime/mercury.h:
runtime/mercury_atomic_ops.c:
runtime/mercury_atomic_ops.h:
runtime/mercury_bitmap.h:
runtime/mercury_float.h:
runtime/mercury_getopt.c:
runtime/mercury_goto.h:
runtime/mercury_heap.h:
runtime/mercury_std.h:
	Replace uses of the __GNUC__ and __clang__ macros with the above.

runtime/mercury_regs.h:
	As above, also #include mercury_conf_param.h directly since
	this file is #included by some of the tests in the configure
	script.
2011-08-01 07:06:21 +00:00
Julien Fischer
2239f59b30 Fix minor problems in the runtime identified by Visual C.
Branches: main, 11.07

Fix minor problems in the runtime identified by Visual C.

runtime/mercury_memory_zones.c:
	Fix a call to a function that no longer exists.

runtime/mercury_stack_trace.h:
	Fix argument type mismatches between function prototypes
	and definitions.
2011-07-16 07:51:30 +00:00
Julien Fischer
6c44fead50 Fix another Visual C runtime compilation problem.
Branches: main, 11.07

Fix another Visual C runtime compilation problem.

runtime/mercury_heap_profile.c:
	Avoid arithmetic with void pointers.
	(That's a GNU extension.)
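
As an illustration of the portable alternative (a sketch, not the code from mercury_heap_profile.c): standard C does not define arithmetic on `void *`, so the pointer is cast to `char *` before the byte offset is added. The helper name `advance_bytes` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Advance a generic pointer by `num_bytes` using only standard C.
 * Writing `base + num_bytes` directly on a `void *` is accepted by
 * GNU C as an extension but is not valid standard C. */
static void *advance_bytes(void *base, size_t num_bytes)
{
    assert(base != NULL);
    return (char *) base + num_bytes;
}
```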
2011-07-13 01:22:54 +00:00
Julien Fischer
1c709ff8e8 Fix a runtime compilation error with Visual C.
Branches: main, 11.07

Fix a runtime compilation error with Visual C.

runtime/mercury_memory_zones.c:
	Don't interleave variable declarations and code.
	(Doing so works in GNU C or C99, but not with VC9.)
2011-07-13 00:03:30 +00:00
Julien Fischer
e81a2bb50c Fix a problem that was causing the runtime not to compile in high-level C
Branches: main, 11.07

Fix a problem that was causing the runtime not to compile in high-level C
grades with clang.

runtime/mercury_std.h:
	Define MR_STATIC_INLINE and friends for clang.

	Add an XXX regarding the use of the C99 definitions for the above.
2011-07-12 03:21:58 +00:00
Peter Wang
0ae65de577 Pack consecutive enumeration arguments in discriminated union types into a
Branches: main

Pack consecutive enumeration arguments in discriminated union types into a
single word to reduce cell sizes.  Argument packing is only enabled on C
back-ends with low-level data, and reordering arguments to improve
opportunities for packing is not yet attempted.  The RTTI implementations for
other back-ends will need to be updated, but that is best left until after any
argument reordering change.

Modules which import abstract enumeration types are notified so by writing
declarations of the form:

	:- type foo where type_is_abstract_enum(NumBits).

into the interface file for the module which defines the type.


compiler/prog_data.m:
	Add an `arg_width' argument to constructor arguments.

	Replace `is_solver_type' by `abstract_type_details', with an extra
	option for abstract exported enumeration types.

compiler/handle_options.m:
compiler/options.m:
	Add an internal option `--allow-argument-packing'.

compiler/make_hlds_passes.m:
	Determine whether and how to pack enumeration arguments, updating the
	`arg_width' fields of constructor arguments before constructors are
	added to the HLDS.

compiler/mercury_to_mercury.m:
compiler/modules.m:
	Write `where type_is_abstract_enum(NumBits)' to interface files
	for abstract exported enumeration types.

compiler/prog_io_type_defn.m:
	Parse `where type_is_abstract_enum(NumBits)' attributes on type
	definitions.

compiler/arg_pack.m:
compiler/backend_libs.m:
	Add a new module.  This mainly contains a predicate which packs rvals
	according to arg_widths, which is used by both LLDS and MLDS back-ends.

compiler/ml_unify_gen.m:
compiler/unify_gen.m:
	Take argument packing into account when generating code for
	constructions and deconstructions.  Only a relatively small part of the
	compiler actually needs to understand argument packing.  The rest works
	at the HLDS level with constructor arguments and variables, or at the
	LLDS and MLDS levels with structure fields.

compiler/code_info.m:
compiler/var_locn.m:
	Add assign_field_lval_expr_to_var and
	var_locn_assign_field_lval_expr_to_var.

	Allow more kinds of rvals in assign_cell_arg.  I do not know why it was
	previously restricted, except that the other kinds of rvals were not
	encountered as cell arguments before.

compiler/mlds.m:
	We can now rely on the compiler to pack arguments in the
	mlds_decl_flags type instead of doing it manually.  A slight downside
	is that though the type is packed down to a single word cell, it will
	still incur a memory allocation per cell.  However, I did not notice
	any difference in compiler speed.

compiler/rtti.m:
compiler/rtti_out.m:
	Add and output a new field for MR_DuFunctorDesc instances, which, if
	any arguments are packed, points to an array of MR_DuArgLocn.  Each
	array element describes the offset in the cell at which the argument's
	value is held, and which bits of the word it occupies.  In the more
	common case where no arguments are packed, the new field is simply
	null.
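
How an offset plus a bit range can locate a packed argument is sketched below in C. The `arg_locn` struct and its field names are hypothetical analogues of MR_DuArgLocn, and treating `bits == 0` as "the argument fills the whole word" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical analogue of an argument-location record. */
typedef struct {
    int offset;   /* index of the word within the cell */
    int shift;    /* position of the argument's lowest bit in that word */
    int bits;     /* width in bits; 0 means the whole word (assumption) */
} arg_locn;

/* Extract the argument described by `locn` from the cell. */
static uintptr_t unpack_arg(const uintptr_t *cell, arg_locn locn)
{
    uintptr_t word = cell[locn.offset];

    if (locn.bits == 0) {
        return word;
    }
    assert(locn.bits > 0 && locn.bits < (int) (sizeof(uintptr_t) * 8));
    return (word >> locn.shift) & (((uintptr_t) 1 << locn.bits) - 1);
}

/* Store `value` into the bits described by `locn`, leaving the other
 * arguments packed into the same word untouched. */
static void pack_arg(uintptr_t *cell, arg_locn locn, uintptr_t value)
{
    uintptr_t mask;

    if (locn.bits == 0) {
        cell[locn.offset] = value;
        return;
    }
    assert(locn.bits > 0 && locn.bits < (int) (sizeof(uintptr_t) * 8));
    mask = (((uintptr_t) 1 << locn.bits) - 1) << locn.shift;
    cell[locn.offset] = (cell[locn.offset] & ~mask)
        | ((value << locn.shift) & mask);
}
```

For example, a 2-bit enumeration at shift 0 and a 3-bit enumeration at shift 2 share word 0 of the cell, while an unpacked argument occupies word 1 on its own.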

compiler/rtti_to_mlds.m:
	Generate the new field to MR_DuFunctorDesc.

compiler/structure_reuse.direct.choose_reuse.m:
	For now, prevent structure reuse reusing a dead cell which has a
	different constructor to the new cell.  The code to determine whether a
	dead cell will hold the arguments of a new cell with a different
	constructor will need to be updated to account for argument packing.

compiler/type_ctor_info.m:
	Bump RTTI version number.

	Conform to changes.

compiler/add_type.m:
compiler/check_typeclass.m:
compiler/equiv_type.m:
compiler/equiv_type_hlds.m:
compiler/erl_rtti.m:
compiler/hlds_data.m:
compiler/hlds_out_module.m:
compiler/intermod.m:
compiler/make_tags.m:
compiler/mlds_to_gcc.m:
compiler/opt_debug.m:
compiler/prog_type.m:
compiler/recompilation.check.m:
compiler/recompilation.version.m:
compiler/special_pred.m:
compiler/type_constraints.m:
compiler/type_util.m:
compiler/unify_proc.m:
compiler/xml_documentation.m:
	Conform to changes.

	Reduce code duplication in classify_type_defn.

compiler/hlds_goal.m:
	Clarify a comment.

library/construct.m:
	Make `construct' pack arguments when necessary.

	Remove an old RTTI version number check as recommended in
	mercury_grade.h.

library/store.m:
	Deal with packed arguments in this module.

runtime/mercury_grade.h:
	Bump binary compatibility version number.

runtime/mercury_type_info.c:
runtime/mercury_type_info.h:
	Bump RTTI version number.

	Add MR_DuArgLocn structure definition.

	Add a macro to unpack an argument as described by MR_DuArgLocn.

	Add a function to determine a cell's size, since the number of
	arguments is no longer correct.

runtime/mercury_deconstruct.c:
runtime/mercury_deconstruct.h:
runtime/mercury_deconstruct_macros.h:
runtime/mercury_ml_arg_body.h:
runtime/mercury_ml_expand_body.h:
	Deal with packed arguments when deconstructing.

	Remove an old RTTI version number check as recommended in
	mercury_grade.h.

runtime/mercury_deep_copy_body.h:
	Deal with packed arguments when copying.

runtime/mercury_table_type_body.h:
	Deal with packed arguments in tabling.

runtime/mercury_dotnet.cs.in:
	Add DuArgLocn field to DuFunctorDesc. Argument packing is not enabled
	for the C# back-end yet so this is unused.

trace/mercury_trace_vars.c:
	Deal with packed arguments in MR_select_specified_subterm,
	used for the `hold' command.

java/runtime/DuArgLocn.java:
java/runtime/DuFunctorDesc.java:
	Add DuArgLocn field to DuFunctorDesc. Argument packing is not enabled
	for the Java back-end yet so this is unused.

extras/trailed_update/tr_store.m:
	Deal with packed arguments in this module (untested).

extras/trailed_update/samples/interpreter.m:
extras/trailed_update/tr_array.m:
	Conform to argument reordering in the array, map and other modules in
	previous changes.

tests/hard_coded/Mercury.options:
tests/hard_coded/Mmakefile:
tests/hard_coded/lco_pack_args.exp:
tests/hard_coded/lco_pack_args.m:
tests/hard_coded/pack_args.exp:
tests/hard_coded/pack_args.m:
tests/hard_coded/pack_args_copy.exp:
tests/hard_coded/pack_args_copy.m:
tests/hard_coded/pack_args_intermod1.exp:
tests/hard_coded/pack_args_intermod1.m:
tests/hard_coded/pack_args_intermod2.m:
tests/hard_coded/pack_args_reuse.exp:
tests/hard_coded/pack_args_reuse.m:
tests/hard_coded/store_ref.exp:
tests/hard_coded/store_ref.m:
tests/invalid/Mmakefile:
tests/invalid/where_abstract_enum.err_exp:
tests/invalid/where_abstract_enum.m:
tests/tabling/Mmakefile:
tests/tabling/pack_args_memo.exp:
tests/tabling/pack_args_memo.m:
	Add new test cases.

tests/hard_coded/deconstruct_arg.exp:
tests/hard_coded/deconstruct_arg.exp2:
tests/hard_coded/deconstruct_arg.m:
	Add constructors with packed arguments to these cases.

tests/invalid/where_direct_arg.err_exp:
	Update expected output.
2011-07-05 03:34:39 +00:00
Peter Wang
2209caecea The direct argument functor change added the constant MR_SECTAG_NONE_DIRECT_ARG
Branches: main

The direct argument functor change added the constant MR_SECTAG_NONE_DIRECT_ARG
in some places but not others, breaking deconstruct on C# and Java back-ends.

compiler/mlds_to_gcc.m:
java/runtime/Sectag_Locn.java:
library/rtti_implementation.m:
runtime/mercury_dotnet.cs.in:
	Add missing constants.
2011-06-27 06:40:36 +00:00
Paul Bone
491b089085 Fix some ThreadScope issues.
Firstly, this change allows the ThreadScope tool to read Mercury's .eventlog
files without aborting.  This is fixed by making THREAD_START and THREAD_STOP
events consistent.

Secondly, this change implements the missing EVENT_SLEEPING event.  This
ensures that the implementation matches the description in the ThreadScope
paper.

Thirdly, the idle engines try to run a suspended context before running a
spark.

runtime/mercury_threadscope.c:
    Don't post THREAD_START or THREAD_STOP events if it wouldn't make sense,
    ie: the thread is already stopped.  We do this to make RTS code simpler
    since an engine may hang on to a context even when that context is stopped.
    The RTS uses this for caching.

    Create a new event ENGINE_SLEEPING to be used when an engine goes to sleep.

runtime/mercury_context.c:
    Add some missing calls to threadscope, this ensures that Mercury's eventlog file
    maintains some invariants expected by the ThreadScope visualisation tool.

    Modify how idle engines look for new work: now, in all cases, an idle
    engine will attempt to resume a context first.

    Avoid taking the lock to the global run queue of contexts if the runqueue
    pointer is NULL indicating that the queue is empty.
2011-06-23 08:13:50 +00:00
Peter Wang
12281f3419 Implement a type representation optimisation ("direct argument functors"),
Branches: main

Implement a type representation optimisation ("direct argument functors"),
where a functor with exactly one argument can be represented by a tagged
pointer to the argument value, which itself does not require the tag bits,
e.g.

	:- type maybe_foo ---> yes(foo) ; no.
	:- type foo       ---> foo(int, int).  % aligned pointer
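
The representation can be sketched in C as below. This is an illustration under stated assumptions, not the compiler's generated code: two low tag bits, cells aligned to at least four bytes, and hypothetical helper names (`make_yes`, `tag_of`, `yes_arg`). The value for `yes(Foo)` is the pointer to the `foo` cell with a tag in its low bits, with no intermediate one-word cell.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_BITS 2   /* assumes cells are aligned to at least 4 bytes */
#define TAG_MASK (((uintptr_t) 1 << TAG_BITS) - 1)

/* Stand-in for the cell of foo(int, int). */
typedef struct {
    int first;
    int second;
} foo;

/* Build the value for yes(Foo) by tagging the argument pointer itself. */
static uintptr_t make_yes(const foo *arg, unsigned tag)
{
    uintptr_t bits = (uintptr_t) arg;

    /* The low bits must be free, which is why the argument has to be
     * an aligned pointer. */
    assert(tag <= TAG_MASK && (bits & TAG_MASK) == 0);
    return bits | tag;
}

/* Recover the tag from a tagged value. */
static unsigned tag_of(uintptr_t value)
{
    return (unsigned) (value & TAG_MASK);
}

/* Recover the argument by masking off the tag bits. */
static const foo *yes_arg(uintptr_t value)
{
    return (const foo *) (value & ~TAG_MASK);
}
```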

To ensure that all modules which could construct or deconstruct the functor
agree on the type representation, I had planned to automatically output
extra information to .int files to notify importing modules about functors
using the optimised representation:

	:- type maybe_foo ---> yes(foo) ; no
		where direct_arg is [yes/1].

However, the compiler does not perform enough (or any) semantic analysis
while making interface files.  The fallback solution is to only use the
optimised representation when all importing modules can be guaranteed to
import both the top-level type and the argument type, namely, when both
types are exported from the same module.  We also allow certain built-in
argument types; currently this only includes tuples.

Non-exported types may use the optimised representation, but when
intermodule optimisation is enabled, they may be written out to .opt files.
Then, we *do* add direct_arg attributes to .opt files to ensure that importing
modules agree on the type representation.  The attributes may also be added by
Mercury programmers to source files, which will be copied directly into .int
files without analysis.  They will be checked when the module is actually
compiled.

This patch includes work by Zoltan, who independently implemented a version
of this change.


compiler/hlds_data.m:
	Record the direct arg functors in hlds_du_type.

	Add a new option to cons_tag.

	Fix some comments.

compiler/prog_data.m:
compiler/prog_io_type_defn.m:
	Parse and record `direct_arg' attributes on type definitions.

compiler/prog_io_pragma.m:
	Issue an error if the `direct_arg' attribute is used with a foreign
	type.

compiler/make_tags.m:
compiler/mercury_compile_front_end.m:
	Add a pass to convert suitable functors to use the direct argument
	representation.  The argument type must have been added to the type
	table, so we do this after all type definitions have been added.

	Move code to compute cheaper_tag_test here.

compiler/ml_unify_gen.m:
compiler/unify_gen.m:
	Generate different code to construct/deconstruct direct argument
	functors.

compiler/intermod.m:
	Write `direct_arg' attributes to .opt files for functors
	using the direct argument representation.

compiler/mercury_to_mercury.m:
	Write out `direct_arg' attributes.

compiler/rtti.m:
compiler/rtti_out.m:
compiler/rtti_to_mlds.m:
	Add an option to the types which describe the location of secondary
	tag options. The functors which can use the optimised representation
	are a subset of those which require no secondary tag.

	Output "MR_SECTAG_NONE_DIRECT_ARG" instead of "MR_SECTAG_NONE" in
	RTTI structures when applicable.

compiler/add_pragma.m:
compiler/add_type.m:
compiler/bytecode_gen.m:
compiler/check_typeclass.m:
compiler/code_info.m:
compiler/equiv_type.m:
compiler/export.m:
compiler/foreign.m:
compiler/hlds_code_util.m:
compiler/hlds_out_module.m:
compiler/inst_check.m:
compiler/ml_proc_gen.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
compiler/ml_type_gen.m:
compiler/module_qual.m:
compiler/modules.m:
compiler/post_term_analysis.m:
compiler/post_typecheck.m:
compiler/recompilation.check.m:
compiler/recompilation.usage.m:
compiler/recompilation.version.m:
compiler/simplify.m:
compiler/structure_reuse.direct.choose_reuse.m:
compiler/switch_gen.m:
compiler/switch_util.m:
compiler/tag_switch.m:
compiler/term_norm.m:
compiler/type_ctor_info.m:
compiler/type_util.m:
compiler/unify_proc.m:
compiler/unused_imports.m:
compiler/xml_documentation.m:
	Conform to changes.

	Bump RTTI version number.

doc/reference_manual.texi:
	Add commented out documentation for `direct_arg' attributes.

library/construct.m:
	Handle MR_SECTAG_NONE_DIRECT_ARG in construct.construct/3.

library/private_builtin.m:
	Add MR_SECTAG_NONE_DIRECT_ARG constant for Java for consistency,
	though it won't be used.

runtime/mercury_grade.h:
	Bump binary compatibility version number.

runtime/mercury_type_info.h:
	Bump RTTI version number.

	Add MR_SECTAG_NONE_DIRECT_ARG.

runtime/mercury_deconstruct.c:
runtime/mercury_deep_copy_body.h:
runtime/mercury_ml_expand_body.h:
runtime/mercury_table_type_body.h:
runtime/mercury_term_size.c:
runtime/mercury_unify_compare_body.h:
	Handle MR_SECTAG_NONE_DIRECT_ARG in RTTI code.

tests/debugger/Mmakefile:
tests/debugger/chooser_tag_test.exp:
tests/debugger/chooser_tag_test.inp:
tests/debugger/chooser_tag_test.m:
tests/hard_coded/Mercury.options:
tests/hard_coded/Mmakefile:
tests/hard_coded/construct_test.exp:
tests/hard_coded/construct_test.m:
tests/hard_coded/direct_arg_cyclic1.exp:
tests/hard_coded/direct_arg_cyclic1.m:
tests/hard_coded/direct_arg_cyclic2.m:
tests/hard_coded/direct_arg_cyclic3.m:
tests/hard_coded/direct_arg_intermod1.exp:
tests/hard_coded/direct_arg_intermod1.m:
tests/hard_coded/direct_arg_intermod2.m:
tests/hard_coded/direct_arg_intermod3.m:
tests/hard_coded/direct_arg_parent.exp:
tests/hard_coded/direct_arg_parent.m:
tests/hard_coded/direct_arg_sub.m:
tests/invalid/Mmakefile:
tests/invalid/where_direct_arg.err_exp:
tests/invalid/where_direct_arg.m:
tests/invalid/where_direct_arg2.err_exp:
tests/invalid/where_direct_arg2.m:
	Add test cases.

tests/invalid/ee_invalid.err_exp:
	Update expected output.
2011-06-16 06:42:19 +00:00
Paul Bone
0365571027 In ThreadScope grades each context has a unique ID. Previously when a context
was re-used (as opposed to created from scratch) we would re-assign its ID, so
that it was clear to see when a new computation was started.  This is no longer
necessary and prevents anyone using ThreadScope from understanding how contexts
are re-used.

This change also adds a new ThreadScope event that marks when a context is
released back to the free context pool.

runtime/mercury_context.c:
    Only allocate new context IDs for new contexts (not re-used contexts).

    Use the new release_context event.

    Fixed spelling mistake.

runtime/mercury_threadscope.h:
runtime/mercury_threadscope.c:
    Add support for the release_context event.
2011-06-02 05:59:21 +00:00
Paul Bone
67f072901a Include the name of futures in ThreadScope profiles.
runtime/mercury_threadscope.h:
runtime/mercury_threadscope.c:
    Add a second parameter for the NEW_FUTURE event. The parameter is the id of
    the string that holds the future's name.

runtime/mercury_par_builtin.h:
    In threadscope grades use a two-args version of the new_future macro.

library/par_builtin.m:
    Conform to changes in mercury_par_builtin.h, new_future now takes two
    arguments.

compiler/dep_par_conj.m:
    Create a name variable for each future and pass it as a second parameter to
    calls to new_future.

    Thread a threadscope string table throughout this transformation so that
    strings for variables can be collected.

compiler/hlds_module.m:
    Add a threadscope string table to the module_info structure.

compiler/global_data.m:
    global_data_init now takes the threadscope string table and its size as
    parameters.  This is necessary because the table may be non-empty before
    the LLDS transformation begins.

compiler/mercury_compile_llds_back_end.m:
    Conform to changes in global_data.m

mdbcomp/program_representation.m:
    Disable the polymorphism transformation for new_future/2 rather than the
    old new_future/1.
2011-05-31 03:14:21 +00:00
Paul Bone
987d2e31e3 Fix ThreadScope support since my recent work stealing changes.
runtime/mercury_threadscope.h:
runtime/mercury_threadscope.c:
    Fix some compilation problems.

    Rename stop conjunction and stop conjunct events to use the word "end"
    rather than "stop".  The meaning is clearer and the name matches that used
    in the threadscope paper.

runtime/mercury_context.h:
runtime/mercury_context.c:
    Re-order some operations in the idle loop: try to resume an earlier
    context before working on a local spark, this may lead to less blocking.

    The RUN_CONTEXT event was posted from the load_context macro.  Change
    this to post the RUN_CONTEXT event explicitly.

    Fix some over-long lines.

    Conform to changes in mercury_threadscope.h.

runtime/mercury_thread.c:
    Add an explicit call to post the RUN_CONTEXT event.

compiler/layout_out.m:
    Add a missing output_layout_array_name call when writing out the
    threadscope string table array.

compiler/par_conj_gen.m:
    Conform to changes in runtime/mercury_threadscope.h
2011-05-24 04:16:48 +00:00
Julien Fischer
dc79a9a412 Fix a problem that was causing the namespace check to fail.
Branches: main

Fix a problem that was causing the namespace check to fail.

runtime/mercury_heap_profile.h:
	Make sure that MR_STATIC_CODE_CONST is defined when doing
	the namespace check.

	Fix some formatting issues.
2011-05-21 13:59:07 +00:00
Peter Wang
7e26b55e74 Implement a new form of memory profiling, which tells the user what memory
Branches: main

Implement a new form of memory profiling, which tells the user what memory
is being retained during a program run.  This is done by allocating an extra
word before each cell, which is used to "attribute" the cell to an
allocation site.  The attribution, or "allocation id", is an address to an
MR_AllocSiteInfo structure generated by the Mercury compiler, giving the
procedure, filename and line number of the allocation, and the type
constructor and arity of the cell that it allocates.
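
The extra-word technique can be sketched in C as follows. This sketch uses plain `malloc` in place of the Boehm collector, and the `alloc_site_info` fields and helper names are hypothetical; the real MR_AllocSiteInfo layout is not shown in this log.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical analogue of an allocation site record. */
typedef struct {
    const char *proc_name;
    const char *file_name;
    int line_number;
    const char *type_name;
} alloc_site_info;

typedef void *word_t;

/* Allocate `num_words` words plus one hidden word in front of the cell
 * that records which allocation site the cell belongs to. */
static word_t *alloc_attributed(size_t num_words, const alloc_site_info *site)
{
    word_t *block = malloc((num_words + 1) * sizeof(word_t));

    if (block == NULL) {
        return NULL;
    }
    block[0] = (word_t) site;
    return block + 1;
}

/* Read the attribution back from the word just before the cell. */
static const alloc_site_info *site_of(const word_t *cell)
{
    assert(cell != NULL);
    return (const alloc_site_info *) cell[-1];
}

/* Release a cell obtained from alloc_attributed. */
static void free_attributed(word_t *cell)
{
    free(cell - 1);
}
```

A profiler that can enumerate live cells could then group them by `site_of(cell)` to produce a per-site summary.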

The user must manually instrument the program with calls to
`benchmarking.report_memory_attribution', which forces a GC and summarises
the live objects on the heap using the attributions.  The mprof tool is
extended with a new mode to parse and present that data.

Objects which are unattributed (e.g. by hand-written C code which hasn't
been updated) are still accounted for, but show up in profiles as "unknown".

Currently this profiling mode only works in conjunction with the Boehm
garbage collector, though in principle it can work with any memory allocator
for which we can access a list of the live objects.  Since term size
profiling relies on the same technique of using an extra word per memory
cell, the two profiling modes are incompatible.

The output from `mprof -s' looks like this:

------ [1] some label ------
   cells            words         cumul  procedure / type (location)
   14150            38872                total

*   1949/ 13.8%      4872/ 12.5%  12.5%  <predicate `parser.parse_rest/7' mode 0>
     975/  6.9%      1950/  5.0%         list.list/1 (parser.m:502)
     487/  3.4%      1948/  5.0%         term.term/1 (parser.m:501)
     487/  3.4%       974/  2.5%         term.const/0 (parser.m:501)

*   1424/ 10.1%      4272/ 11.0%  23.5%  <predicate `parser.parse_simple_term_2/6' mode 0>
     708/  5.0%      2832/  7.3%         term.term/1 (parser.m:643)
     708/  5.0%      1416/  3.6%         term.const/0 (parser.m:643)
...


boehm_gc/alloc.c:
boehm_gc/include/gc.h:
boehm_gc/misc.c:
boehm_gc/reclaim.c:
	Add a callback function to be called for every live object after a GC.

	Add a function to write out the GC_size_map array.

compiler/layout.m:
	Define the alloc_site_info type which is equivalent to the
	MR_AllocSiteInfo C structure.

	Add alloc_site_array as a kind of "layout" array.

compiler/llds.m:
	Add allocation sites to `cfile' structure.

	Replace TypeMsg argument (which was also for profiling) on `incr_hp'
	instructions by an allocation site identifier.

	Add a new foreign_proc_component for allocation site ids.

compiler/code_info.m:
compiler/global_data.m:
compiler/proc_gen.m:
	Keep the set of allocation sites in the code_info and global_data
	structures.

compiler/unify_gen.m:
	Add allocation sites to LLDS allocation instructions.

compiler/layout_out.m:
compiler/llds_out_file.m:
compiler/llds_out_instr.m:
	Output MR_AllocSiteInfo arrays in generated C files.

	Output code to register the MR_AllocSiteInfo array with the Mercury
	runtime.

	Output allocation site ids for memory allocation instructions.

compiler/llds_out_util.m:
	Add allocation sites to llds_out_info.

compiler/pragma_c_gen.m:
compiler/ml_foreign_proc_gen.m:
	Generate a macro MR_ALLOC_ID which resolves to an allocation site
	structure, for every foreign_proc whose C code contains the string
	"MR_ALLOC_ID".  This is to be used by hand-written C code which
	allocates memory.

	MR_PROC_LABELs are retained for backwards compatibility.  Though
	they were introduced for profiling, they seem to have been co-opted
	for printf-debugging since then.

compiler/ml_global_data.m:
	Add allocation site structures to the MLDS global data.

compiler/mlds.m:
compiler/ml_unify_gen.m:
	Add allocation site id to `new_object' instruction.

compiler/mlds_to_c.m:
	Output allocation site arrays and allocation ids in high-level C code.

	Output a call to register the allocation site array with the Mercury
	runtime.

	Delete an unused predicate.

compiler/exprn_aux.m:
compiler/jumpopt.m:
compiler/livemap.m:
compiler/mercury_compile_llds_back_end.m:
compiler/middle_rec.m:
compiler/ml_accurate_gc.m:
compiler/ml_elim_nested.m:
compiler/ml_optimize.m:
compiler/ml_util.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
compiler/mlds_to_managed.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/use_local_vars.m:
compiler/var_locn.m:
	Conform to changes.

compiler/pickle.m:
compiler/prog_event.m:
compiler/timestamp.m:
	Conform to changes in memory allocation macros.

library/benchmarking.m:
	Add the `report_memory_attribution' instrumentation predicates.

	Conform to changes to MR_memprof_record.

library/array.m:
library/bit_buffer.m:
library/bitmap.m:
library/construct.m:
library/deconstruct.m:
library/dir.m:
library/io.m:
library/mutvar.m:
library/store.m:
library/string.m:
library/thread.semaphore.m:
library/version_array.m:
	Use attributed memory allocation throughout the standard library so
	that objects don't show up in the memory profile as "unknown".

	Replace MR_PROC_LABEL by MR_ALLOC_ID.

mdbcomp/program_representation.m:
mdbcomp/rtti_access.m:
	Replace MR_PROC_LABEL by MR_ALLOC_ID.

profiler/Mercury.options:
profiler/globals.m:
profiler/mercury_profile.m:
profiler/options.m:
profiler/output.m:
profiler/snapshots.m:
	Add a new mode to `mprof' to parse and present the data from
	`Prof.Snapshots' files.

	Add options for the new profiling mode.

profiler/process_file.m:
	Fix a typo.

runtime/mercury_conf_param.h:
	#define MR_MPROF_PROFILE_MEMORY_ATTRIBUTION if memory profiling
	is enabled and we are using Boehm GC.

runtime/mercury.h:
	Make MR_new_object take an allocation id argument.

	Conform to changes in memory allocation macros.

runtime/mercury_memory.c:
runtime/mercury_memory.h:
runtime/mercury_types.h:
	Define MR_AllocSiteInfo.

	Add memory allocation functions and macros which take into account
	the additional word necessary for the new profiling mode.
	These should be used in preference to the raw memory allocation
	functions wherever possible so that objects do not show up in the
	profile as "unknown".

	Add analogues of realloc/free which take into account the offset
	introduced by the attribution word.

	Add function versions of the MR_new_object macros, which can't be
	written in standard C.  They are only used when necessary.

	Add built-in allocation site ids, to be used in the runtime and
	other hand-written code when context-specific ids are unavailable.
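
The attribution word described in this entry can be sketched as follows. This is a minimal illustration using plain malloc/realloc/free, not the runtime's actual code: the `alloc_site` type and the `attributed_*` function names are hypothetical, and the real runtime allocates through Boehm GC rather than malloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an allocation site record
   (not the real MR_AllocSiteInfo layout). */
typedef struct {
    const char *site_name;
} alloc_site;

/* One machine word is reserved in front of every attributed object. */
typedef uintptr_t word;

/* Allocate `size` bytes plus one attribution word, record the site in
   that word, and return the displaced pointer the caller uses as the
   object. The result is word-aligned, which is all this sketch needs. */
static void *attributed_malloc(const alloc_site *site, size_t size)
{
    word *base = malloc(sizeof(word) + size);
    if (base == NULL) {
        return NULL;
    }
    base[0] = (word) site;
    return base + 1;
}

/* Recover the allocation site from an object pointer. */
static const alloc_site *attributed_site(const void *obj)
{
    assert(obj != NULL);
    return (const alloc_site *) ((const word *) obj)[-1];
}

/* realloc analogue: step back to the real base, resize, step forward.
   The attribution word is copied along with the object contents. */
static void *attributed_realloc(void *obj, size_t new_size)
{
    word *base;

    assert(obj != NULL);
    base = realloc((word *) obj - 1, sizeof(word) + new_size);
    return (base == NULL) ? NULL : base + 1;
}

/* free analogue: undo the one-word displacement before freeing. */
static void attributed_free(void *obj)
{
    if (obj != NULL) {
        free((word *) obj - 1);
    }
}
```

Passing a displaced pointer straight to the raw realloc or free would hand the allocator an address it never returned, which is why the analogues are needed.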

runtime/mercury_heap.h:
	Make MR_tag_offset_incr_hp_msg and MR_tag_offset_incr_hp_atomic_msg
	allocate an extra word when memory attribution is desired, and store
	the allocation id there.

	Similarly for MR_create{1,2,3}_msg.

	Replace proclabel arguments in allocation macros by alloc_id
	arguments.

	Replace MR_hp_alloc_atomic by MR_hp_alloc_atomic_msg.  It was only
	used for boxing floats.

	Conform to change to MR_new_object macro.

runtime/mercury_bootstrap.h:
	Delete obsolete macro hp_alloc_atomic.

runtime/mercury_heap_profile.c:
runtime/mercury_heap_profile.h:
	Add the code to summarise the live objects on the Boehm GC heap and
	write out the data to `Prof.Snapshots', for display by mprof.

	Don't store the procedure name in MR_memprof_record: the procedure
	address is enough and faster to compare.

runtime/mercury_prof.c:
	Finish and close the `Prof.Snapshots' file when the program
	terminates.

	Conform to changes in MR_memprof_record.

runtime/mercury_misc.h:
	Add a macro to expand to the name of the allocation sites array
	in LLDS grades.

runtime/mercury_bitmap.c:
runtime/mercury_bitmap.h:
	Pass allocation id through bitmap allocation functions.

	Delete unused function MR_string_to_bitmap.

runtime/mercury_string.h:
	Add MR_make_aligned_string_copy_msg.

	Make string allocation macros take allocation id arguments.

runtime/mercury.c:
runtime/mercury_array_macros.h:
runtime/mercury_context.c:
runtime/mercury_deconstruct.c:
runtime/mercury_deconstruct_macros.h:
runtime/mercury_dlist.c:
runtime/mercury_engine.c:
runtime/mercury_float.h:
runtime/mercury_hash_table.c:
runtime/mercury_ho_call.c:
runtime/mercury_label.c:
runtime/mercury_prof_mem.c:
runtime/mercury_stacks.c:
runtime/mercury_stm.c:
runtime/mercury_string.c:
runtime/mercury_thread.c:
runtime/mercury_trace_base.c:
runtime/mercury_trail.c:
runtime/mercury_type_desc.c:
runtime/mercury_type_info.c:
runtime/mercury_wsdeque.c:
	Use attributed memory allocation throughout the runtime so that
	objects don't show up in the profile as "unknown".

runtime/mercury_memory_zones.c:
	Attribute memory zones to the Mercury runtime.

runtime/mercury_tabling.c:
runtime/mercury_tabling.h:
	Use attributed memory allocation macros for tabling structures.

	Delete unused MR_table_realloc_* and MR_table_copy_bytes macros.

runtime/mercury_deep_copy_body.h:
	Try to retain the original attribution word when copying values.

runtime/mercury_ml_expand_body.h:
	Conform to changes in memory allocation macros.

runtime/mercury_tags.h:
	Replace proclabel arguments by alloc_id arguments in allocation macros.

runtime/mercury_wrapper.c:
	If memory attribution is enabled, tell Boehm GC that pointers may be
	displaced by an extra word.

trace/mercury_trace.c:
trace/mercury_trace_tables.c:
	Conform to changes in memory allocation macros.

extras/net/tcp.m:
extras/solver_types/library/any_array.m:
extras/trailed_update/tr_array.m:
	Conform to changes in memory allocation macros.

doc/user_guide.texi:
	Document the new profiling mode.

doc/reference_manual.texi:
	Update a commented out example.
2011-05-20 04:16:58 +00:00
Paul Bone
e8b8499ec4 When a program 'waits' on a future it takes the future's lock, checks whether
the future is available and, if it is, reads the value and unlocks the future.
We can avoid the locking operation in many cases by testing whether the future
is available before taking the lock.  If the future is not available, we take
the lock and re-test whether the future is available.

To make this safe we now write the future's value before writing to the field
that says it's available; an 'sfence' instruction ensures that these two writes
are stored in that order.
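
The test-before-lock scheme above can be sketched as follows. This is an illustration only, under stated assumptions: the `future` type and its functions are hypothetical, C11 release/acquire atomics stand in for the explicit 'sfence', and a pthread condition variable stands in for however the runtime actually suspends a waiting computation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical future type; field names are illustrative only. */
typedef struct {
    intptr_t        value;      /* always written before `signalled` */
    atomic_int      signalled;  /* non-zero once `value` is valid */
    pthread_mutex_t lock;
    pthread_cond_t  cond;
} future;

static void future_init(future *f)
{
    f->value = 0;
    atomic_init(&f->signalled, 0);
    pthread_mutex_init(&f->lock, NULL);
    pthread_cond_init(&f->cond, NULL);
}

/* Producer: publish the value, then the flag. The release store keeps
   the value write ordered before the flag write. */
static void future_signal(future *f, intptr_t value)
{
    pthread_mutex_lock(&f->lock);
    f->value = value;
    atomic_store_explicit(&f->signalled, 1, memory_order_release);
    pthread_cond_broadcast(&f->cond);
    pthread_mutex_unlock(&f->lock);
}

/* Consumer: test without the lock first; only if the future is not yet
   available take the lock and re-test. */
static intptr_t future_wait(future *f)
{
    assert(f != NULL);

    /* Fast path: the acquire load pairs with the release store above,
       so a non-zero flag guarantees that `value` is visible. */
    if (atomic_load_explicit(&f->signalled, memory_order_acquire)) {
        return f->value;
    }

    /* Slow path: re-test under the lock and block until signalled. */
    pthread_mutex_lock(&f->lock);
    while (!atomic_load_explicit(&f->signalled, memory_order_acquire)) {
        pthread_cond_wait(&f->cond, &f->lock);
    }
    pthread_mutex_unlock(&f->lock);
    return f->value;
}
```

The re-test under the lock is what prevents a lost wakeup: a signal that arrives between the unlocked test and the lock acquisition is seen by the second test instead of being waited for.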

runtime/mercury_par_builtin.h:
	As above.

	Also re-order the fields of the future structure, putting fut_value and
	fut_signalled next to each other so that they are more likely to be in
	the same cache line.

library/Mmakefile:
	Make par_builtin.o depend on mercury_par_builtin.h in the runtime.
2011-05-10 00:28:15 +00:00
Peter Wang
c456f0b058 Disable garbage collection during early runtime initialisation, when little or
Branches: main

Disable garbage collection during early runtime initialisation, when little or
no garbage is created anyway.

runtime/mercury_wrapper.c:
	As above.
2011-05-05 05:50:28 +00:00
Peter Wang
6063cd6fda Fix allocation when building the MERCURY_OPTIONS_progname
Branches: main, 11.01

runtime/mercury_wrapper.c:
	Fix allocation when building the MERCURY_OPTIONS_progname
	string, which was short by one byte.
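
The commit does not say what caused the one-byte shortfall. Assuming it was the common case of not reserving space for the terminating NUL, the corrected pattern looks roughly like the sketch below. `concat_name` is a hypothetical helper, not a function in mercury_wrapper.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Build "<prefix><name>" in freshly allocated memory.
   The caller owns the result and must free() it. */
static char *concat_name(const char *prefix, const char *name)
{
    size_t prefix_len;
    size_t name_len;
    char   *result;

    assert(prefix != NULL && name != NULL);
    prefix_len = strlen(prefix);
    name_len = strlen(name);

    /* The "+ 1" reserves room for the terminating NUL byte; leaving it
       out makes the buffer one byte short. */
    result = malloc(prefix_len + name_len + 1);
    if (result == NULL) {
        return NULL;
    }

    memcpy(result, prefix, prefix_len);
    /* Copy name_len + 1 bytes so that the NUL is copied as well. */
    memcpy(result + prefix_len, name, name_len + 1);
    return result;
}
```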
2011-05-03 04:13:26 +00:00
Zoltan Somogyi
f3389a7197 Remove unnecessary mechanism for managing a non-existent module
Estimated hours taken: 0.5

runtime/mercury_threadscope.[ch]:
	Remove unnecessary mechanism for managing a non-existent module
	of hand-translated-to-C Mercury code.

	Fix deviations from our programming style.
2011-05-02 07:55:04 +00:00