Commit Graph

62 Commits

Author SHA1 Message Date
Peter Wang
72f174b4e2 Don't print value of errno in MR_fatal_error.
The majority of calls to MR_fatal_error do not follow an operation that
sets errno, so printing out an error message unrelated to the reason for
the fatal error will lead to confusion. It can also cause test failures
if errno happens to be set to non-zero some time prior to an expected
call to MR_fatal_error. Fixes bug #464.

runtime/mercury_misc.c:
    Don't print value of errno in MR_fatal_error.

runtime/mercury_context.c:
runtime/mercury_thread.c:
    Pass strerror strings to MR_fatal_error where appropriate.

runtime/mercury_memory_zones.c:
runtime/mercury_memory_zones.h:
    Pass strerror strings to MR_fatal_error following failures of
    MR_protect_pages. Document that this assumes MR_protect_pages sets
    errno on error.

    Skip unnecessary call to sprintf before MR_fatal_error.

runtime/mercury_deep_profiling.c:
    Skip unnecessary call to sprintf before MR_fatal_error.

    Reduce size of some buffers.

runtime/mercury_overflow.c:
runtime/mercury_stack_trace.c:
    Pass a fixed format string to MR_fatal_error just in case
    the message string may contain percentage signs.

runtime/mercury_tabling.c:
    Skip unnecessary call to sprintf before MR_fatal_error.

deep_profiler/timeout.m:
library/thread.m:
mdbcomp/shared_utilities.m:
    Pass strerror strings to MR_fatal_error where appropriate.

trace/mercury_trace.c:
    Skip unnecessary call to sprintf before MR_fatal_error.

trace/mercury_trace_external.c:
    Pass a fixed format string to MR_fatal_error just in case.
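
A minimal sketch of the two calling patterns described above, assuming a printf-style error function. All names here (sketch_fatal_error and friends) are hypothetical stand-ins, not the runtime's code; unlike MR_fatal_error, the stand-in records the message in a buffer instead of aborting, so the effect can be inspected.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

// Hypothetical stand-in for a printf-style fatal error routine.
// It stores the formatted message rather than aborting the program.
char sketch_last_message[256];

void sketch_fatal_error(const char *fmt, ...)
{
    va_list args;

    assert(fmt != NULL);
    va_start(args, fmt);
    vsnprintf(sketch_last_message, sizeof(sketch_last_message), fmt, args);
    va_end(args);
}

// Pattern 1: the caller knows the failed operation set errno, so it
// passes the strerror string explicitly (e.g. saved_errno = errno).
void sketch_report_os_failure(const char *what, int saved_errno)
{
    sketch_fatal_error("%s: %s", what, strerror(saved_errno));
}

// Pattern 2: the message may contain percent signs, so it is passed
// as an argument to a fixed format string, never as the format itself.
void sketch_report_message(const char *msg)
{
    sketch_fatal_error("%s", msg);
}
```

With the second pattern, a message such as "stack 100% full" is reproduced verbatim instead of being interpreted as a format string.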
2018-08-19 12:19:19 +10:00
Mark Brown
d465fa53cb Update the COPYING.LIB file and references to it.
Discussion of these changes can be found on the Mercury developers
mailing list archives from June 2018.

COPYING.LIB:
    Add a special linking exception to the LGPL.

*:
    Update references to COPYING.LIB.

    Clean up some minor errors that have accumulated in copyright
    messages.
2018-06-09 17:43:12 +10:00
Julien Fischer
1fc495c33c Fix bug #357: parallel conjunction broken on OS X.
The parallel version of the runtime makes use of POSIX unnamed semaphores,
which do not exist on OS X (although annoyingly most of the relevant functions
*do* exist, they just don't do anything).  This was originally a problem for
the low-level C version of the parallel runtime, but now affects both C
backends (i.e. it's currently impossible to use threads or parallel
conjunctions on OS X).  The fix is to replace the use of POSIX unnamed
semaphores on OS X, with the semaphore implementation for libdispatch (which is
part of Grand Central Dispatch).  Note that the .par grades are (presumably)
still broken in versions of OS X prior to 10.6 -- they'll now just fail to
compile rather than fail at runtime.

configure.ac:
runtime/mercury_conf.h.in:
    Check whether libdispatch is present.

runtime/mercury_conf_param.h:
    Define a macro that says whether we want to use libdispatch; always
    define this macro on OS X.  Using libdispatch (optionally) on other systems
    is future work.

runtime/mercury_thread.{c,h}:
    Define the MercurySem type and its associated operations appropriately
    when using libdispatch.

library/Mmakefile:
    Recompile the thread and thread.semaphore if any of the runtime headers
    that define the macros they use change.
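
The approach above can be sketched as a thin semaphore wrapper that selects its implementation at compile time. This is only an illustration: the names SketchSem, sketch_sem_* and SKETCH_USE_LIBDISPATCH are hypothetical and do not reproduce the MercurySem definitions in the runtime.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical configuration macro: always defined on OS X in this sketch.
#if defined(__APPLE__)
  #define SKETCH_USE_LIBDISPATCH 1
#endif

#ifdef SKETCH_USE_LIBDISPATCH
  #include <dispatch/dispatch.h>
  typedef dispatch_semaphore_t SketchSem;
#else
  #include <semaphore.h>
  typedef sem_t SketchSem;
#endif

// Initialise a semaphore; returns 0 on success and -1 on failure.
int sketch_sem_init(SketchSem *sem, unsigned int value)
{
    assert(sem != NULL);
#ifdef SKETCH_USE_LIBDISPATCH
    *sem = dispatch_semaphore_create((long) value);
    return (*sem != NULL) ? 0 : -1;
#else
    return sem_init(sem, 0, value);
#endif
}

// Block until the semaphore can be decremented.
int sketch_sem_wait(SketchSem *sem)
{
#ifdef SKETCH_USE_LIBDISPATCH
    return (dispatch_semaphore_wait(*sem, DISPATCH_TIME_FOREVER) == 0) ? 0 : -1;
#else
    return sem_wait(sem);
#endif
}

// Increment the semaphore, waking a waiter if there is one.
int sketch_sem_post(SketchSem *sem)
{
#ifdef SKETCH_USE_LIBDISPATCH
    dispatch_semaphore_signal(*sem);
    return 0;
#else
    return sem_post(sem);
#endif
}

// Release the semaphore's resources.
void sketch_sem_destroy(SketchSem *sem)
{
#ifdef SKETCH_USE_LIBDISPATCH
    dispatch_release(*sem);
#else
    sem_destroy(sem);
#endif
}
```

Callers use only the sketch_sem_* functions, so the choice between POSIX semaphores and libdispatch stays confined to one module.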
2016-10-02 22:29:53 +11:00
Julien Fischer
74dfaa5953 Isolate all dependencies on POSIX unnamed semaphores.
Isolate all dependencies on POSIX unnamed semaphores to the runtime's
mercury_thread module.  This is a step towards fixing bug #357.

runtime/mercury_thread.{h,c}:
    Provide wrappers for the functions sem_destroy and sem_timedwait.
    The latter does not exist on OS X, so (for now) just abort if it
    is called.

runtime/mercury_context.c:
library/thread.m:
     Use the above wrappers instead of calling the sem_* functions
     directly.
2016-09-29 14:06:04 +10:00
Zoltan Somogyi
53b573692a Convert C code to use // style comments.
runtime/*.[ch]:
trace/*.[chyl]:
    As above. In some places, improve comments, e.g. by expanding contractions
    such as "we've". Add #ifndef guards against double inclusion around
    the trace/*.h files that did not already have them.

tools/*:
    Make the corresponding changes in shell scripts that generate .[ch] files
    in the runtime.

tests/*:
    Conform to a slight change in the text of a message.
2016-07-14 13:57:35 +02:00
Zoltan Somogyi
67326f16e4 Fix style issues in the runtime.
Move all .h and .c files to four-space indentation without tabs,
if they weren't there already.

Use the same vim line for all .h and .c files.

Align all backslashes at the ends of lines in macro definitions.
Align close comment signs.

In some places, fix inconsistent indentation.

Fix a bunch of comments. Add XXXs to a few of them.
2016-07-09 12:14:00 +02:00
Julien Fischer
0b69b9e753 Add a wrapper for initializing unnamed semaphores in the runtime.
runtime/mercury_thread.[ch]:
     Add a new wrapper function around sem_init: the wrapper ensures that the
     return value is always checked and isolates the direct calls to sem_init
     to within the mercury_thread module.  The latter is important since
     in order to fix bug #357 we are going to have to replace the use of unnamed
     semaphores on OS X with something else (I'm currently looking at using
     the semaphore implementation from GCD's libdispatch as a possible fix.)

runtime/mercury_context.c:
     Use MR_sem_init where appropriate.

     s/sem_t/MercurySem/ in a few spots.
2015-09-18 16:22:47 +10:00
Peter Wang
56cebb3114 Add thread.spawn_native/4 and thread.spawn/4.
Most backends already mapped Mercury threads to "native" threads in spawn/3,
but it was and remains an implementation detail.  spawn_native provides
that behaviour as a documented feature for programs which require it,
including for the low-level C backend.

While we are at it, add a `thread' handle type.  It currently holds a
thread identifier (not yet formally exported), but it may also have
other uses such as a handle for a `thread.join' predicate, or a place to
hold result values or uncaught exceptions.

library/thread.m:
	Add abstract type `thread'.

	Add can_spawn_native.

	Add spawn_native/4.  It can report failure to start a thread,
	which was missing from the spawn/3 interface.

	Add spawn/4 to match spawn_native/4, without the native thread
	requirement.

	Make ML_create_exclusive_thread wait for a success code from
	the new thread before continuing.

	Reduce accessibility levels in C# and Java helper classes.

runtime/mercury_thread.c:
	Make MR_init_thread_inner and MR_setup_engine_for_threads
	return errors instead of aborting on failure.

tests/hard_coded/Mercury.options:
tests/hard_coded/Mmakefile:
tests/hard_coded/spawn_native.exp2:
tests/hard_coded/spawn_native.exp:
tests/hard_coded/spawn_native.m:
	Add test case.

NEWS:
	Announce change.
2014-07-10 14:58:14 +10:00
Peter Wang
29f2dcf213 Support dynamic creation of Mercury engines in low-level C parallel grades.
This change allows Mercury engines (each in a separate OS thread) to be
created and destroyed dynamically in low-level C grades.

We divide Mercury engines into two types:

    "Shared" engines may execute code from any Mercury thread.
    Shared engines may steal work from other shared engines, so are also
    called work-stealing engines; we do not have shared engines that
    refrain from work-stealing.

    "Exclusive" engines execute code only for a single Mercury thread.

Only exclusive engines may be created and destroyed dynamically so far.
This assumption could be lifted when and if the need should arise.

Exclusive engines are a means for the user to map a Mercury thread directly
to an OS thread.  Calls to blocking procedures on that thread will not block
progress in arbitrary other Mercury threads.  Foreign code which depends on
the OS thread-local state is usable when called from that thread.

We do not yet allow shared engines to steal parallel work from exclusive
engines.

runtime/mercury_wrapper.c:
runtime/mercury_wrapper.h:
	Rename MR_num_threads to MR_num_ws_engines.  It counts only
	work-stealing engines.  Move comment to the header file.

	Add MR_max_engines.  The default value is arbitrary.

	Add MERCURY_OPTIONS `--max-engines' option.

	Define MR_num_ws_engines and MR_max_engines only with
	MR_LL_PARALLEL_CONJ.

runtime/mercury_context.c:
runtime/mercury_context.h:
	Rename MR_num_idle_engines to MR_num_idle_ws_engines.
	It only counts idle work-stealing engines.

	Extend MR_spark_deques to MR_max_engines length.

	Extend engine_sleep_sync_data to MR_max_engines length.

	Add function to index engine_sleep_sync_data with optional bounds
	checking.

	Replace instances of MR_num_threads by MR_num_ws_engines or
	MR_max_engines as appropriate.

	Add MR_ctxt_exclusive_engine field.

	Rename existing MR_Context fields to remove the implication that the
	engine "owns" the context.  The new exclusive_engine field does
	imply a kind of ownership, hence potential confusion.

	Rename MR_SavedOwner, too.

	Make MR_find_ready_context respect MR_ctxt_exclusive_engine.

	Make MR_schedule_context respect MR_ctxt_exclusive_engine.

	Rename MR_try_wake_an_engine to MR_try_wake_ws_engine
	and restrict it to work-stealing engines.

	Rename MR_shutdown_all_engines to MR_shutdown_ws_engines
	and restrict it to work-stealing engines.

	Make try_wake_engine and try_notify_engine decrement
	MR_num_idle_ws_engines only for shared engines.

	In MR_do_idle, make exclusive engines bypass work-stealing
	and skip to the sleep state.

	In MR_do_sleep, make exclusive engines ignore work-stealing advice
	and abort the program if told to shut down.

	Assert that a context with an exclusive_engine really is only loaded
	by that engine.

	In MR_fork_new_child, make exclusive engines not attempt to wake
	work-stealing engines.  Its sparks cannot be stolen anyway.

	Make do_work_steal fail the attempt for exclusive engines.
	There is one call where this might happen.

	Add notes to MR_attempt_steal_spark.  Its behaviour is unchanged.

	Replace a call to MR_destroy_thread by MR_finalize_thread_engine.

	Delete MR_num_exited_engines.  It was unused.

runtime/mercury_thread.c:
runtime/mercury_thread.h:
	Delete MR_next_engine_id and MR_next_engine_id_lock.  We can no longer
	allocate engine ids by incrementing a counter.  Engine ids need to be
	reused as they act as indices into fixed-sized arrays.

	Extend MR_all_engine_bases to MR_max_engines entries.

	Add MR_all_engine_bases_lock to protect MR_all_engine_bases.

	Add MR_highest_engine_id.

	Add MR_EngineType with the two options described.

	Split the main part of MR_init_engine into a new function which
	accepts an engine type.  MR_init_engine is used by generated code so
	maintain the interface.

	Factor out setup/shutdown for thread support.

	Make MR_finalize_thread_engine call the shutdown function.

	Specialise MR_create_thread into MR_create_worksteal_thread.
	The generic form was unused.

	Move thread pinning into MR_create_worksteal_thread as other threads
	do not require it.

	Delete MR_destroy_thread.  Its one caller can use
	MR_finalize_thread_engine.

	Delete declaration for non-existent variable
	MR_init_engine_array_lock.

runtime/mercury_engine.c:
runtime/mercury_engine.h:
	Add MR_eng_type field.

	Make MR_eng_spark_deque a pointer to separately-allocated memory.
	The reason is given in MR_attempt_steal_spark.

	Add MR_ENGINE_ID_NONE, a dummy value for MR_ctxt_exclusive_engine.

	Delete MR_eng_owner_thread which was obsoleted by engine ids
	before.

	Delete misplaced declaration of MR_all_engine_bases.

runtime/mercury_memory_zones.c:
	Replace MR_num_threads by appropriate counters (I hope).

runtime/mercury_memory_handlers.c:
runtime/mercury_par_builtin.h:
	Conform to changes.

runtime/mercury_threadscope.c:
	Conform to renaming (but it might be wrong).

library/thread.m:
	Add hidden predicate `spawn_native' for testing.
	The interface is subject to change.

	Share much of the code with the high-level C backend.

library/par_builtin.m:
	Delete `num_os_threads' as it is unused.

doc/user_guide.texi:
	Document MERCURY_OPTIONS `--max-engines' option.
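
The note above about engine ids needing to be reused can be illustrated with a slot allocator over a fixed-size array. This is a single-threaded sketch with hypothetical names; the runtime protects its array with a lock (MR_all_engine_bases_lock), which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical fixed upper bound on the number of engines.
#define SKETCH_MAX_ENGINES 4

// One slot per possible engine; NULL means the id is free.
void *sketch_engine_bases[SKETCH_MAX_ENGINES];
int sketch_highest_engine_id = -1;

// Claim the lowest free engine id, or return -1 if all are in use.
// Because ids index fixed-size arrays, freed ids must be handed out again
// rather than allocating ever-increasing values from a counter.
int sketch_claim_engine_id(void *engine)
{
    assert(engine != NULL);
    for (int id = 0; id < SKETCH_MAX_ENGINES; id++) {
        if (sketch_engine_bases[id] == NULL) {
            sketch_engine_bases[id] = engine;
            if (id > sketch_highest_engine_id) {
                sketch_highest_engine_id = id;
            }
            return id;
        }
    }
    return -1;
}

// Make an engine id available for reuse.
void sketch_release_engine_id(int id)
{
    assert(id >= 0 && id < SKETCH_MAX_ENGINES);
    sketch_engine_bases[id] = NULL;
}
```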
2014-07-10 14:57:48 +10:00
Julien Fischer
4987cd686e Initial support for .par grades with MinGW64.
The pthreads-win32 library has worked with 64-bit compilers since version 2.9.
This diff adds the initial support for the .par grades with MinGW64 and
pthreads-win32.

configure.ac:
	Set C compiler and linker flags for x86_64-w64-mingw32.
	(This is provisional; on my system the library appears
	to have a different name, but I'm not sure how normal
	this is.)

runtime/mercury_thread.h:
	Adjust the definition of the MR_SELF_THREAD_ID macro
	so that the integer it expands to is at least as big
	as a pointer.  (Needed for pthreads-win32, where thread
	ids are pointer values, not integers.)

runtime/mercury_thread.c:
	Avoid warnings in some debugging code.
2013-04-05 17:12:09 +11:00
Paul Bone
e6577cfa5d ThreadScope support improvements.
Provide a new event for context re-use rather than creation.  This event
is true to Mercury's behaviour; the existing threadscope events were
not.

Bring Mercury's usage of the create context event into line with
ThreadScope's expectations.

mercury_threadscope.[ch]:
    Add a new event for when a context is re-used (and its id is
    re-assigned).  This is like the create context event except that the
    storage came from a previously used context.

mercury_context.c:
    Post the reuse context event when a context is re-used from the
    free list.

    Post reuse context when a context that an engine already has is
    re-used for a stolen spark.  XXX: Check locally allocated contexts.

    A result of these changes is that the create context message is used
    even when a context is created to evaluate sparks.  This is
    deliberate: Some of ThreadScope's analyses require this.

mercury_thread.c:
mercury_context.c:
    Place the create context event in MR_create_context rather than
    after MR_create_context returns.

mercury_par_builtin.h:
    Fixed the order of some type qualifiers.  volatile was incorrectly
    referring to the pointer's target and not the pointer.
2012-06-19 11:08:16 +00:00
Peter Wang
e6cfcc53fc Fix a memory leak in thread.spawn in high-level C grades.
Branches: main

Fix a memory leak in thread.spawn in high-level C grades.

library/thread.m:
	Call MR_finalize_thread_engine after finishing the thread goal in
	ML_thread_wrapper.

runtime/mercury_thread.c:
	Call MR_destroy_engine in MR_finalize_thread_engine.
	(The XXX is from year 2000 -- a lot of things have changed since then.
	If the problem is still present, we should fix it.)
2011-11-28 05:14:26 +00:00
Paul Bone
a071eaba53 Improve thread pinning:
    + Now pins threads intelligently on SMT systems by balancing threads among
      cores.
    + Performs fewer migrations when pinning threads (if a thread's current
      CPU is a valid CPU for pinning, then it is not migrated).
    + Handles cases where the user requests more threads than available CPUs.
    + Handles cases where the process is restricted to a subset of CPUs by its
      environment (for instance, Linux cpuset(7)).

This is largely made possible by the hwloc library
(http://www.open-mpi.org/projects/hwloc/).  However, hwloc is not required:
the runtime system will fall back to sched_setaffinity(), in which case it
will simply be less intelligent with respect to SMT.

runtime/mercury_context.h:
runtime/mercury_context.c:
    Do thread pinning either via hwloc or sched_setaffinity.  Previously only
    sched_setaffinity was used.

    Update the thread-pinning algorithm as described above.

    Include the general thread pinning code only if MR_HAVE_THREAD_PINNING is
    defined.

    Use a combination of sysconf and sched_getaffinity to detect the number of
    processors when hwloc isn't available.  This makes the runtime compatible
    with Linux cpuset(7) when hwloc isn't available.

configure.in:
Mmake.common.in:
    Detect presence of the hwloc library.

configure.in:
    Detect sched_getaffinity()

aclocal.m4:
acinclude.m4:
    Move aclocal.m4 to acinclude.m4, the aclocal program will build aclocal.m4
    and retrieve macros from the system and the contents of acinclude.m4.

Mmakefile:
    Create a make target for aclocal.m4.

runtime/Mmakefile:
    Link the runtime with libhwloc in low-level C parallel grades.

    Include CFLAGS for libhwloc.

scripts/ml.in:
    Link programs and libraries with libhwloc in low-level C parallel grades.

runtime/mercury_conf.h.in:
    Define MR_HAVE_HWLOC when it is available.

    Define MR_HAVE_SCHED_GETAFFINITY when it is available.

runtime/mercury_conf_param.h:
    Define MR_HAVE_THREAD_PINNING if either hwloc or [sched_setaffinity and
    sched_getaffinity] are available.

runtime/mercury_thread.c:
runtime/mercury_wrapper.c:
    Only call MR_pin_thread and MR_pin_primordial_thread if
    MR_HAVE_THREAD_PINNING is defined.

runtime/mercury_thread.h:
runtime/mercury_context.h:
    Move the declaration of MR_pin_primordial_thread to mercury_context.h from
    mercury_thread.h since its definition is in mercury_context.c.

    Require MR_HAVE_THREAD_PINNING for the declaration of
    MR_pin_primordial_thread.

runtime/mercury_wrapper.c:
    Conform to changes in mercury_context.h

INSTALL_CVS:
tools/test_mercury:
    Run aclocal at the right times while testing Mercury.
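
The idea of balancing threads among cores on an SMT system can be sketched as a pure mapping from thread number to CPU number. This is an illustration only, not the runtime's algorithm: it assumes a hypothetical CPU numbering in which the hardware threads of one core are numbered consecutively, whereas the runtime obtains the real topology from hwloc when it is available.

```c
#include <assert.h>

// Map a thread number to a CPU number, filling one hardware thread on
// every core before using a second hardware thread on any core.
// Assumption (hypothetical): CPU number = core * threads_per_core + sibling.
int sketch_cpu_for_thread(int thread_num, int num_cores, int threads_per_core)
{
    assert(thread_num >= 0 && num_cores > 0 && threads_per_core > 0);

    int total_cpus = num_cores * threads_per_core;
    int slot = thread_num % total_cpus;  // wrap if more threads than CPUs
    int core = slot % num_cores;         // spread over cores first
    int sibling = slot / num_cores;      // then over SMT siblings

    return core * threads_per_core + sibling;
}
```

With two cores of two hardware threads each, threads 0 and 1 land on different cores, and only threads 2 and 3 share a core with an earlier thread.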
2011-10-13 02:42:21 +00:00
Paul Bone
987d2e31e3 Fix ThreadScope support since my recent work stealing changes.
runtime/mercury_threadscope.h:
runtime/mercury_threadscope.c:
    Fix some compilation problems.

    Rename stop conjunction and stop conjunct events to use the word "end"
    rather than "stop".  The meaning is clearer and the name matches that used
    in the threadscope paper.

runtime/mercury_context.h:
runtime/mercury_context.c:
    Re-order some operations in the idle loop: try to resume an earlier
    context before working on a local spark; this may lead to less blocking.

    The RUN_CONTEXT event was posted from the load_context macro.  Change
    this to post the RUN_CONTEXT event explicitly.

    Fix some over-long lines.

    Conform to changes in mercury_threadscope.h.

runtime/mercury_thread.c:
    Add an explicit call to post the RUN_CONTEXT event.

compiler/layout_out.m:
    Add a missing output_layout_array_name call when writing out the
    threadscope string table array.

compiler/par_conj_gen.m:
    Conform to changes in runtime/mercury_threadscope.h
2011-05-24 04:16:48 +00:00
Peter Wang
7e26b55e74 Implement a new form of memory profiling, which tells the user what memory
Branches: main

Implement a new form of memory profiling, which tells the user what memory
is being retained during a program run.  This is done by allocating an extra
word before each cell, which is used to "attribute" the cell to an
allocation site.  The attribution, or "allocation id", is an address to an
MR_AllocSiteInfo structure generated by the Mercury compiler, giving the
procedure, filename and line number of the allocation, and the type
constructor and arity of the cell that it allocates.
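
The extra-word technique can be sketched on top of malloc as follows. This is a simplified illustration with hypothetical names (SketchAllocSite, sketch_alloc_attrib and so on); the actual implementation works with the Mercury heap and the Boehm collector, and MR_AllocSiteInfo carries more information than the struct used here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical, reduced description of an allocation site.
typedef struct {
    const char  *proc;
    const char  *file;
    int         line;
} SketchAllocSite;

// Allocate a cell of the given size with one extra word in front of it,
// and store the address of the allocation site in that word.
// The returned pointer refers to the cell, just past the attribution word.
void *sketch_alloc_attrib(size_t size, const SketchAllocSite *site)
{
    uintptr_t *raw = malloc(sizeof(uintptr_t) + size);

    if (raw == NULL) {
        return NULL;
    }
    raw[0] = (uintptr_t) site;
    return raw + 1;
}

// Recover the allocation site from the word before the cell.
const SketchAllocSite *sketch_get_site(const void *cell)
{
    assert(cell != NULL);
    return (const SketchAllocSite *) ((const uintptr_t *) cell)[-1];
}

// The analogue of free must undo the offset before releasing the block.
void sketch_free_attrib(void *cell)
{
    if (cell != NULL) {
        free((uintptr_t *) cell - 1);
    }
}
```

Because the cell pointer is displaced by one word from the start of the underlying block, any allocator or collector involved has to tolerate such displaced pointers.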

The user must manually instrument the program with calls to
`benchmarking.report_memory_attribution', which forces a GC and summarises
the live objects on the heap using the attributions.  The mprof tool is
extended with a new mode to parse and present that data.

Objects which are unattributed (e.g. by hand-written C code which hasn't
been updated) are still accounted for, but show up in profiles as "unknown".

Currently this profiling mode only works in conjunction with the Boehm
garbage collector, though in principle it can work with any memory allocator
for which we can access a list of the live objects.  Since term size
profiling relies on the same technique of using an extra word per memory
cell, the two profiling modes are incompatible.

The output from `mprof -s' looks like this:

------ [1] some label ------
   cells            words         cumul  procedure / type (location)
   14150            38872                total

*   1949/ 13.8%      4872/ 12.5%  12.5%  <predicate `parser.parse_rest/7' mode 0>
     975/  6.9%      1950/  5.0%         list.list/1 (parser.m:502)
     487/  3.4%      1948/  5.0%         term.term/1 (parser.m:501)
     487/  3.4%       974/  2.5%         term.const/0 (parser.m:501)

*   1424/ 10.1%      4272/ 11.0%  23.5%  <predicate `parser.parse_simple_term_2/6' mode 0>
     708/  5.0%      2832/  7.3%         term.term/1 (parser.m:643)
     708/  5.0%      1416/  3.6%         term.const/0 (parser.m:643)
...


boehm_gc/alloc.c:
boehm_gc/include/gc.h:
boehm_gc/misc.c:
boehm_gc/reclaim.c:
	Add a callback function to be called for every live object after a GC.

	Add a function to write out the GC_size_map array.

compiler/layout.m:
	Define the alloc_site_info type which is equivalent to the
	MR_AllocSiteInfo C structure.

	Add alloc_site_array as a kind of "layout" array.

compiler/llds.m:
	Add allocation sites to `cfile' structure.

	Replace TypeMsg argument (which was also for profiling) on `incr_hp'
	instructions by an allocation site identifier.

	Add a new foreign_proc_component for allocation site ids.

compiler/code_info.m:
compiler/global_data.m:
compiler/proc_gen.m:
	Keep the set of allocation sites in the code_info and global_data
	structures.

compiler/unify_gen.m:
	Add allocation sites to LLDS allocation instructions.

compiler/layout_out.m:
compiler/llds_out_file.m:
compiler/llds_out_instr.m:
	Output MR_AllocSiteInfo arrays in generated C files.

	Output code to register the MR_AllocSiteInfo array with the Mercury
	runtime.

	Output allocation site ids for memory allocation instructions.

compiler/llds_out_util.m:
	Add allocation sites to llds_out_info.

compiler/pragma_c_gen.m:
compiler/ml_foreign_proc_gen.m:
	Generate a macro MR_ALLOC_ID which resolves to an allocation site
	structure, for every foreign_proc whose C code contains the string
	"MR_ALLOC_ID".  This is to be used by hand-written C code which
	allocates memory.

	MR_PROC_LABELs are retained for backwards compatibility.  Though
	they were introduced for profiling, they seem to have been co-opted
	for printf-debugging since then.

compiler/ml_global_data.m:
	Add allocation site structures to the MLDS global data.

compiler/mlds.m:
compiler/ml_unify_gen.m:
	Add allocation site id to `new_object' instruction.

compiler/mlds_to_c.m:
	Output allocation site arrays and allocation ids in high-level C code.

	Output a call to register the allocation site array with the Mercury
	runtime.

	Delete an unused predicate.

compiler/exprn_aux.m:
compiler/jumpopt.m:
compiler/livemap.m:
compiler/mercury_compile_llds_back_end.m:
compiler/middle_rec.m:
compiler/ml_accurate_gc.m:
compiler/ml_elim_nested.m:
compiler/ml_optimize.m:
compiler/ml_util.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
compiler/mlds_to_managed.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/use_local_vars.m:
compiler/var_locn.m:
	Conform to changes.

compiler/pickle.m:
compiler/prog_event.m:
compiler/timestamp.m:
	Conform to changes in memory allocation macros.

library/benchmarking.m:
	Add the `report_memory_attribution' instrumentation predicates.

	Conform to changes to MR_memprof_record.

library/array.m:
library/bit_buffer.m:
library/bitmap.m:
library/construct.m:
library/deconstruct.m:
library/dir.m:
library/io.m:
library/mutvar.m:
library/store.m:
library/string.m:
library/thread.semaphore.m:
library/version_array.m:
	Use attributed memory allocation throughout the standard library so
	that objects don't show up in the memory profile as "unknown".

	Replace MR_PROC_LABEL by MR_ALLOC_ID.

mdbcomp/program_representation.m:
mdbcomp/rtti_access.m:
	Replace MR_PROC_LABEL by MR_ALLOC_ID.

profiler/Mercury.options:
profiler/globals.m:
profiler/mercury_profile.m:
profiler/options.m:
profiler/output.m:
profiler/snapshots.m:
	Add a new mode to `mprof' to parse and present the data from
	`Prof.Snapshots' files.

	Add options for the new profiling mode.

profiler/process_file.m:
	Fix a typo.

runtime/mercury_conf_param.h:
	#define MR_MPROF_PROFILE_MEMORY_ATTRIBUTION if memory profiling
	is enabled and we are using Boehm GC.

runtime/mercury.h:
	Make MR_new_object take an allocation id argument.

	Conform to changes in memory allocation macros.

runtime/mercury_memory.c:
runtime/mercury_memory.h:
runtime/mercury_types.h:
	Define MR_AllocSiteInfo.

	Add memory allocation functions and macros which take into
	account the additional word necessary for the new profiling mode.
	These should be used in preference to the raw memory allocation
	functions wherever possible so that objects do not show up in the
	profile as "unknown".

	Add analogues of realloc/free which take into account the offset
	introduced by the attribution word.

	Add function versions of the MR_new_object macros, which can't be
	written in standard C.  They are only used when necessary.

	Add built-in allocation site ids, to be used in the runtime and
	other hand-written code when context-specific ids are unavailable.

runtime/mercury_heap.h:
	Make MR_tag_offset_incr_hp_msg and MR_tag_offset_incr_hp_atomic_msg
	allocate an extra word when memory attribution is desired, and store
	the allocation id there.

	Similarly for MR_create{1,2,3}_msg.

	Replace proclabel arguments in allocation macros by alloc_id
	arguments.

	Replace MR_hp_alloc_atomic by MR_hp_alloc_atomic_msg.  It was only
	used for boxing floats.

	Conform to change to MR_new_object macro.

runtime/mercury_bootstrap.h:
	Delete obsolete macro hp_alloc_atomic.

runtime/mercury_heap_profile.c:
runtime/mercury_heap_profile.h:
	Add the code to summarise the live objects on the Boehm GC heap and
	write out the data to `Prof.Snapshots', for display by mprof.

	Don't store the procedure name in MR_memprof_record: the procedure
	address is enough and faster to compare.

runtime/mercury_prof.c:
	Finish and close the `Prof.Snapshots' file when the program
	terminates.

	Conform to changes in MR_memprof_record.

runtime/mercury_misc.h:
	Add a macro to expand to the name of the allocation sites array
	in LLDS grades.

runtime/mercury_bitmap.c:
runtime/mercury_bitmap.h:
	Pass allocation id through bitmap allocation functions.

	Delete unused function MR_string_to_bitmap.

runtime/mercury_string.h:
	Add MR_make_aligned_string_copy_msg.

	Make string allocation macros take allocation id arguments.

runtime/mercury.c:
runtime/mercury_array_macros.h:
runtime/mercury_context.c:
runtime/mercury_deconstruct.c:
runtime/mercury_deconstruct_macros.h:
runtime/mercury_dlist.c:
runtime/mercury_engine.c:
runtime/mercury_float.h:
runtime/mercury_hash_table.c:
runtime/mercury_ho_call.c:
runtime/mercury_label.c:
runtime/mercury_prof_mem.c:
runtime/mercury_stacks.c:
runtime/mercury_stm.c:
runtime/mercury_string.c:
runtime/mercury_thread.c:
runtime/mercury_trace_base.c:
runtime/mercury_trail.c:
runtime/mercury_type_desc.c:
runtime/mercury_type_info.c:
runtime/mercury_wsdeque.c:
	Use attributed memory allocation throughout the runtime so that
	objects don't show up in the profile as "unknown".

runtime/mercury_memory_zones.c:
	Attribute memory zones to the Mercury runtime.

runtime/mercury_tabling.c:
runtime/mercury_tabling.h:
	Use attributed memory allocation macros for tabling structures.

	Delete unused MR_table_realloc_* and MR_table_copy_bytes macros.

runtime/mercury_deep_copy_body.h:
	Try to retain the original attribution word when copying values.

runtime/mercury_ml_expand_body.h:
	Conform to changes in memory allocation macros.

runtime/mercury_tags.h:
	Replace proclabel arguments by alloc_id arguments in allocation macros.

runtime/mercury_wrapper.c:
	If memory attribution is enabled, tell Boehm GC that pointers may be
	displaced by an extra word.

trace/mercury_trace.c:
trace/mercury_trace_tables.c:
	Conform to changes in memory allocation macros.

extras/net/tcp.m:
extras/solver_types/library/any_array.m:
extras/trailed_update/tr_array.m:
	Conform to changes in memory allocation macros.

doc/user_guide.texi:
	Document the new profiling mode.

doc/reference_manual.texi:
	Update a commented out example.
2011-05-20 04:16:58 +00:00
Paul Bone
bb5d0c7c0c Fix conditional compilation of some low-level thread safe code. This was
preventing the hlc.gc.par grade from compiling.

runtime/mercury_context.c:
runtime/mercury_thread.c:
    As above.
2011-04-19 03:03:37 +00:00
Paul Bone
f1779bd1e8 Improve work stealing. Spark deques have been associated with contexts so far.
This is a problem for the following reasons:

    The work stealing code must take a lock to access the resizeable array of
    work stealing deques.  This adds global contention that can be avoided if
    this array has a fixed size.

    If a context is blocked on a future then that engine cannot execute the
    sparks from that context.  Instead it tries to find global work, which is
    more expensive than necessary.

    If there are a few dozen contexts then there may be just as many work
    stealing queues to take work from.  The density of these queues will be
    higher if there are fewer of them, so work stealing will be more
    successful on average.

This change associates spark deques with Mercury Engines rather than Contexts
to avoid these problems.
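
A toy sketch of the resulting arrangement: one deque per engine in a fixed-size array indexed by engine id, and a per-engine victim counter for choosing whom to steal from. All names are hypothetical, the code is single-threaded, and it ignores the synchronisation that a real work-stealing deque requires.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical fixed number of engines.
#define SKETCH_NUM_ENGINES 4
#define SKETCH_DEQUE_SIZE 16

// Toy deque: sparks[top .. bottom-1] are live; thieves take from the top.
typedef struct {
    int sparks[SKETCH_DEQUE_SIZE];
    int top;
    int bottom;
} SketchDeque;

typedef struct {
    int         id;
    unsigned    victim_counter;     // per-engine, so no global contention
} SketchStealEngine;

// Fixed-size array indexed by engine id; no lock is needed to find a deque.
SketchDeque sketch_deques[SKETCH_NUM_ENGINES];

// Try each other engine once, starting from this engine's victim counter.
// Returns 1 and stores the spark in *out on success, 0 if nothing was found.
int sketch_try_steal(SketchStealEngine *eng, int *out)
{
    assert(eng != NULL && out != NULL);
    for (int attempts = 0; attempts < SKETCH_NUM_ENGINES; attempts++) {
        int victim = (int) (eng->victim_counter % SKETCH_NUM_ENGINES);

        eng->victim_counter++;
        if (victim == eng->id) {
            continue;               // never steal from ourselves
        }
        SketchDeque *dq = &sketch_deques[victim];
        if (dq->top < dq->bottom) {
            *out = dq->sparks[dq->top++];
            return 1;
        }
    }
    return 0;
}
```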

This has invalidated some invariants that allowed the runtime system to make
some worthwhile optimisations.  These optimisations have been maintained.
Mercury's idle loop has been reimplemented to allow for this.  This
re-implementation has allowed for a number of other improvements:

    Polling was used to check for new global sparks.  This has been removed and
    each engine now sleeps using its own semaphore.

    Checks for work can be done in different orders depending on how an engine
    joins the idle loop.

    When global work becomes available a particular engine can be woken up
    rather than any arbitrary engine.  We take advantage of this when making
    contexts runnable: we try to schedule them on the engine that last executed
    them.

    When an engine is woken up it can be told what it should do upon
    waking up.

    When an engine looks for a context to run, it will try to pick a context
    that was last executed on it.  This may avoid cache misses when the context
    begins to run.

In the future we should consider:
    Experimenting with telling engines which context to run.

    Improving the selection of which engine work should be scheduled on, so
    that it is hardware and memory-hierarchy aware.

Things that need doing next (probably next week):
    ./configure should check for POSIX semaphore support.

    Profiling times have been broken by this change; they will need fixing.

    The threadscope event log now breaks an invariant that the threadscope
    graphical tool requires.

    Semaphores are set up but never released.  This is not a big problem, but
    the manual page says that some implementations may leak resources.

runtime/mercury_context.h:
runtime/mercury_context.c:
    Remove the spark deque field from the MR_Context structure.

    Export the new array of spark deques so that other modules may fill in
    elements as engines are set up.

    Modify the resume_owner_thread field of the MR_Context structure.  This was
    used to ensure that a context returning through C code would be resumed on
    the engine with the correct C stack and depth.  This field is now an engine
    id and has been renamed to resume_owner_engine; it is advisory unless
    resume_engine_required is also set.  This way it is used to advise which
    engine most recently executed this context and therefore may have a warm
    cache.

    Remove code that dynamically resized the array of spark deques, including
    the lock that protected against updating this array while it was being read
    from another thread.

    Introduce code that initialises the statically sized array of spark deques.

    Reimplement the idle loop.  This replaces MR_runnext and MR_do_runnext with
    MR_idle and MR_do_idle respectively.  There are also two new entry points
    into the idle loop.  Which one to use depends on the state of the engine.

    Introduce new mechanisms for waking a particular engine.  For example the
    engine that last executed a context that is now runnable.

    Change the algorithm for selecting which context to run: try to select
    contexts that were last used on the current engine to avoid cache misses.

    Use an engine's victim counter rather than a global victim counter when
    trying to steal work.

    Introduce some conditionally-compiled code that can be used to profile how
    quickly new contexts can be created.

    Rename MR_init_thread_stuff and MR_finalize_thread_stuff.  The term thread
    has been replaced with context since they're in mercury_context.c.  This
    allows the creation of a new function MR_init_thread_stuff() in
    mercury_thread.c.  I also found the mismatch between the function names and
    file name confusing.  Move some of the code from MR_init_context_stuff to
    the new MR_init_thread_stuff function where it belongs.

    Refactor the thread pinning code so that even when thread pinning is
    disabled it can be used to allocate each thread to a CPU but not actually
    pin them.

    Fix some whitespace errors.

runtime/mercury_thread.h:
runtime/mercury_thread.c:
    In MR_init_engine():
        Allocate an engine id for each engine.

        A number of arrays had one slot per engine and were set up using a
        lock.  Now engine ids are used to index each array and setup is done
        without a lock; each engine simply sets up its own slot.

        Set up the new per-engine work stealing deques.
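A minimal sketch of lock-free slot setup, assuming engine ids are handed out by an atomic counter. The names `engine`, `next_engine_id`, `all_engines` and `register_engine` are hypothetical; the real runtime may allocate ids differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_ENGINES 16

/* Hypothetical stand-in for the engine structure. */
typedef struct {
    int id;
} engine;

static atomic_int next_engine_id;          /* zero-initialised */
static engine *all_engines[MAX_ENGINES];   /* one slot per engine id */

/*
** Allocate an id for the engine and fill in its own slot.
** No lock is needed: the atomic increment gives every engine a distinct
** id, so no two engines ever write the same array element.
*/
int register_engine(engine *eng)
{
    int id = atomic_fetch_add(&next_engine_id, 1);

    assert(id < MAX_ENGINES);
    eng->id = id;
    all_engines[id] = eng;
    return id;
}
```

Readers of other engines' slots would still need some way to know that a slot has been published, which this sketch leaves out.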

    The MR_all_engine_bases array has been moved to this file.

    Implement a new MR_init_thread_stuff function which initialises some global
    variables and locks.  Some of MR_init_thread_stuff has been moved from
    mercury_context.c

    Pin threads as part of MR_init_thread, excluding the primordial thread
    which must be pinned before threadscope is initialised.

    Add functions for debugging the use of semaphores.

    Add corresponding macros that can be used to redirect semaphore calls to
    debugging functions as above.

    Improved thread debugging code, ensured that stderr is flushed after every
    use, and that logging is done after calls return as well as before they're
    called.

    Conform to changes in mercury_context.h

runtime/mercury_engine.h:
runtime/mercury_engine.c:
    Add spark deque and victim counter fields to the MercuryEngine structure.

    Make the MR_eng_id field of the MercuryEngine structure available in all
    thread safe grades, formerly it was used in only threadscope grades.

    Move the MR_all_engine_bases variable to mercury_thread.[ch]

    Put a reference to the engine's spark queue into the global array.  This is
    done here, so that it is after thread pinning because the original plan was
    to have this array sorted by CPU rather than engine - we may yet do this in
    the future.

    Initialise an engine's spark deque when an engine is initialised.

    Setup the engine specific threadscope data in mercury_thread.c

    Conform to changes in mercury_context.h

runtime/mercury_wrapper.c:
    The engine base array is no longer setup here, that code has been moved to
    mercury_thread.c

    Conform to changes in mercury_context.h and mercury_thread.h

runtime/mercury_wsdeque.h:
runtime/mercury_wsdeque.c:
    The original implementation allocated an array for a spark queue only if
    one wasn't already allocated, which could happen when a context was reused.
    Now that spark queues are associated with engines arrays are always
    allocated.

    Replaced two macros with a single macro since there's no-longer a
    distinction between global and local work queues, all work queues are
    local.

runtime/mercury_wsdeque.c:
runtime/mercury_wsdeque.h:
    Remove the --worksteal-max-attempts and --worksteal-sleep-msecs options as
    they are no-longer used.

runtime/mercury_threadscope.h:
runtime/mercury_threadscope.c:
    The MR_EngineId type has been moved to mercury_types.h

    Engine IDs are no-longer allocated here, this is done in mercury_thread.c

    The run spark and steal spark messages now write 0xFFFFFFFF for the context
    id if there is no current context.  Previously this would dereference a
    null pointer.
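The null-pointer fix can be pictured as follows. This is a sketch only: the `context` type, the `NO_CONTEXT_ID` name and the `event_context_id` helper are hypothetical, and only the 0xFFFFFFFF value comes from the message above.

```c
#include <stddef.h>
#include <stdint.h>

/* Sentinel written to the event log when no context is current. */
#define NO_CONTEXT_ID 0xFFFFFFFFu

/* Hypothetical stand-in for the context structure. */
typedef struct {
    uint32_t id;
} context;

/* Return the id to log, without dereferencing a null context pointer. */
uint32_t event_context_id(const context *ctxt)
{
    return (ctxt != NULL) ? ctxt->id : NO_CONTEXT_ID;
}
```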

runtime/mercury_memory_zones.c:
    When checking for an existing memory zone check the free_zones_list
    variable before taking a lock.  This can prevent taking the lock in cases
    where there are no free zones.
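The check-before-lock idea can be sketched like this. It is an illustration under assumptions, not the code in mercury_memory_zones.c: the `zone` type, the lock name and both function names are hypothetical, and a C11 atomic pointer is used for the unlocked read so that the sketch itself is free of data races.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory zone. */
typedef struct zone {
    struct zone *next;
} zone;

static _Atomic(zone *) free_zones_list = NULL;
static pthread_mutex_t free_zones_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take a zone from the free list, or return NULL if there is none. */
zone *try_take_free_zone(void)
{
    zone *z;

    /* Cheap unlocked peek: skip the lock when the list is empty. */
    if (atomic_load(&free_zones_list) == NULL) {
        return NULL;
    }

    pthread_mutex_lock(&free_zones_lock);
    /* Re-check under the lock; another thread may have emptied the list. */
    z = atomic_load(&free_zones_list);
    if (z != NULL) {
        atomic_store(&free_zones_list, z->next);
    }
    pthread_mutex_unlock(&free_zones_lock);
    return z;
}

/* Return a zone to the free list. */
void release_zone(zone *z)
{
    pthread_mutex_lock(&free_zones_lock);
    z->next = atomic_load(&free_zones_list);
    atomic_store(&free_zones_list, z);
    pthread_mutex_unlock(&free_zones_lock);
}
```

The unlocked peek can only produce a stale "empty" answer, in which case the caller allocates a fresh zone; it never hands out a zone without holding the lock.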

    Introduce some conditionally-compiled code that can be used to profile how
    quickly new contexts can be created.

runtime/mercury_bootstrap.h:
    Remove macros that no-longer resolve to functions due to changes in the
    runtime system.

runtime/mercury_types.h:
    Move the MR_EngineId type from mercury_threadscope.h to mercury_types.h

runtime/mercury_grade.h:
    Introduce a parallel grade version number; this change breaks binary
    compatibility with existing parallel code.

runtime/mercury_backjump.c:
runtime/mercury_par_builtin.c:
runtime/mercury_mm_own_stacks.c:
library/stm_builtin.m:
library/thread.m:
library/thread.semaphore.m:
    Conform to changes in mercury_context.h.

library/io.m:
    Make this module compatible with MR_debug_threads.

doc/user_guide.texi:
    Remove the documentation for the --worksteal-max-attempts and
    --worksteal-sleep-msecs options.  The documentation was already commented
    out.
2011-04-13 13:19:42 +00:00
Julien Fischer
b47354d211 Support the use of the pthreads-win32 library on MinGW systems.
Branches: main, 10.04

Support the use of the pthreads-win32 library on MinGW systems.
(This is based on the patch provided by Sergey Khorev.)
The main change is to remove the assumption in the runtime code
that POSIX thread handles are integers; in the pthreads-win32 library
they are not.

With this change the hlc.par.gc grade will work on Windows / MinGW.
(The low-level C parallel grades will require further work.)

configure.in:
	Configure the Boehm GC to use pthreads-win32 if that is being used
	to provide threads for the runtime on MinGW.

	Delete the --with-pthreads-win32 option; it is no longer needed.

	Add a new option --with-gc-pthreads-win32 that forces the Boehm GC
	to use pthreads-win32.  This is the default for MinGW anyway, the option
	is intended for use by developers using pthreads-win32 in other ways,
	e.g. with MSVC.

runtime/mercury_thread.h:
	Add a new macro / function (depending on the implementation of pthreads)
	that returns the "null" thread.

	Add a new macro MR_thread_equal() that tests two thread handles
	for equality.

runtime/mercury_thread.c:
	Provide implementations of MR_null_thread().

	Add a macro, for use within this module, that returns the id
	of a thread in a form suitable for use in debugging messages.

runtime/mercury_engine.c:
runtime/mercury_context.c:
runtime/mercury_wrapper.c:
runtime/mercury_thread.c:
	Use MR_null_thread() instead of NULL or 0.

	Use MR_thread_equal() instead of directly comparing thread handles.
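To illustrate why handles must not be compared with `==`: POSIX leaves `pthread_t` opaque, and it may be a structure. The sketch below compares handles only through `pthread_equal()`. The helper names are hypothetical, and this is not the definition of MR_thread_equal().

```c
#include <pthread.h>

/* Data passed to the helper thread below. */
typedef struct {
    pthread_t expected;  /* handle to compare against */
    int       same;      /* result: 1 if equal, 0 if not */
} owner_check;

/* Thread body: compare our own handle with the expected one. */
static void *check_owner(void *arg)
{
    owner_check *check = arg;

    check->same = pthread_equal(pthread_self(), check->expected) != 0;
    return NULL;
}

/* Return 1 if the calling thread is `owner`, 0 otherwise. */
int is_owner(pthread_t owner)
{
    return pthread_equal(pthread_self(), owner) != 0;
}

/*
** Ask a freshly created thread whether it is `owner`.
** Returns 1 or 0, or -1 if the thread could not be created.
*/
int other_thread_is_owner(pthread_t owner)
{
    pthread_t   tid;
    owner_check check;

    check.expected = owner;
    check.same = -1;
    if (pthread_create(&tid, NULL, check_owner, &check) != 0) {
        return -1;
    }
    pthread_join(tid, NULL);
    return check.same;
}
```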
2010-12-13 05:59:42 +00:00
Julien Fischer
fd2bdc3448 Fix some formatting problems.
Branches: main

runtime/mercury_thread.[ch]:
	Fix some formatting problems.
2010-12-06 14:41:34 +00:00
Julien Fischer
1b0ea83641 Add a missing argument to a call to fprintf.
runtime/mercury_thread.c:
	Add a missing argument to a call to fprintf.
2010-06-01 02:34:01 +00:00
Paul Bone
df31fc6f94 Fix hlc.gc.par.
The hlc.gc.par grade was broken after committing my fix for the stack segment
parallel grades.  The problem is that hlc grades use Mercury engines but don't
follow the same code paths as the low-level C grades.  This means that they
expect that the MR_all_engine_bases array isn't allocated by the time the
engine structures are being created.

This change set makes the MR_all_engine_bases array only available in low-level
C parallel grades making problems involving this array impossible in high level
C grades.

runtime/mercury_engine.h:
runtime/mercury_engine.c:
    Only make MR_all_engine_bases available in thread safe low-level C grades.

runtime/mercury_thread.h:
runtime/mercury_thread.c:
    Only make MR_init_engine_array_lock available in thread safe low-level C
    grades.

    Only try to populate MR_all_engine_bases in thread safe low-level C grades.

runtime/mercury_context.c:
    Only initialise MR_init_engine_array_lock in thread safe low-level C
    grades.
2010-05-31 09:41:47 +00:00
Paul Bone
f6e5c3c647 Fix a crash that can occur in low-level C, parallel, stack-segment grades.
MR_destroy_context will cache contexts in case the runtime needs a context in
the near future.  Because the context no-longer represents an on-going
computation MR_destroy_context did not copy context-data out of the
MercuryEngine and real machine registers into the context before caching it.
However in a stack segments grade one of these values is the current stack
segment, this means that a context may be cached and later re-used which refers
to a stack segment that another context is now using, the cached context will
then trash the other context's stack.

The solution is to save the context before caching it.

This change also contains code that was helpful in diagnosing this problem.

runtime/mercury_context.c:
    Fix the bug (as above).

    Initialise the MR_init_engine_array_lock in MR_setup_thread_stuff.

    Print out logging messages if MR_DEBUG_STACK_SEGMENTS is defined in various
    places.

runtime/mercury_debug.h:
runtime/mercury_debug.c:
    Write code for printing out debug log messages in a grade-appropriate way.

runtime/mercury_memory_handlers.c:
    When exiting a signal handler flush the threadscope buffers of all engines
    before re-raising the signal.

runtime/mercury_engine.h:
runtime/mercury_engine.c:
    In thread safe grades provide an array of pointers to Mercury engines in
    the runtime.  This is used to flush the threadscope buffers in the signal
    handlers.  It may be used to improve work stealing in the future.

runtime/mercury_thread.h:
runtime/mercury_thread.c:
    When creating threads add each one's engine to the array of engine pointers.

runtime/mercury_wrapper.c:
    Allocate the array of engine pointers when the runtime starts up.
2010-05-26 07:45:49 +00:00
Paul Bone
83a6f14708 Create a threadscope grade component.
Threadscope grades are enabled by using the grade component 'threadscope'.
They are supported only with low-level C parallel grades.  Support for
threadscope in high level C grades is intended in the future but does not work
now.

runtime/mercury_conf_param.h:
    Create the MR_THREADSCOPE macro that is defined if the grade is a
    threadscope grade.

    Define MR_PROFILE_FOR_PARALLEL_EXECUTION if MR_THREADSCOPE is defined.

    Emit an error if MR_LL_PARALLEL_CONJ is defined before it is implied by
    MR_THREADSAFE and ! MR_HIGHLEVEL_CODE

runtime/mercury_grade.h:
    Update the grade symbol for the threadscope grade component.

runtime/mercury_atomic_ops.c:
runtime/mercury_atomic_ops.h:
runtime/mercury_context.c:
runtime/mercury_context.h:
runtime/mercury_engine.c:
runtime/mercury_engine.h:
runtime/mercury_thread.c:
runtime/mercury_threadscope.c:
runtime/mercury_threadscope.h:
runtime/mercury_wrapper.c:
    Now that MR_PROFILE_FOR_IMPLICIT_PARALLELISM is implied by MR_THREADSAFE we
    don't need to test for MR_THREADSAFE when we test for
    MR_PROFILE_FOR_IMPLICIT_PARALLELISM.  The same is true for
    MR_LL_PARALLEL_CONJ which is implied by MR_THREADSAFE &&
    !MR_HIGHLEVEL_CODE.

    Replace some occurrences of MR_PROFILE_FOR_IMPLICIT_PARALLELISM with
    MR_THREADSCOPE where the conditionally compiled code is used to support
    threadscope profiling.

scripts/init_grade_options.sh-subr:
scripts/canonical_grade.sh-subr:
scripts/parse_grade_options.sh-subr:
scripts/final_grade_options.sh-subr:
scripts/mgnuc.in:
compiler/handle_options.m:
compiler/options.m:
compiler/compile_target_code.m:
configure.in:
    Add support for the new grade component.

    Pass -DMR_THREADSCOPE to the C compiler when using a threadscope grade.

    Add assertions to ensure that the 'threadscope' grade component is used
    only with the 'par' grade component.

doc/user_guide.texi:
    Added commented-out documentation for the threadscope grade component.

    Adjusted documentation of the --profile-parallel-execution runtime option
    to describe the correct prerequisite compile time options.

    Added my name to the authors list.

runtime/mercury_context.c:
    Corrected grammar and prose in comments in the MR_do_join_and_continue code.
2010-01-10 04:53:40 +00:00
Paul Bone
1c8875adc7 Act on post-commit review comments from Peter.
runtime/mercury_context.c:
    Corrected typos in code.

runtime/mercury_thread.c:
    Corrected a typo/spelling mistake.
2009-12-17 01:29:18 +00:00
Paul Bone
5cfd73644a Implement work stealing.
This patch is heavily based on earlier, uncommitted work by Peter Wang.  It
has been updated so that it applies against the current version of the source.
A number of other changes have been made.  Peter's original ChangeLog
follows:

	Implement work stealing for parallel conjunctions.  This builds on an
	older patch which introduced work-stealing deques to the runtime but
	didn't perform work stealing.

	Previously when we came across a parallel conjunct, we would place a spark
	into either the _global spark queue_ or the _local spark stack_ of the
	Mercury context.  A spark on the global spark queue may be picked up for
	parallel execution by an idle Mercury engine, whereas a spark on a local
	spark stack is confined to execution in the context that originated it.

	The problem is that we have to decide, ahead of time, where to put a
	spark.  Ideally, we should have just enough sparks in the global queue to
	keep the available Mercury engines busy, and leave the rest of the sparks
	to execute in their original contexts since that is more efficient.  But
	we can't predict the future so have to make do with guesses using simple
	heuristics.  A bad decision, once made, cannot be reversed.  An engine may
	sit idle due to an empty global spark queue, even while there are sparks
	available in some local spark stacks.

	In the work stealing scheme, sparks are always placed into each context's
	_local spark deque_.  Idle engines actively try to steal sparks from
	random spark deques.  We don't need to make irreversible and potentially
	suboptimal decisions about where to put sparks.  Making a spark available
	for parallel execution is cheap and happens by default because of the
	work-stealing deques; putting a spark on a global queue implies
	synchronisation with other threads.  The downside is that idle engines
	need to expend more time and effort to find the work from multiple places
	instead of just one place.

	Practically, the new scheme seems to work as well as the old scheme and
	vice versa, except that the old scheme often required
	`--max-context-per-threads' to be set "correctly" to get good results.

	Only tested on x86-64, which has a relatively constrained memory model.
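The owner/thief asymmetry of a work-stealing deque can be sketched as below. This is a deliberately simplified, mutex-protected version; the runtime's deque is not claimed to work this way, and the names `spark_deque`, `deque_push_bottom`, `deque_pop_bottom` and `deque_steal_top` are hypothetical. The owner pushes and pops at the bottom (newest work first), while thieves take from the top (oldest work first).

```c
#include <pthread.h>
#include <stddef.h>

#define DEQUE_CAPACITY 64   /* power of two, so index wrap-around is safe */

/* Hypothetical spark: just the code address to resume at. */
typedef struct {
    void *resume;
} spark;

typedef struct {
    pthread_mutex_t lock;
    spark           items[DEQUE_CAPACITY];
    size_t          top;     /* thieves steal from here (oldest) */
    size_t          bottom;  /* owner pushes and pops here (newest) */
} spark_deque;

void deque_init(spark_deque *dq)
{
    pthread_mutex_init(&dq->lock, NULL);
    dq->top = 0;
    dq->bottom = 0;
}

/* Owner: add a spark.  Returns 1 on success, 0 if the deque is full. */
int deque_push_bottom(spark_deque *dq, spark s)
{
    int ok = 0;

    pthread_mutex_lock(&dq->lock);
    if (dq->bottom - dq->top < DEQUE_CAPACITY) {
        dq->items[dq->bottom % DEQUE_CAPACITY] = s;
        dq->bottom++;
        ok = 1;
    }
    pthread_mutex_unlock(&dq->lock);
    return ok;
}

/* Owner: take the newest spark.  Returns 1 on success, 0 if empty. */
int deque_pop_bottom(spark_deque *dq, spark *out)
{
    int ok = 0;

    pthread_mutex_lock(&dq->lock);
    if (dq->bottom != dq->top) {
        dq->bottom--;
        *out = dq->items[dq->bottom % DEQUE_CAPACITY];
        ok = 1;
    }
    pthread_mutex_unlock(&dq->lock);
    return ok;
}

/* Thief: take the oldest spark.  Returns 1 on success, 0 if empty. */
int deque_steal_top(spark_deque *dq, spark *out)
{
    int ok = 0;

    pthread_mutex_lock(&dq->lock);
    if (dq->bottom != dq->top) {
        *out = dq->items[dq->top % DEQUE_CAPACITY];
        dq->top++;
        ok = 1;
    }
    pthread_mutex_unlock(&dq->lock);
    return ok;
}
```

A lock-free deque avoids taking a lock on the owner's fast path; the mutex here only keeps the sketch short and obviously correct.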

My modifications include:

	The difference between 'shared' and 'private' synchronisation terms has
	been removed.  All sync terms are assumed to be shared and thread-safe
	operations are used everywhere.  This allows us to remove complicated code
	used when a private synchronisation term became shared.  This may change
	the performance of thread stealing, in particular it may become slower due
	to the assumption that all sync terms are shared and therefore atomic
	operations must always be used when decrementing their count field.

	I've re-factored MR_do_join_and_continue.  It is now much simpler as the
	conditional code in it enumerates the possible cases clearly.

This change bootchecks and successfully runs the test suite in asm_fast.gc,
asm_fast.gc.par, hlc.gc and hlc.par; no other grades were tested.  I have not
yet tested performance.

runtime/mercury_context.c:
runtime/mercury_context.h:
	Keep pointers to all spark deques in a flat array, so we have access
    to them for stealing.

	Added functions to manage the global array of spark deques.

	Modify MR_do_run_next, it now attempts to steal work from other context's
	spark queues.  Threads sleeping on the condition variable in
	MR_do_run_next now use a timed wait so they can wakeup and try to steal
	sparks.

	Re-factored MR_do_join_and_continue.

	MR_num_idle_engines is used by atomic operations, it has been made an
	MR_Integer so that its size matches the expectations of the atomic
	operations we have defined.

	Modified the MR_SyncTerm and MR_Spark structures.  Sparks now point to
	their sync terms.  The parent stack pointer has been moved into the
	SyncTerm structure.  The MR_st_is_shared field in the MR_SyncTerm
	structure has been removed.

runtime/mercury_atomic_ops.c:
runtime/mercury_atomic_ops.h:
	Implement a new atomic operation: decrement integer and is zero.  On the
	x86/x86_64 one can't atomically decrement an integer and fetch the result
	in a single instruction, a loop with a 'compare and exchange' instruction
	is necessary.  However since we only want to test if the value has become
	zero after the decrement we can use the processor's flags.  This can be
	done in two instructions, but more importantly a loop is not required and
	only one instruction is atomic.
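For comparison, a portable sketch of "decrement and test for zero" using C11 atomics. This is a substitute illustration, not the inline assembler described above, and the function name `atomic_dec_and_is_zero` is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
** Atomically decrement *count and report whether it reached zero.
** atomic_fetch_sub returns the value held before the subtraction,
** so a previous value of 1 means the counter is now 0.
*/
bool atomic_dec_and_is_zero(atomic_long *count)
{
    return atomic_fetch_sub(count, 1) == 1;
}
```

Exactly one of the threads decrementing a shared counter sees `true`, which is the property a join point needs when deciding which conjunct continues.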

runtime/mercury_wrapper.c:
runtime/mercury_wrapper.h:
	Added runtime tunable options for work stealing.  These control the number
	of attempts an idle engine will make when looking for work, and the
	duration to sleep after failing to find any work.

runtime/mercury_thread.c:
runtime/mercury_thread.h:
	Added MR_COND_TIMED_WAIT, which waits on condition variables like
	MR_COND_WAIT except that it may time out.
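A timed wait of this kind is typically built on `pthread_cond_timedwait()`, which takes an absolute deadline. The sketch below assumes a millisecond timeout and a hypothetical helper name `cond_timed_wait_ms`; it is not the definition of MR_COND_TIMED_WAIT.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/*
** Wait on `cond` for at most `msecs` milliseconds.
** The caller must hold `mutex`.  Returns 0 if woken, ETIMEDOUT if the
** deadline passed, or another error number on failure.
*/
int cond_timed_wait_ms(pthread_cond_t *cond, pthread_mutex_t *mutex,
    long msecs)
{
    struct timespec deadline;

    /* pthread_cond_timedwait uses CLOCK_REALTIME by default. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0) {
        return errno;
    }
    deadline.tv_sec += msecs / 1000;
    deadline.tv_nsec += (msecs % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    return pthread_cond_timedwait(cond, mutex, &deadline);
}
```

An idle engine would treat both a wake-up and ETIMEDOUT as a cue to look for work again, so a timeout is not an error in that setting.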

runtime/mercury_wsdeque.h:
runtime/mercury_wsdeque.c:
	MR_wsdeque_pop_bottom now uses its second argument to return the code
	address to jump to rather than the whole spark.

runtime/mercury_conf.h.in:
configure.in:
	Test for sched_yield()

	Change the synchronisation term structure.

doc/user_guide.texi:
    Add commented out documentation for two new tunable parameters,
    `--worksteal-max-attempts' and `--worksteal-sleep-msecs'.
    Implementors may want to experiment with different values but end
    users shouldn't need to know about them.
2009-12-15 02:29:07 +00:00
Paul Bone
92afa23af5 Support for threadscope profiling of the parallel runtime.
This change adds support for threadscope profiling of the parallel runtime in
low level C grades.  It can be enabled by compiling _all_ code with the
MR_PROFILE_PARALLEL_EXECUTION_SUPPORT C macro defined.  The runtime, libraries
and applications must all have this flag defined as it alters the MercuryEngine
and MR_Context structures.

See Don Jones Jr, Simon Marlow, Satnam Singh - Parallel Performance Tuning for
Haskell.

This change also includes:

    Smarter thread pinning (the primordial thread is pinned to the CPU that
    it is currently running on).

    The addition of callbacks from the Boehm GC to notify the runtime of
    stop the world garbage collections.

    Implement some userspace spin loops and conditions.  These are cheaper than
    their POSIX equivalents, do not support sleeping, and are signal handler
    safe.

boehm_gc/alloc.h:
boehm_gc/alloc.c:
    Declare and define the new callback functions.

boehm_gc/alloc.c:
    Call the start and stop collect callbacks when we start and stop a
    stop-the-world collection.

    Correct how we record the time spent collecting, it now includes
    collections that stop prematurely.

boehm_gc/pthread_stop_world.c:
    Call the pause and resume thread callbacks in each thread where the GC
    arranges for that thread to be stopped during a stop-the-world collection.

runtime/mercury_threadscope.c:
runtime/mercury_threadscope.h:
    New files implementing the threadscope support.

runtime/mercury_atomic_ops.c:
runtime/mercury_atomic_ops.h:
    Rename MR_configure_profiling_timers to MR_do_cpu_feature_detection.

    Add a new function MR_read_cpu_tsc() to read the TSC register from the CPU,
    this simply abstracts the static MR_rdtsc function.

runtime/mercury_atomic_ops.h:
    Modify the C inline assembler to ensure we tell the C compiler that the
    value in the register mapped to the 'old' parameter is also an output from
    the instructions.  That is, the C compiler must not depend on the value of
    'old' being the same before and after the instruction is executed.  This
    has never been a problem in practice though.

    Implement some cheap userspace mutual exclusion locks and condition
    variables.  These will be faster than pthread's mutexes when critical
    sections are short and threads are pinned to separate CPUs.
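A userspace spin lock of the kind described can be sketched with C11 `atomic_flag`. This is an illustration with hypothetical names (`spin_lock`, `spin_lock_acquire`, and so on), not the implementation in mercury_atomic_ops.h.

```c
#include <stdatomic.h>

/* A lock that never sleeps: waiters spin until the flag is clear. */
typedef struct {
    atomic_flag held;
} spin_lock;

#define SPIN_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once; returns 1 if the lock was acquired, 0 if already held. */
int spin_lock_try(spin_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held,
        memory_order_acquire);
}

/* Spin until the lock is acquired. */
void spin_lock_acquire(spin_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held,
        memory_order_acquire))
    {
        /* Busy-wait; a real implementation might issue a pause hint. */
    }
}

/* Release the lock. */
void spin_lock_release(spin_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Since the lock makes no system calls and never blocks in the kernel, it suits very short critical sections, but a waiter burns CPU for as long as the holder is descheduled.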

runtime/mercury_context.c:
runtime/mercury_context.h:
    Add a new function for pinning the primordial thread.  If the OS supports
    sched_getcpu we use it to determine which CPU the primordial thread should
    use.  No other thread will be pinned to this CPU.

    Add a numeric id field to each context, this id is uniquely assigned and
    identifies each context for threadscope.

    MR_schedule_context posts the 'context runnable' threadscope event.

    MR_do_runnext has been modified to destroy engines differently, it ensures
    they cleanup properly so that their threadscope events are flushed properly
    and then calls pthread_exit(0)

    MR_do_runnext posts events for threadscope.

    MR_do_join_and_continue posts events for threadscope.

runtime/mercury_engine.h:
    Add new fields to the MercuryEngine structure including a buffer of
    threadscope events, a clock offset (used to synchronize the TSC clocks) and
    a unique identifier for the engine.

runtime/mercury_engine.c:
    Call MR_threadscope_setup_engine() and MR_threadscope_finalize_engine for
    newly created and about-to-be-destroyed engines.

    When the main context finishes on a thread that's not the primordial thread
    post a 'context is yielding' message before re-scheduling the context on
    the primordial thread.

runtime/mercury_thread.c:
    Added an XXX comment about a potential problem, it's only relevant for
    programs using thread.spawn.

    Added calls to the TSC synchronisation code used for threadscope profiling.
    It appears that this is not necessary on modern x86 machines, it has been
    commented out.

    Post a threadscope event when we create a new context.

    Don't call pthread_exit in MR_destroy_thread, we now do this in
    MR_do_runnext so that we can unlock the runqueue mutex after cleaning up.

runtime/mercury_wrapper.c:
    Conform to changes in mercury_atomic_ops.[ch]

    Post an event immediately before calling main to mark the beginning of the
    program in the threadscope profile.

    Post a "context finished" event at the end of the program.

    Wait until all engines have exited before cleaning up global data, this is
    important for finishing writing the threadscope data file.

configure.in:
runtime/mercury_conf.h.in:
    Test for the sched_getcpu C function and utmpx.h header file, these are
    used for thread pinning.

runtime/Mmakefile:
    Include the mercury_threadscope.[hc] files in the list of runtime headers
    and sources respectively.
2009-12-03 05:28:00 +00:00
Paul Bone
6807e11661 Re-factor the MR_join_and_continue macro.
This change replaces the MR_join_and_continue macro with a C procedure.  A
smaller macro named MR_join_and_continue wraps the new C procedure and provides
a trampoline to prevent C stack leaks.  MR_join_and_continue will now have the
additional cost of a C procedure call rather than always being inlined.  This
code is only used in the implementation of parallel conjunctions in the low
level C grades, it does not affect other grades.
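The trampoline idea can be sketched generically: each step returns the address of the next step instead of calling it, and a driver loop makes the calls, so the C stack does not grow with the number of hops. The names `code_addr`, `count_down` and `run_trampoline` are hypothetical and unrelated to the actual macro.

```c
#include <stddef.h>

/* A "code address": a function that returns the next code address. */
typedef struct code_addr code_addr;
struct code_addr {
    code_addr (*fn)(int *counter);
};

/* Example step: decrement the counter, then continue or stop. */
code_addr count_down(int *counter)
{
    code_addr next;

    if (*counter > 0) {
        (*counter)--;
        next.fn = count_down;   /* ask the driver to call us again */
    } else {
        next.fn = NULL;         /* nothing left to do */
    }
    return next;
}

/*
** Driver loop: call each step from the same stack frame.
** Returns the number of steps executed.
*/
long run_trampoline(code_addr start, int *counter)
{
    long hops = 0;

    while (start.fn != NULL) {
        start = start.fn(counter);
        hops++;
    }
    return hops;
}
```

If `count_down` called itself directly, a large counter could exhaust the C stack; with the driver loop each step returns before the next one starts.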

An earlier revision of this code was causing deadlocks.  To debug them, support
was added to the MR_SIGNAL, MR_BROADCAST and MR_WAIT macros to enable better
logging of the use of condition variables when MR_DEBUG_THREADS is defined at
compile time.

This change passes bootcheck and the test suite in the asm_fast.gc.par grade.

runtime/mercury_context.h:
runtime/mercury_context.c:
    Created MR_do_join_and_continue procedure from old MR_join_and_continue
    macro.
    Added additional comments to this procedure, describing how it works.
    Created a new macro MR_join_and_continue that wraps the new procedure.
    Conform to changes in the MR_WAIT, MR_SIGNAL and MR_BROADCAST macros.

runtime/mercury_thread.h:
    Added a from parameter to the MR_WAIT, MR_SIGNAL and MR_BROADCAST macros.
    Added a from parameter to the C procedures' declarations that implement the
    debugging versions of the condition operations above.
    Adjusted the formatting of these declarations to match the C style used in
    the project.

runtime/mercury_thread.c:
    The C procedures implementing the debugging versions of the condition
    operations now print out their from parameter.
    MR_cond_broadcast now uses "broadcast" in its log message rather than
    "signal".
    MR_cond_wait's log message now more clearly specifies which argument is the
    lock and which is the condition variable.
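A debugging wrapper with a `from` parameter might look like the sketch below. The function name `debug_cond_signal` and the message format are hypothetical; logging both before and after the call, and flushing stderr each time, are illustrative choices here.

```c
#include <pthread.h>
#include <stdio.h>

/*
** Signal `cond`, logging which call site (`from`) did so.
** Returns the error code from pthread_cond_signal.
*/
int debug_cond_signal(pthread_cond_t *cond, const char *from)
{
    int err;

    fprintf(stderr, "%s: signalling condition %p\n", from, (void *) cond);
    fflush(stderr);

    err = pthread_cond_signal(cond);

    fprintf(stderr, "%s: signal on %p returned %d\n", from,
        (void *) cond, err);
    fflush(stderr);
    return err;
}
```

A call site would pass something that identifies itself, for example `debug_cond_signal(&cond, "join_and_continue")`, so that interleaved log lines from several threads can be attributed.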
2009-11-27 03:51:20 +00:00
Paul Bone
d5d4457463 Parallel runtime thread pinning.
This change introduces two new features in the Mercury runtime:
pinning of threads to CPU cores/threads and runtime detection of the number of
CPU cores/threads available.

If MR_num_threads has not been specified in the runtime options with the -P
flag, we use the sysconf(_SC_NPROCESSORS_ONLN) call, if available, to detect
the number of CPUs online and set MR_num_threads accordingly.  Otherwise, as
before, this defaults to 1.

Thread pinning is enabled if the runtime was able to detect the number of CPUs
on the machine or the user specifically requests thread pinning with the
--thread-pinning runtime option.  The sched_setaffinity() call is used to pin
each thread to a specific CPU.

I believe that in some cases thread pinning can achieve better performance,
this is yet to be determined and it may depend on the machine's architecture.
It does make profiling of the runtime system more reliable where the RDTSCP
instruction is not available.  It ensures that a thread is not migrated to a
different CPU between samplings of the CPU's TSC.
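The two system calls named above can be combined as in the following sketch, which is Linux-specific (it needs `_GNU_SOURCE`). The helper names `online_cpus` and `pin_current_thread` are hypothetical, and this is not the code of MR_pin_thread().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* for sched_setaffinity and the CPU_* macros */
#endif
#include <sched.h>
#include <unistd.h>

/* Number of CPUs currently online, falling back to 1 if unknown. */
long online_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);

    return (n < 1) ? 1 : n;
}

/*
** Pin the calling thread to the given CPU.
** Returns 0 on success and -1 on failure (errno is set by the kernel
** call; an out-of-range cpu is rejected before making it).
*/
int pin_current_thread(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        return -1;
    }
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Pinning can still fail for a valid CPU number, for example when the process is confined to a CPU set that excludes it, so callers should treat failure as "run unpinned" rather than as fatal.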

configure.in:
runtime/mercury_conf.h.in:
	Detect the presence of sched.h, sysconf(), sched_setaffinity() and
	_SC_NPROCESSORS_ONLN.

doc/user_guide.texi:
	Document the new --thread-pinning runtime option.
	Adjust the documentation of -P to reflect the new behaviour.

runtime/mercury_context.c:
	Add the MR_pin_thread() function.
	Create a new global MR_bool MR_pin_threads;
	Add the calculation of the number of threads to use to
	MR_init_thread_stuff()
	Correct a bug in a format string in my previous patch.

runtime/mercury_context.h:
	Export the new MR_pin_thread() function.
	Export the new MR_pin_threads global.
	Correct a previous spelling mistake.
	Adjust the documentation of MR_init_thread_stuff to reflect the new
	behaviour.

runtime/mercury_wrapper.c:
	Pin the primordial thread to a CPU after it spawns the other threads.
	Add the --thread-pinning runtime configuration option.
	Move the calculation of MR_max_outstanding_contexts until after
	MR_init_thread_stuff() so that it is calculated after the number of CPUs
	available has been determined.
	Add a pause instruction to a spinloop for better behaviour on later
	i386/x86_64 processors.  See the documentation for MR_ATOMIC_PAUSE.

runtime/mercury_thread.c:
	After a thread is spawned call MR_pin_thread() to pin a thread to a CPU if
	the thread has been created to pickup work from the global work queue.
2009-08-23 22:52:35 +00:00
Ben Mellor
d353fe6b4b Update the debugging thread synchronization procedures to match the pthread_*
calls that are made when thread debugging is not enabled.

runtime/mercury_thread.c:
runtime/mercury_thread.h:
    Update the functions to which the debug versions of MR_LOCK,
    MR_UNLOCK, MR_SIGNAL, etc, expand. Make them all return int error
    codes, as do the underlying pthread_* functions, make
    MR_cond_signal call pthread_cond_signal instead of
    pthread_cond_broadcast, and create an analogous MR_cond_broadcast
    function.
2009-05-04 01:50:41 +00:00
Peter Wang
f6080ebf93 Prevent multi-threaded programs from terminating as soon as the main thread
Branches: main

Prevent multi-threaded programs from terminating as soon as the main thread
terminates, i.e. the process should not terminate until all threads started by
thread.spawn/3 terminate.

This is done by maintaining a global count of the number of threads started
by thread.spawn.  In low-level C grades the main context will suspend if it
reaches the global_success label and finds there are other contexts still
outstanding.  The last context to terminate then reschedules the main context
to resume.

Similarly, in high-level C grades the main thread waits on a condition
variable, which is signalled by the last thread to terminate.
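The high-level C scheme (a global count plus a condition variable) can be sketched as follows. All names here (`live_threads`, `spawn_worker`, `wait_for_workers`) are hypothetical, and the worker body is a placeholder.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  all_done = PTHREAD_COND_INITIALIZER;
static int             live_threads = 0;

/* Worker body: do the work, then report termination. */
static void *worker(void *arg)
{
    (void) arg;
    /* ... the spawned goal would run here ... */

    pthread_mutex_lock(&count_lock);
    live_threads--;
    if (live_threads == 0) {
        /* The last thread to finish wakes the main thread. */
        pthread_cond_signal(&all_done);
    }
    pthread_mutex_unlock(&count_lock);
    return NULL;
}

/* Start one worker; returns 0 on success or an error number. */
int spawn_worker(void)
{
    pthread_t tid;
    int       err;

    /* Count the thread before it starts so the count cannot be missed. */
    pthread_mutex_lock(&count_lock);
    live_threads++;
    pthread_mutex_unlock(&count_lock);

    err = pthread_create(&tid, NULL, worker, NULL);
    if (err != 0) {
        pthread_mutex_lock(&count_lock);
        live_threads--;
        pthread_mutex_unlock(&count_lock);
        return err;
    }
    pthread_detach(tid);
    return 0;
}

/* Called by the main thread before exiting; returns the final count. */
int wait_for_workers(void)
{
    int remaining;

    pthread_mutex_lock(&count_lock);
    while (live_threads > 0) {
        pthread_cond_wait(&all_done, &count_lock);
    }
    remaining = live_threads;
    pthread_mutex_unlock(&count_lock);
    return remaining;
}
```

The `while` loop around `pthread_cond_wait` matters: it handles both spurious wake-ups and the case where every worker finished before the main thread started waiting.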

library/thread.m:
runtime/mercury_context.c:
runtime/mercury_thread.c:
runtime/mercury_thread.h:
runtime/mercury_wrapper.c:
	As above.

	Add some extra assertions related to this.

tests/par_conj/Mmakefile:
tests/par_conj/thread_barrier.exp:
tests/par_conj/thread_barrier.m:
	Add test case

NEWS:
	Announce the change.
2007-05-01 01:13:58 +00:00
Zoltan Somogyi
7989f17311 Fix some software rot that prevented I/O operations from working in mmos
Estimated hours taken: 6
Branches: main

Fix some software rot that prevented I/O operations from working in mmos
grades. The problem was the change to the I/O module to make it use thread
local storage via a new field of the MR_Context structure which was accessed
via the MR_eng_this_context field of the engine, instead of via the
MR_eng_context field. The new field was not set by the code for initializing
the contexts used by own stack minimal model tabling.

runtime/mercury_context.h:
runtime/mercury_engine.h:
	Add significant new documentation about how fields of the MR_Context
	structure are accessed, both because the documentation is useful and to
	make similar mistakes less likely in future.

	Add a macro for use by own stack minimal model tabling.

runtime/mercury_thread.c:
	Add a comment about a link to mercury_engine.h.

runtime/mercury_thread.h:
	Convert to four-space indentation, and fix some formatting.

runtime/mercury_mm_own_stacks.c:
	Add code for filling in the missing fields of newly created contexts.

runtime/mercury_wrapper.c:
	In own stack minimal model grades, set up the main context properly.
	The previous code was based on a flawed understanding of the
	relationship between MR_eng_context and MR_eng_this_context.

tests/debugger/mmos_print.{m,inp,exp}:
	Add a new test case (which we don't yet pass due to a problem with
	formatting of mdb output) to test the fix. The old versions of the
	compiler don't pass this test case, because the "p *" commands of the
	debugger invoke I/O code in the Mercury standard library, which fails
	with a segfault due to the thread local fields of generators' contexts
	being uninitialized.

	Note that the .inp aborts execution, because without the abort the
	execution would go into an infinite loop since mmos grades don't yet
	have code for detecting completion.

tests/debugger/Mmakefile:
	Enable the new test case in mmos grades.

	Fix inconsistent indentation.

tests/tabling/Mmakefile:
	Do not try to execute minimal tests in mmos grades, since we don't pass
	them yet, and the symptom is in many cases an infinite loop.
2007-04-17 05:38:22 +00:00
Peter Wang
b2f14e1afa Some bug fixes to do with threads.
Branches: main

Some bug fixes to do with threads.

library/io.m:
	ML_maybe_make_err_msg() was not thread-safe but was called from some
	`thread_safe' foreign_procs.  Make ML_maybe_make_err_msg() acquire the
	global lock if the caller does not acquire the global lock itself.

library/thread.m:
runtime/mercury_thread.c:
	Create threads in the detached state so that resources will be
	automatically freed when threads terminate (we don't call
	pthread_join() anywhere).
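
For illustration only, a minimal sketch of creating a thread in the detached state with the standard POSIX calls. The helper names `spawn_detached' and `noop_thread' are hypothetical and are not the runtime's own functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Trivial thread body used only by the usage example. */
void *noop_thread(void *arg)
{
    return arg;
}

/*
** Hypothetical helper: start `func' in a thread created in the detached
** state, so its resources are reclaimed when it terminates without any
** call to pthread_join(). Returns 0 on success, else a pthread error number.
*/
int spawn_detached(void *(*func)(void *), void *arg)
{
    pthread_attr_t  attr;
    pthread_t       thread;
    int             err;

    assert(func != NULL);
    err = pthread_attr_init(&attr);
    if (err != 0) {
        return err;
    }
    err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (err == 0) {
        err = pthread_create(&thread, &attr, func, arg);
    }
    pthread_attr_destroy(&attr);
    return err;
}
```

A caller would write `spawn_detached(noop_thread, NULL)' and must not join the resulting thread afterwards.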

library/thread.semaphore.m:
	Wake up waiting threads in FIFO order, instead of LIFO order.
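
A rough sketch of the FIFO discipline, assuming waiters are kept in a singly linked list with head and tail pointers. The types `Waiter' and `WaitQueue' are stand-ins invented for this example; the library code queues Mercury contexts, not integer ids.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical waiter record. */
typedef struct Waiter {
    int             id;
    struct Waiter   *next;
} Waiter;

typedef struct {
    Waiter  *head;  /* longest-waiting entry; woken first */
    Waiter  *tail;  /* most recent entry; new waiters are appended here */
} WaitQueue;

/* Append a waiter at the tail (a LIFO scheme would push at the head). */
void wait_queue_push(WaitQueue *q, Waiter *w)
{
    assert(q != NULL && w != NULL);
    w->next = NULL;
    if (q->tail == NULL) {
        q->head = w;
    } else {
        q->tail->next = w;
    }
    q->tail = w;
}

/* Remove and return the longest-waiting entry, or NULL if none. */
Waiter *wait_queue_pop(WaitQueue *q)
{
    Waiter *w = q->head;

    if (w != NULL) {
        q->head = w->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        w->next = NULL;
    }
    return w;
}
```

Waiters pushed in the order 1, 2, 3 are popped in the same order, so no waiter can be starved by later arrivals.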

runtime/mercury_context.c:
runtime/mercury_context.h:
runtime/mercury_engine.c:
runtime/mercury_engine.h:
	Change the way we enforce that a Mercury context returning from Mercury
	code back into a C function runs on the original Mercury engine that
	called the C function.

	Previously, if a C function called into Mercury code, the Mercury
	context would be "owned" by that Mercury engine until the C function
	finished.  If the Mercury code suspended (e.g. waiting on a semaphore),
	it could not be resumed by another Mercury engine.  This was
	unnecessarily conservative.

	Now any Mercury engine can resume a suspended context.  Just before
	returning into C functions, we check that the context is actually
	running on the Mercury engine in which the C function was started.  If
	not, *then* we reschedule the context so that it will only be picked up
	by the right Mercury engine.

	Add a comment that none of this is implemented for grades not using gcc
	non-local gotos (nor was it implemented before).

runtime/mercury_memory_zones.c:
	Fix an off-by-one bug and a thread-safety bug in MR_next_offset().
2007-03-03 03:43:35 +00:00
Peter Wang
d0f1ea2529 Index: runtime/mercury_thread.c
===================================================================
RCS file: /home/mercury1/repository/mercury/runtime/mercury_thread.c,v
retrieving revision 1.29
diff -u -r1.29 mercury_thread.c
--- runtime/mercury_thread.c	12 Jan 2007 05:00:31 -0000	1.29
+++ runtime/mercury_thread.c	16 Jan 2007 23:42:33 -0000
@@ -234,7 +234,7 @@
 #ifdef MR_THREAD_SAFE
     pthread_mutex_init(&muts->MR_tlm_lock, MR_MUTEX_ATTR);
 #endif
-    muts->MR_tlm_values = MR_NEW_ARRAY(MR_Word, numslots);
+    muts->MR_tlm_values = MR_GC_NEW_ARRAY(MR_Word, numslots);

     return muts;
 }
2007-01-16 23:45:05 +00:00
Peter Wang
81b8e55825 Add support for thread-local mutables. These can take on a different value for
Estimated hours taken: 15
Branches: main

Add support for thread-local mutables.  These can take on a different value for
each Mercury thread.  Child threads automatically inherit the thread-local
values of the parent thread that spawned them.

compiler/make_hlds_passes.m:
compiler/prog_io.m:
compiler/prog_item.m:
compiler/prog_mutable.m:
	Accept a `thread_local' attribute for mutables and update the
	source-to-source transformation.

doc/reference_manual.texi:
	Document the `thread_local' attribute as a Melbourne Mercury compiler
	extension.

runtime/mercury_context.c:
runtime/mercury_context.h:
	Add a `thread_local_mutables' field to MR_Context, which points to an
	array which holds all the values of thread-local mutables in the
	program.  Each thread-local mutable has an associated index into the
	array, which is allocated during initialisation.  A child thread
	inherits the parent's thread-locals simply by copying the array.

	Add a `thread_local_mutables' field to MR_Spark and update the parallel
	conjunction implementation to take into account thread-locals.
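
The inherit-by-copy idea can be sketched as follows. This is a simplified stand-in, not the runtime's structure: it uses malloc rather than the runtime's allocators, has no lock, and the names `ThreadLocalMuts', `tlm_create' and `tlm_clone' are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uintptr_t Word;     /* stand-in for MR_Word */

/* One slot per thread-local mutable in the program. */
typedef struct {
    Word    *values;
    size_t  num_slots;
} ThreadLocalMuts;

/* Allocate a zero-filled array with the given number of slots. */
ThreadLocalMuts *tlm_create(size_t num_slots)
{
    ThreadLocalMuts *muts = malloc(sizeof *muts);

    if (muts == NULL) {
        return NULL;
    }
    /* Allocate at least one slot so `values' is never NULL. */
    muts->values = calloc(num_slots > 0 ? num_slots : 1, sizeof(Word));
    if (muts->values == NULL) {
        free(muts);
        return NULL;
    }
    muts->num_slots = num_slots;
    return muts;
}

/* A child inherits the parent's values by copying the whole array. */
ThreadLocalMuts *tlm_clone(const ThreadLocalMuts *parent)
{
    ThreadLocalMuts *child;

    assert(parent != NULL);
    child = tlm_create(parent->num_slots);
    if (child != NULL) {
        memcpy(child->values, parent->values,
            parent->num_slots * sizeof(Word));
    }
    return child;
}
```

After the copy, parent and child hold independent arrays, so a later update in one thread is not visible in the other.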

runtime/mercury_thread.c:
runtime/mercury_thread.h:
	Add the functions and macros which are used by the code generated for
	thread-local mutables.

runtime/mercury_wrapper.c:
	Allocate a thread-local mutable array for the initial context at
	startup.

extras/concurrency/spawn.m:
	Update the spawn/3 implementation to make child threads inherit the
	thread-local values of the parent.

	Make different threads in high-level C grades use different
	MR_Contexts.  This makes it possible to use the same implementation of
	thread-local mutables as in the low-level C grades.

tests/hard_coded/mutable_decl.exp:
tests/hard_coded/mutable_decl.m:
tests/hard_coded/pure_mutable.exp:
tests/hard_coded/pure_mutable.m:
tests/invalid/bad_mutable.err_exp:
tests/invalid/bad_mutable.m:
	Add some thread-local mutables to these test cases.

NEWS:
	Announce the addition.
2007-01-12 05:00:32 +00:00
Peter Wang
b73932b567 Fix two bugs in my recent changes to the parallel execution mechanism which
Estimated hours taken: 4
Branches: main

Fix two bugs in my recent changes to the parallel execution mechanism which
showed up on Sparc/Solaris.

runtime/mercury_context.c:
runtime/mercury_context.h:
	There was a problem with `MR_schedule_spark_locally' accessing the
	engine base address.  The register we thought contained the engine
	base address has different contents after a C function call (I guess
	due to register windows).  Since the function is short and should be
	inlined anyway, turn it into a macro.

runtime/mercury_thread.c:
	In `MR_init_thread', delay a call to `MR_save_registers' until after
	a context has been loaded into the engine, otherwise the program
	crashes on startup.
2006-10-03 11:41:46 +00:00
Peter Wang
70a83b2632 A common way to use parallel conjunction can cause a lot of Mercury contexts
Estimated hours taken: 7
Branches: main

A common way to use parallel conjunction can cause a lot of Mercury contexts
to be allocated, e.g.

    map([], []).
    map([H0|T0], [H|T]) :-
	( p(H0, H)	% contains no parallel conjunctions
	& map(T0, T)
	).

When the left parallel conjunct completes, the engine that was executing it
must suspend the context in which it was run, waiting for the right conjunct
to finish.  The engine is then idle and will attempt to find further work to
execute in a _new_ context.  To avoid excessive memory consumption due to
contexts we currently limit the number of contexts we allocate.  However,
that severely limits the parallelism we can exploit in this example (and
similar patterns of work distribution).  There are a lot of contexts
allocated but most of them are simply suspended.

Assuming that most parallel conjuncts contain small sub-computations, we can
allow many contexts to be allocated without excessive memory consumption by
just giving them smaller stacks.  This patch creates a simple variant of a
MR_Context structure which has smaller stacks than the initial MR_Context
structure and executes parallel conjuncts in the smaller contexts if
larger contexts are unavailable.


runtime/mercury_memory.c:
runtime/mercury_wrapper.c:
runtime/mercury_wrapper.h:
doc/user_guide.texi:
	Add globals to hold the desired sizes of small det and nondet stacks.

	Add `--small-detstack-size' and `--small-nondetstack-size'
	options for the MERCURY_OPTIONS environment variable to set the
	desired sizes.

runtime/mercury_context.h:
	Add a MR_ctxt_size field to MR_Context to indicate whether it has
	regular or small sized stacks.

runtime/mercury_context.c:
	Add an argument to MR_create_context() specifying whether we want a
	regular or small context.

	Ask for small stacks when creating new contexts to begin execution
	from a spark (i.e. parallel conjuncts).

	Create a new free-list to hold unused small contexts.
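
A single-threaded sketch of keeping one free list per context size. The names and the stack sizes are illustrative assumptions; the sketch does not allocate real stacks and omits the locking the runtime would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum { CONTEXT_REGULAR, CONTEXT_SMALL } ContextSize;

typedef struct Context {
    ContextSize     size;
    size_t          detstack_bytes;
    struct Context  *next_free;
} Context;

/* Illustrative sizes only; the real sizes come from runtime options. */
#define REGULAR_DETSTACK_BYTES  (4096u * 1024u)
#define SMALL_DETSTACK_BYTES    (512u * 1024u)

/* One free list per ContextSize, indexed by the enum value. */
static Context *free_contexts[2];

/* Reuse a free context of the requested size, or allocate a new one. */
Context *create_context(ContextSize size)
{
    Context *c = free_contexts[size];

    if (c != NULL) {
        free_contexts[size] = c->next_free;
    } else {
        c = malloc(sizeof *c);
        if (c == NULL) {
            return NULL;
        }
        c->size = size;
        c->detstack_bytes = (size == CONTEXT_SMALL)
            ? SMALL_DETSTACK_BYTES : REGULAR_DETSTACK_BYTES;
    }
    c->next_free = NULL;
    return c;
}

/* Return a context to the free list that matches its size. */
void release_context(Context *c)
{
    assert(c != NULL);
    c->next_free = free_contexts[c->size];
    free_contexts[c->size] = c;
}
```

Keeping the lists separate means a request for a regular context is never satisfied with a small-stack one.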

extras/concurrency/spawn.m:
runtime/mercury_mm_own_stacks.c:
runtime/mercury_thread.c:
	Match the interface change to MR_create_context().  We give the
	initial context and contexts created for explicit Mercury threads
	regular-sized stacks.
2006-10-02 10:14:40 +00:00
Peter Wang
712027f307 This patch changes the parallel execution mechanism in the low level backend.
Estimated hours taken: 100
Branches: main

This patch changes the parallel execution mechanism in the low level backend.
The main idea is that, even in programs with only moderate parallelism, we
won't have enough processors to exploit it all.  We should try to reduce the
cost in the common case, i.e. when a parallel conjunction gets executed
sequentially.  This patch does two things along those lines:

(1) Instead of unconditionally executing all parallel conjuncts (but the last)
in separate Mercury contexts, we allow a context to continue execution of the
next conjunct of a parallel conjunction if it has just finished executing the
previous conjunct.  This saves on allocating unnecessary contexts, which can
be a big reduction in memory usage.

We also try to execute conjuncts left-to-right so as to minimise the
need to suspend contexts when there are dependencies between conjuncts.

(2) Conjuncts that *are* executed in parallel still need separate contexts.
We used to pass variable bindings to those conjuncts by flushing input
variable values to stack slots and copying the procedure's stack frame to the
new context.  When the conjunct finished, we would copy new variable bindings
back to stack slots in the original context.

What happens now is that we don't do any copying back and forth.  We introduce
a new abstract machine register `parent_sp' which points to the location of
the stack pointer at the time that a parallel conjunction began.  In parallel
conjuncts we refer to all stack slots via the `parent_sp' pointer, since we
could be running on a different context altogether and `sp' would be pointing
into a new detstack.  Since parallel conjuncts now share the procedure's stack
frame, we have to allocate stack slots such that all parallel conjuncts in a
procedure that could be executing simultaneously have distinct sets of stack
slots.  We currently use the simplest possible strategy, i.e. don't allow
variables in parallel conjuncts to reuse stack slots.

Note: in effect parent_sp is a frame pointer which is only set for and used by
the code of parallel conjuncts.  We don't call it a frame pointer as it can be
confused with "frame variables" which have to do with the nondet stack.
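
To make the role of `parent_sp' concrete, here is a toy model. The macros `SV' and `PARENT_SV' and the slot-addressing convention are assumptions made for this example and are not the runtime's definitions. A conjunct running with `sp' pointing into a different stack still writes its result into the original frame.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t Word;     /* stand-in for MR_Word */

Word *sp;           /* stack pointer of whichever stack is current */
Word *parent_sp;    /* sp as it was when the parallel conjunction began */

/* Hypothetical slot addressing: slot n sits n words below the pointer. */
#define SV(n)           (sp[-(n)])
#define PARENT_SV(n)    (parent_sp[-(n)])

/*
** Simulate a conjunct that runs on another context's stack. It binds its
** output through PARENT_SV, so nothing has to be copied back afterwards.
*/
void run_conjunct_on_other_stack(Word *other_stack_top, Word result)
{
    Word *saved_sp = sp;

    assert(other_stack_top != 0);
    sp = other_stack_top;       /* now running on the new stack */
    PARENT_SV(1) = result;      /* still addresses the original frame */
    sp = saved_sp;
}
```

Had the conjunct used `SV(1)' instead, the value would have landed in the new stack and been lost to the parent.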


compiler/code_info.m:
	Add functionality to keep track of how deep inside of nested parallel
	conjunctions the code generator is.

	Add functionality to acquire and release "persistent" temporary stack
	slots.  Unlike normal temporary stack slots, these don't get implicitly
	released when the code generator's location-dependent state is reset.

	Conform to additions of `parent_sp' and parent stack variables.

compiler/exprn_aux.m:
	Generalise the `substitute_lval_in_*' predicates by
	`transform_lval_in_*' predicates.  Instead of performing a fixed
	substitution, these take a higher order predicate which performs some
	operation on each lval.  Redefine the substitution predicates in terms
	of the transformation predicates.

	Conform to changes in `fork', `join_and_terminate' and
	`join_and_continue' instructions.

	Conform to additions of `parent_sp' and parent stack variables.

	Remove `substitute_rval_in_args' and `substitute_rval_in_arg' which
	were unused.

compiler/live_vars.m:
	Introduce a new type `parallel_stackvars' which is threaded through
	`build_live_sets_in_goal'.  We accumulate the sets of variables which
	are assigned stack slots in each parallel conjunct.  At the end of
	processing a parallel conjunction, use this information to force
	variables which are assigned stack slots to have distinct slots.

compiler/llds.m:
	Change the semantics of the `fork' instruction.  It now takes a single
	argument: the label of the next conjunct after the current one.  The
	instruction now "sparks" the next conjunct to be run, either in a
	different context (possibly in parallel, on another Mercury engine) or
	is queued to be executed in the current context after the current
	conjunct is finished.

	Change the semantics of the `join_and_continue' instruction.  This
	instruction now serves to end all parallel conjuncts, not just the
	last one in a parallel conjunction.

	Remove the `join_and_terminate' instruction (no longer used).

	Add the new abstract machine register `parent_sp'.

	Introduce "parent stack slots", which are similar to normal stack
	slots but relative to the `parent_sp' register.

compiler/par_conj_gen.m:
	Change the code generated for parallel conjunctions.  That is:

	- use the new `fork' instruction at the beginning of a parallel
	  conjunct;

	- use the `join_and_continue' instruction at the end of all parallel
	  conjuncts;

	- keep track of how deep the code generator currently is in parallel
	  conjunctions;

	- set and restore the `parent_sp' register when entering a non-nested
	  parallel conjunction;

	- after generating the code of a parallel conjunct, replace all
	  references to stack slots by parent stack slots;

	- remove code to copy back output variables when a parallel conjunct
	  finishes.

	Update some comments.

runtime/mercury_context.c:
runtime/mercury_context.h:
	Add the type `MR_Spark'.  Sparks are allocated on the heap and contain
	enough information to begin execution of a single parallel conjunct.

	Add globals `MR_spark_queue_head' and `MR_spark_queue_tail'.  These
	are pointers to the start and end of a global queue of sparks.  Idle
	engines can pick up work from this queue in the same way that they can
	pick up work from the global context queue (the "run queue").

	Add new fields to the MR_Context structure.  `MR_ctxt_parent_sp' is a
	saved copy of the `parent_sp' register for when the context is
	suspended.  `MR_ctxt_spark_stack' is a stack of sparks that we decided
	not to put on the global spark queue.

	Update `MR_load_context' and `MR_save_context' to save and restore
	`MR_ctxt_parent_sp'.

	Add the counters `MR_num_idle_engines' and
	`MR_num_outstanding_contexts_and_sparks'.  These are used to decide,
	when a `fork' instruction is reached, whether a spark should be put on
	the global spark queue (with potential for parallelism but also more
	overhead) or on the calling context's spark stack (no parallelism and
	less overhead).
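
One plausible shape for that decision, as a sketch. The exact condition is an assumption, and `SchedState' and `choose_spark_placement' are names invented here.

```c
#include <assert.h>
#include <stddef.h>

/* Snapshot of the counters consulted at a fork. */
typedef struct {
    unsigned    num_idle_engines;
    unsigned    num_outstanding;    /* contexts and sparks awaiting work */
    unsigned    max_outstanding;    /* soft limit */
} SchedState;

typedef enum { SPARK_GLOBAL, SPARK_LOCAL } SparkPlacement;

/*
** Assumed policy: use the global queue only when some engine is idle and
** the soft limit has not been reached; otherwise keep the spark on the
** calling context's own stack, where it is cheaper but cannot run in
** parallel.
*/
SparkPlacement choose_spark_placement(const SchedState *s)
{
    assert(s != NULL);
    if (s->num_idle_engines > 0 && s->num_outstanding < s->max_outstanding) {
        return SPARK_GLOBAL;
    }
    return SPARK_LOCAL;
}
```

With no idle engines the spark always stays local, which matches the goal of keeping the sequential case cheap.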

	Rename `MR_init_context' to `MR_init_context_maybe_generator'.  When
	initialising contexts, don't reset redzones of already allocated
	stacks.  It seems to be unnecessary (and the reset implementation is
	buggy anyway, though it's fine on Linux).

	Rename `MR_schedule' to `MR_schedule_context'.  Add new functions
	`MR_schedule_spark_globally' and `MR_schedule_spark_locally'.

	In `MR_do_runnext', add code for idle engines to get work from the
	global spark queue.  Resuming contexts are prioritised over sparks.

	Rename `MR_fork_new_context' to `MR_fork_new_child'.  Change the
	definitions of `MR_fork_new_child' and `MR_join_and_continue' as per
	the new behaviour of the `fork' and `join_and_continue' instructions.
	Delete `MR_join_and_terminate'.

	Add a new field `MR_st_orig_context' to the MR_SyncTerm structure to
	record which context originated the parallel conjunction instance
	represented by a MR_SyncTerm instance, and update `MR_init_sync_term'.
	This is needed by the new behaviour of `MR_join_and_continue'.

	Update some comments.

runtime/mercury_engine.h:
runtime/mercury_regs.c:
runtime/mercury_regs.h:
runtime/mercury_stacks.h:
	Add the abstract machine register `parent_sp' and code to copy it to
	and from the fake_reg array.

	Add a macro `MR_parent_sv' to access stack slots via `parent_sp'.

	Add `MR_eng_parent_sp' to the MercuryEngine structure.

runtime/mercury_wrapper.c:
runtime/mercury_wrapper.h:
	Add Mercury runtime option `--max-contexts-per-thread' which is saved
	in the global variable `MR_max_contexts_per_thread'.  The number
	`MR_max_outstanding_contexts' is derived from this.  It sets a soft
	limit on the number of sparks we put in the global spark queue,
	relative to the number of threads we are running.  We don't want to
	put too many sparks on the global queue if there are plenty of ready
	contexts or sparks already on the global queues, as they are likely to
	result in new contexts being allocated.

	When initially creating worker engines, wait until all the worker
	engines have acknowledged that they are idle before continuing.  This
	is mainly so programs (especially benchmarks and test cases) with only
	a few fork instructions near the beginning of the program don't
	execute the forks before any worker engines are ready, resulting in no
	parallelism.

runtime/mercury_engine.c:
runtime/mercury_thread.c:
	Don't allocate a context at the time a Mercury engine is created.  An
	engine only needs a new context when it is about to pick up a spark.

configure.in:
compiler/options.m:
scripts/Mercury.config.in:
	Update to reflect the extra field in MR_SyncTerm.

	Add the option `--sync-term-size' and actually make use of the result
	of the sync term size calculated during configuration.

compiler/code_util.m:
compiler/continuation_info.m:
compiler/dupelim.m:
compiler/dupproc.m:
compiler/global_data.m:
compiler/hlds_llds.m:
compiler/jumpopt.m:
compiler/livemap.m:
compiler/llds_out.m:
compiler/middle_rec.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/reassign.m:
compiler/stack_layout.m:
compiler/use_local_vars.m:
compiler/var_locn.m:
	Conform to changes in `fork', `join_and_terminate' and
	`join_and_continue' instructions.

	Conform to additions of `parent_sp' and parent stack variables.

	XXX not sure about the changes in stack_layout.m

library/par_builtin.m:
	Conform to changes in the runtime system.
2006-09-26 03:53:23 +00:00
Peter Wang
8bd4af47cf Add coroutining support for dependent parallel conjunctions in lowlevel
Estimated hours taken: 20
Branches: main

Add coroutining support for dependent parallel conjunctions in lowlevel
parallel grades.

library/par_builtin.m:
	Change definitions of synchronisation primitives so that waiting on a
	future causes the current context to be suspended.  Signalling a
	future causes all the contexts waiting on the future to be scheduled.

runtime/mercury_context.c:
runtime/mercury_thread.c:
runtime/mercury_thread.h:
runtime/mercury_wrapper.c:
	Add a global `MR_primordial_thread' to hold the thread id of the
	primordial thread.

	Add sanity checks, in particular that the primordial thread does not
	exit like other threads as it needs to clean up the Mercury runtime.

tests/par_conj/Mmakefile:
	Actually run dependent parallel conjunction tests since they should
	no longer deadlock.

tests/par_conj/*.exp:
	Add expected outputs for test cases which didn't have them.
2006-07-05 03:00:48 +00:00
Peter Wang
61a6cc518e When possible, make use of the gcc `__thread' extension for introducing
Estimated hours taken: 5
Branches: main

When possible, make use of the gcc `__thread' extension for introducing
thread-local variables.  Currently we use the POSIX thread facility for storing
the address of the Mercury engine that is running on the current thread, i.e.
pthread_getspecific/pthread_setspecific.  If gcc global registers are not
used, then each access of a Mercury register incurs a call to
`pthread_getspecific'.  Using the `__thread' extension is very much faster.

(If gcc global registers are used then the address of the Mercury engine is
kept in a hardware register.)


configure.in:
	Check if compiler has the `__thread' extension.

runtime/mercury_conf.h.in:
	#define MR_THREAD_LOCAL_STORAGE if `__thread' can be used.

runtime/mercury_context.c:
runtime/mercury_engine.h:
runtime/mercury_thread.c:
	Make `MR_thread_engine_base' a thread-local variable if possible,
	instead of using pthread_getspecific().

	#define MR_set_thread_engine_base() to hide the differences between
	when `__thread' is used or not.
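
A sketch of hiding the two storage strategies behind one pair of accessors. The accessor names and the `Engine' type are stand-ins. The macro is defined by hand here on the assumption that configure found `__thread'; remove that line to see the POSIX fallback.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: configure detected the `__thread' extension. */
#define MR_THREAD_LOCAL_STORAGE 1

typedef struct { int id; } Engine;  /* stand-in for MercuryEngine */

#ifdef MR_THREAD_LOCAL_STORAGE

/* Fast path: the compiler gives each thread its own copy. */
static __thread Engine *thread_engine_base;

Engine *get_thread_engine_base(void)
{
    return thread_engine_base;
}

void set_thread_engine_base(Engine *eng)
{
    thread_engine_base = eng;
}

#else

#include <pthread.h>

/* Fallback: a POSIX thread-specific key, created once on first use. */
static pthread_key_t    engine_base_key;
static pthread_once_t   engine_base_once = PTHREAD_ONCE_INIT;

static void make_engine_base_key(void)
{
    int err = pthread_key_create(&engine_base_key, NULL);
    assert(err == 0);
    (void) err;
}

Engine *get_thread_engine_base(void)
{
    pthread_once(&engine_base_once, make_engine_base_key);
    return (Engine *) pthread_getspecific(engine_base_key);
}

void set_thread_engine_base(Engine *eng)
{
    pthread_once(&engine_base_once, make_engine_base_key);
    pthread_setspecific(engine_base_key, eng);
}

#endif
```

Callers use only the two accessors, so they compile unchanged whichever branch is selected.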
2006-07-04 04:46:38 +00:00
Julien Fischer
367c774f13 Fix more bitrot in the runtime. This mainly affects grades
Estimated hours taken: 0.5
Branches: main

Fix more bitrot in the runtime.  This mainly affects grades
we don't use that much, e.g. lowlevel .par and .agc grades.

runtime/mercury_accurate_gc.c:
runtime/mercury_agc_debug.c:
	Conform to the new field names in the zone structure.

	Avoid warnings about casts in lvalues being a deprecated
	feature.

runtime/mercury_thread.c:
	As above and also fix a problem with a format string not
	matching the arguments in a call to fprintf.
2005-09-16 16:43:55 +00:00
Zoltan Somogyi
0023d13f18 Fix some layout issues in these files. There are no algorithmic
Estimated hours taken: 0.2
Branches: main

runtime/mercury_calls.h:
runtime/mercury_prof.h:
runtime/mercury_signal.h:
runtime/mercury_string.h:
runtime/mercury_thread.c:
runtime/mercury_thread.h:
	Fix some layout issues in these files. There are no algorithmic
	changes.
2005-06-20 02:16:44 +00:00
Peter Ross
9e07789cc1 Allow one to turn thread debugging on at runtime.
Estimated hours taken: 1
Branches: main, release

Allow one to turn thread debugging on at runtime.
However only modules which are compiled with MR_DEBUG_THREADS will
have debugging messages output.

runtime/mercury_thread.c:
	Add MR_debug_threads global variable which is used to control
	whether debugging messages are output.
	Always include the debug version of lock, unlock, signal and
	wait in the runtime library.

runtime/mercury_thread.h:
	When MR_DEBUG_THREADS is defined, conditionally choose using
	the MR_debug_threads global variable between the debug and the
	pthread library versions of lock, unlock, signal and wait.
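
The compile-time plus run-time selection might look like the sketch below. `DEBUG_THREADS', `debug_threads', `LOCK' and `UNLOCK' are stand-in names for this example; the compile-time macro is defined by hand here although it would normally come from the build.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Stand-in for the compile-time switch. */
#define DEBUG_THREADS 1

/* Stand-in for the run-time switch set from the environment. */
int debug_threads = 0;

/* Debug versions: report the operation, then do the real one. */
int debug_mutex_lock(pthread_mutex_t *lock, const char *from)
{
    fprintf(stderr, "locking %p (%s)\n", (void *) lock, from);
    return pthread_mutex_lock(lock);
}

int debug_mutex_unlock(pthread_mutex_t *lock, const char *from)
{
    fprintf(stderr, "unlocking %p (%s)\n", (void *) lock, from);
    return pthread_mutex_unlock(lock);
}

#ifdef DEBUG_THREADS
  /* Debug-capable module: choose at run time. */
  #define LOCK(lck, from)                                           \
      (debug_threads ? debug_mutex_lock((lck), (from))              \
                     : pthread_mutex_lock((lck)))
  #define UNLOCK(lck, from)                                         \
      (debug_threads ? debug_mutex_unlock((lck), (from))            \
                     : pthread_mutex_unlock((lck)))
#else
  /* Other modules: always the plain pthread calls. */
  #define LOCK(lck, from)     pthread_mutex_lock((lck))
  #define UNLOCK(lck, from)   pthread_mutex_unlock((lck))
#endif
```

Modules built without the compile-time switch pay no run-time test at all, which is why only debug-capable modules print messages.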

runtime/mercury_wrapper.c:
	Parse the --debug-threads option in the MERCURY_OPTIONS
	environment variable.

doc/user_guide.texi:
	Document --debug-threads.
2003-03-03 14:58:34 +00:00
Peter Ross
fda33190c6 Get exception handling working in the parallel grades.
Estimated hours taken: 4
Branches: main, release

Get exception handling working in the parallel grades.

library/exception.m:
	Define the macros ML_GET_EXCEPTION_HANDLER and
	ML_SET_EXCEPTION_HANDLER which either save the exception
	handler into thread local storage for the parallel grades or
	save it into a global variable.

runtime/mercury_context.c:
	Initialise the thread local storage for holding the exception
	handler.

runtime/mercury_thread.c:
runtime/mercury_thread.h:
	Define the key used to access the thread local storage for the
	exception handler.
2003-03-02 11:12:06 +00:00
Simon Taylor
b7c4a317e9 Add MR_ prefixes to the remaining non-prefixed symbols.
Estimated hours taken: 4
Branches: main

Add MR_ prefixes to the remaining non-prefixed symbols.

This change will require all workspaces to be updated.
The compiler will start generating references to MR_TRUE,
MR_bool, etc., which are not defined in the old runtime
header files.

runtime/mercury_std.h:
	Add MR_ prefixes to bool, TRUE, FALSE, max, min,
	streq, strdiff, strtest, strntest, strneq, strndiff,
	strntest, NO_RETURN.

	Delete a commented out definition of `reg'.

runtime/mercury_tags.h:
	Add an MR_ prefix to TAGBITS.

configure.in:
runtime/mercury_goto.h:
runtime/machdeps/i386_regs.h/mercury_goto.h:
	Add an MR_ prefix to PIC.

runtime/mercury_conf_param.h:
	Allow non-prefixed PIC and HIGHTAGS to be defined on
	the command line.

runtime/mercury_bootstrap.h:
	Add backwards compatibility definitions.

RESERVED_MACRO_NAMES:
	Remove the renamed macros.

compiler/export.m:
compiler/ml_code_gen.m:
	Use MR_bool rather than MR_Bool (MR_Bool is
	meant to be for references to the Mercury type
	bool__bool).

runtime/mercury_types.h:
	Add a comment that MR_Bool is for references to
	bool__bool.

*/*.c:
*/*.h:
*/*.m:
	Add MR_ prefixes.
2002-02-18 07:01:33 +00:00
Fergus Henderson
bb52e7bc8d Add support for `--gc none' to the MLDS->C back-end,
Estimated hours taken: 8
Branches: main

Add support for `--gc none' to the MLDS->C back-end,
i.e. support the `hlc' and `hl' grades.

runtime/mercury_float.h:
	Extract some of the code from MR_float_to_word() out into
	a new macro MR_make_hp_float_aligned(), for use in
	MR_box_float().

runtime/mercury.h:
	If CONSERVATIVE_GC is not defined, include "mercury_regs.h" and
	"mercury_engine.h", so that we get the definition of MR_hp,
	and "mercury_overflow.h", for MR_heap_overflow_check().
	Define MR_new_object() and MR_box_float() correctly for
	the !CONSERVATIVE_GC case.

runtime/mercury_context.h:
runtime/mercury_context.c:
runtime/mercury_engine.c:
runtime/mercury_debug.c:
runtime/mercury_thread.c:
runtime/mercury_stack_trace.c:
trace/mercury_trace_util.c:
	Add `#ifndef MR_HIGHLEVEL_CODE ... #endif' wrappers around
	sections of code that are specific to the LLDS back-end.

runtime/mercury_wrapper.c:
library/benchmarking.m:
	Initialize (in mercury_wrapper.c) and use (in benchmarking.m)
	the MercuryEngine struct in the !CONSERVATIVE_GC case, as well
	as in the !MR_HIGHLEVEL_CODE case.  The MercuryEngine struct
	is needed because that is where the heap pointer and heap zone
	are stored.

library/table_builtin.m:
	Use the correct names for type_ctor_infos when MR_HIGHLEVEL_CODE
	is enabled.  (Previously this was not an issue because these
	type_ctor_infos were only being used in the !CONSERVATIVE_GC case.)

tests/hard_coded/Mmakefile:
	For the test cases which use lots of memory, increase the heap
	size (using the MERCURY_OPTIONS environment variable) rather
	than compiling them with `--gc conservative'.  This avoids
	spurious test case failures when running the tests via
	`tools/bootcheck --grade hlc --no-bootcheck'.
2001-11-22 11:37:20 +00:00
Zoltan Somogyi
04e614485d Implement deep profiling; merge the changes on the deep2 branch back
Estimated hours taken: 500
Branches: main

Implement deep profiling; merge the changes on the deep2 branch back
onto the trunk.

The main documentation on the general architecture of the deep profiler
is the deep profiling paper.

doc/user_guide.texi:
	Document how to use the deep profiler.

deep_profiler:
deep_profiler/Mmakefile:
	A new directory holding the deep profiler and its mmakefile.

Mmakefile:
	Add targets for the new directory.

	Add support for removing inappropriate files from directories.

deep_profiler/interface.m:
	The deep profiler consists of two programs: mdprof_cgi.m, which acts
	as a CGI "script", and mdprof_server.m, which implements the server
	process that the CGI script talks to. Interface.m defines the
	interface between them.

script/mdprof.in:
	A shell script template. ../configure uses it to generate mdprof,
	which is a wrapper around mdprof_cgi that tells it how to find
	mdprof_server.

deep_profiler/mdprof_cgi.m:
	The CGI "script" program.

deep_profiler/mdprof_server.m:
	The top level predicates of the server.

deep_profiler/profile.m:
	The main data structures of the server and their operations.

deep_profiler/read_profile.m:
	Code for reading in profiling data files.

deep_profiler/startup.m:
	Code for post-processing the information in profiling data files,
	propagating costs from procedures to their ancestors and performing
	various kinds of summaries.

deep_profiler/server.m:
	Code for responding to requests from the CGI script.

deep_profiler/cliques.m:
	Code to find cliques in graphs.

deep_profiler/array_util.m:
deep_profiler/util.m:
	Utility predicates.

deep_profiler/dense_bitset.m:
	An implementation of (part of) the set ADT with dense bit vectors.

deep_profiler/measurements.m:
	Operations on profiling measurements.

deep_profiler/timeout.m:
	An implementation of a timeout facility.

deep_profiler/conf.m:
	Functions that depend on autoconfigured settings.

configure.in:
	Find out what command to use to find the name of the local host.

	Install deep profiling versions of the standard library along with the
	other profiling versions.

runtime/mercury_conf.h.in:
	Add some macros for deep_profiler/conf.m to use.

library/profiling_builtin.m:
runtime/mercury_deep_call_port_body.h:
runtime/mercury_deep_leave_port_body.h:
runtime/mercury_deep_redo_port_body.h:
	A new library module that implements deep profiling primitives.
	Some of these primitives have many versions, whose common code is
	factored out into three new include files in the runtime.

compiler/deep_profiling.m:
	New module to perform the program transformations described in the
	paper.

compiler/notes/compiler_design.html:
	Document the new compiler module.

compiler/mercury_compiler.m:
	Invoke the new module in deep profiling grades. Allow global static
	data to be generated by deep_profiling.m.

compiler/options.m:
	Add options to turn on deep profiling and (for benchmarking purposes)
	control its implementation.

	Add an option to disable tailcall optimization in the LLDS backend,
	to help benchmarking deep profiling.

compiler/jumpopt.m:
compiler/optimize.m:
	Obey the option to disable tailcalls.

compiler/handle_options.m:
	Handle the implications of deep profiling.

compiler/modules.m:
	In deep profiling grades, automatically import profiling_builtin.m.

compiler/prog_util.m:
doc/Makefile:
library/library.m:
	Handle the new builtin module.

compiler/export.m:
	In deep profiling grades, wrap deep profiling code around exported
	procedures to handle the "unscheduled call" aspects of callbacks to
	Mercury from the foreign language.

compiler/higher_order.m:
profiler/demangle.m:
util/demangle.c:
	When creating a name for a higher-order-specialized predicate, include
	the mode number in the name.

compiler/add_trail_ops.m:
compiler/type_util.m:
	Move c_pointer_type from add_trail_ops to type_util, so it can also be
	used by deep_profiling.m.

compiler/hlds_goal.m:
	Add a new goal feature that marks a tail call, for use by
	deep_profiling.m.

compiler/hlds_pred.m:
	Add a new field to proc_info structures for use by deep_profiling.m.

	Add a mechanism for getting proc_ids for procedure clones.

	Remove next_proc_id, an obsolete and unused predicate.

compiler/hlds_data.m:
	Add a new cons_id to refer to the proc_static structure of a procedure.

compiler/bytecode_gen.m:
compiler/code_util.m:
compiler/dependency_graph.m:
compiler/hlds_out.m:
compiler/mercury_to_mercury.m:
compiler/ml_unify_gen.m:
compiler/opt_debug.m:
compiler/prog_rep.m:
compiler/rl_exprn.m:
compiler/switch_util.m:
compiler/unify_gen.m:
	Trivial changes to handle the new cons_id, goal feature and/or
	proc_info argument.

compiler/rtti.m:
	Add a utility predicate for extracting pred_id and proc_id from an
	rtti_proc_label, for use by hlds_out.m

compiler/layout.m:
compiler/layout_out.m:
compiler/llds.m:
compiler/llds_common.m:
	Add support for proc_static and call_site_static structures.

compiler/layout_out.m:
compiler/llds_out.m:
	Add code for the output of proc_static structures.

compiler/code_util.m:
	Make code_util__make_proc_label_from_rtti a function, and export it.

util/mkinit.c:
compiler/llds_out.m:
compiler/layout.m:
compiler/modules.m:
	Add support for a fourth per-module C function, for writing out
	proc_static structures (and the call_site_static structures they
	contain).

	Since proc_static structures can be referred to from LLDS code (and not
	just from other static structures and compiler-generated C code),
	reorganize the declarations of static structures slightly.

	Change the schema for the name of the first per-module C function
	slightly, to make the addition of the fourth function easier.
	The scheme now is:

		mercury__<modulename>__init
		mercury__<modulename>__init_type_tables
		mercury__<modulename>__init_debugger
		mercury__<modulename>__write_out_proc_statics
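
As an illustration of the scheme, a small helper that builds the four names. The helper `format_init_function_name' is hypothetical, and it assumes the module name is already in a form that is valid inside a C identifier; any name mangling is out of scope.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Suffixes of the four per-module functions, in the order listed above. */
static const char *const init_suffixes[] = {
    "init",
    "init_type_tables",
    "init_debugger",
    "write_out_proc_statics"
};

#define NUM_INIT_FUNCTIONS \
    (sizeof init_suffixes / sizeof init_suffixes[0])

/*
** Write "mercury__<module>__<suffix>" into buf.
** Returns 0 on success, -1 on a bad index or if buf is too small.
*/
int format_init_function_name(char *buf, size_t buflen, const char *module,
    size_t which)
{
    int n;

    assert(buf != NULL && module != NULL);
    if (which >= NUM_INIT_FUNCTIONS) {
        return -1;
    }
    n = snprintf(buf, buflen, "mercury__%s__%s", module,
        init_suffixes[which]);
    if (n < 0 || (size_t) n >= buflen) {
        return -1;
    }
    return 0;
}
```

For a module called `foo' and index 3, the result is `mercury__foo__write_out_proc_statics'.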

	Improve formatting of the generated C code.

library/*.m:
runtime/mercury.c:
runtime/mercury_context.c:
runtime/mercury_engine.c:
runtime/mercury_ho_call.c:
runtime/mercury_tabling.c:
runtime/mercury_trace_base.c:
runtime/mercury_wrapper.c:
trace/mercury_trace.[ch]:
trace/mercury_trace_declarative.c:
trace/mercury_trace_external.c:
trace/mercury_trace_internal.c:
	Conform to the new scheme for initialization functions for hand-written
	modules.

compiler/mercury_compile.m:
library/benchmarking.m:
runtime/mercury_conf_param.h:
runtime/mercury.h:
runtime/mercury_engine.c:
runtime/mercury_goto.c:
runtime/mercury_grade.h:
runtime/mercury_ho_call.c:
runtime/mercury_label.[ch]:
runtime/mercury_prof.[ch]:
	Add an MR_MPROF_ prefix in front of the C macros used to control the
	old profiler.

compiler/handle_options.m:
runtime/mercury_grade.h:
scripts/canonical_grade.sh-subr:
scripts/init_grade_options.sh-subr:
scripts/parse_grade_options.sh-subr:
	Make deep profiling completely separate from the old profiling system,
	by making the deep profiling grade independent of MR_MPROF_PROFILE_TIME
	and the compiler option --profile-time.

library/array.m:
library/builtin.m:
library/std_util.m:
runtime/mercury_hand_unify_body.h:
runtime/mercury_hand_compare_body.h:
	In deep profiling grades, wrap the deep profiling call, exit, fail
	and redo codes around the bodies of hand-written unification
	and comparison procedures.

	Make the reporting of array bounds violations switchable between
	making them fatal errors, as we currently do, and reporting them by
	throwing an exception. Throwing an exception makes debugging code
	using arrays easier, but since exceptions aren't (yet) propagated
	across engine boundaries, we keep the old behaviour as the default;
	the new behaviour is for implementors.

runtime/mercury_deep_profiling_hand.h:
	New file that defines macros for use in Mercury predicates whose
	definition is in hand-written C code.

library/exception.m:
runtime/mercury_exception_catch_body.h:
runtime/mercury_stacks.h:
	In deep profiling grades, wrap the deep profiling call, exit, fail
	and redo codes around the bodies of the various modes of builtin_catch.

	Provide a function that C code can use to throw exceptions.

library/benchmarking.m:
library/exception.m:
library/gc.m:
library/std_util.m:
runtime/mercury_context.[ch]:
runtime/mercury_engine.[ch]:
runtime/mercury_debug.c:
runtime/mercury_deep_copy.c:
runtime/mercury_overflow.h:
runtime/mercury_regs.h:
runtime/mercury_stacks.h:
runtime/mercury_thread.c:
runtime/mercury_wrapper.c:
	Add prefixes to the names of the fields in the engine and context
	structures, to make code using them easier to understand and modify.

runtime/mercury_deep_profiling.[ch]:
	New module containing support functions for deep profiling and
	functions for writing out a deep profiling data file at the end of
	execution.

runtime/mercury_debug.[ch]:
	Add support for debugging deep profiling.

	Add support for watching the value at a given address.

	Make the buffered/unbuffered nature of debugging output controllable
	via the -du option.

	Print register contents only if -dr is specified.

runtime/mercury_goto.h:
runtime/mercury_std.h:
	Use the macros in mercury_std.h instead of defining local variants.

runtime/mercury_goto.h:
runtime/mercury_stack_layout.h:
runtime/mercury_stack_trace.c:
runtime/mercury_tabling.c:
trace/mercury_trace.c:
trace/mercury_trace_declarative.c:
trace/mercury_trace_external.c:
trace/mercury_trace_vars.c:
	Standardize some of the macro names with those used in the debugger
	paper.

runtime/mercury_heap.h:
	Add support for memory profiling with the deep profiler.

runtime/mercury_prof.[ch]:
runtime/mercury_prof_time.[ch]:
	Move the functionality that both the old profiler and the deep profiler
	need into the new module mercury_prof_time. Leave mercury_prof
	containing stuff that is only relevant to the old profiler.

runtime/mercury_prof.[ch]:
runtime/mercury_strerror.[ch]:
	Move the definition of strerror from mercury_prof to its own file.

runtime/mercury_wrapper.[ch]:
	Add support for deep profiling.

	Add support for controlling whether debugging output is buffered or
	not.

	Add support for watching the value at a given address.

runtime/Mmakefile:
	Mention all the added files.

scripts/mgnuc.in:
	Add an option for turning on deep profiling.

	Add options for controlling the details of deep profiling. These
	are not documented because they are intended only for benchmarking
	the deep profiler itself, for the paper; they are not for general use.

tools/bootcheck:
	Compile the deep_profiler directory as well as the other directories
	containing Mercury code.

	Turn off the creation of deep profiling data files during bootcheck,
	since all but one of these in each directory will be overwritten
	anyway.

	Add support for turning on --keep-objs by default in a workspace.

tools/speedtest:
	Preserve any deep profiling data files created by the tests.

trace/mercury_trace.c:
	Trap attempts to perform retries in deep profiling grades, since they
	would lead to core dumps otherwise.

util/Mmakefile:
	Avoid compile-time warnings when compiling getopt.

tests/*/Mmakefile:
tests/*/*/Mmakefile:
	In deep profiling grades, switch off the tests that test features
	that don't work with deep profiling, either by design or because
	the combination hasn't been implemented yet.
2001-05-31 06:00:27 +00:00
Fergus Henderson
2f737704b9 Add some Mmake rules to the runtime and some code to tools/bootcheck
Estimated hours taken: 16

Add some Mmake rules to the runtime and some code to tools/bootcheck
so that we automatically check that the namespace remains clean.
Also add some `MR_' prefixes that Zoltan missed in his earlier change.

tools/bootcheck:
	Add an option, which is enabled by default, to
	build the check_namespace target in the runtime.

runtime/RESERVED_MACRO_NAMES:
	New file.  Contains a list of the macro names that
	don't start with `MR_' or the like.

runtime/Mmakefile:
	Change the rule for `check_headers' so that it checks for macros
	that don't occur in the RESERVED_MACRO_NAMES files, as well as not
	starting with `MR_' prefixes, and reports errors for such macros.
	Also add a rule for check_objs that checks whether the object
	files define any global symbols that don't have the right prefixes,
	and a rule `check_namespace' that does both of the above.

runtime/mercury_bootstrap.h:
	#include "mercury_types.h" and "mercury_float.h",
	to ensure that this header file is self-contained.
	Also make sure that all the old names are disabled if you
	compile with `-DMR_NO_BACKWARDS_COMPAT'.

runtime/mercury_context.c:
runtime/mercury_thread.h:
runtime/mercury_thread.c:
	Use `bool' rather than `MR_Bool' for the argument to
	MR_check_pending_contexts() and the return type of
	MR_init_thread(), since these are C bools, not Mercury
	bools, and there's no requirement that they have the
	same size as MR_Integer.

runtime/mercury_type_info.h:
runtime/mercury_deep_copy_body.h:
runtime/mercury_tabling.c:
library/std_util.m:
trace/mercury_trace_declarative.c:
trace/mercury_trace_external.c:
trace/mercury_trace_internal.c:
tests/hard_coded/existential_types_test.m:
	Add MR_ prefixes to UNIV_OFFSET_FOR_TYPEINFO and UNIV_OFFSET_FOR_VALUE.

trace/mercury_trace_external.c:
trace/mercury_trace_internal.c:
trace/mercury_trace_tables.c:
	Add MR_ prefixes to do_init_modules().

runtime/mercury_tabling.h:
runtime/mercury_tabling.c:
runtime/mercury_overflow.h:
runtime/mercury_debug.h:
	Add MR_ prefixes to table_*.

runtime/mercury_overflow.h:
runtime/mercury_debug.h:
	Add MR_ prefixes to IF().

runtime/mercury_context.h:
	Add MR_ prefixes to IF_MR_THREAD_SAFE().

runtime/mercury_engine.h:
	Add MR_ prefixes to IF_NOT_CONSERVATIVE_GC().

runtime/mercury_engine.h:
runtime/mercury_engine.c:
extras/aditi/aditi.m:
	Add MR_ prefixes to do_fail, do_redo, do_not_reached, etc.

compiler/trace.m:
compiler/fact_table.m:
compiler/llds_out.m:
	Add MR_ prefixes to the generated code.
2000-12-04 18:28:57 +00:00
Zoltan Somogyi
090552c993 Make everything in the runtime use MR_ prefixes, and make the compiler
Estimated hours taken: 10

Make everything in the runtime use MR_ prefixes, and make the compiler
bootstrap with -DMR_NO_BACKWARDS_COMPAT.

runtime/mercury_*.[ch]:
	Add MR_ prefixes to all functions, global variables and almost all
	macros that could pollute the namespace. The (intentional) exceptions
	are

	1. some function, variable, type and label names that already start
	   with MR_, mercury_, Mercury or _entry;
	2. some standard C macros in mercury_std.h;
	3. the macros used in autoconfiguration (since they are used in scripts
	   as well as the runtime, the MR_ prefix may not be appropriate for
	   those).

	In some cases, I deleted things instead of adding prefixes
	if the "things" were obsolete and not user visible.

runtime/mercury_bootstrap.h:
	Provide MR_-less forms of the macros for bootstrapping and for
	backward compatibility for user code.

runtime/mercury_debug.[ch]:
	Add a FILE * parameter to a function that needs it.

compiler/code_info.m:
compiler/export.m:
compiler/fact_table.m:
compiler/llds.m:
compiler/llds_out.m:
compiler/pragma_c_gen.m:
compiler/trace.m:
	Add MR_ prefixes to the C code generated by the compiler.

library/*.m:
	Add MR_ prefixes to handwritten code.

trace/mercury_trace_*.c:
util/mkinit.c:
	Add MR_ prefixes as necessary.

extras/concurrency/semaphore.m:
	Add MR_ prefixes as necessary.
2000-11-23 02:01:11 +00:00
Zoltan Somogyi
1c8cb6faf2 Get the compiler to bootstrap with -DMR_NO_BACKWARDS_COMPAT.
Estimated hours taken: 2

Get the compiler to bootstrap with -DMR_NO_BACKWARDS_COMPAT.

compiler/c_util.m:
compiler/rtti_out.m:
	Add MR_ prefixes to various type names in generated code.

compiler/*.m:
browser/*.m:
library/*.m:
	Add MR_ prefixes to various type and function names in pragma C code.

runtime/*.[ch]:
trace/*.[ch]:
	Add MR_ prefixes to various type and function names in
	hand-written code.
2000-10-16 01:34:14 +00:00
Tyson Dowd
9b53099dd9 Fix a bug with :- export and threads.
Estimated hours taken: 2.5

Fix a bug with :- export and threads.

Each time we called from C to Mercury, we initialized the thread engine
(if necessary).

This allocated a new context every time we entered Mercury.
Unfortunately, these contexts were never released, so every entry into
Mercury from C cost about 4Mb of memory (almost all of which is the
deterministic stack).  In a busy CORBA application you would run out of
memory really quickly.

compiler/export.m:
	When initializing threads, remember whether we are responsible
	for finalizing the engine (e.g. if we are the first C->Mercury
	call to create the engine, we'll be the last to exit it and should
	clean up afterwards).

runtime/mercury_engine.c:
	Finalize engines by destroying the context (this will put the
	memory zones onto a free list).

runtime/mercury_thread.c:
runtime/mercury_thread.h:
	Make init_thread return TRUE if an engine has been allocated and
	it is the caller's responsibility to finalize it.
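The ownership rule described in this change can be sketched in C as follows. This is a minimal model, not the runtime's code: every `sketch_` name and the two counters are hypothetical stand-ins for the per-thread engine state, and the sketch ignores real threading.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the per-thread engine state. */
static int sketch_engine_live = 0;        /* is an engine set up? */
static int sketch_contexts_allocated = 0; /* contexts currently held */

/*
** Set up an engine if there is none. Returns true only when this
** call allocated the engine, in which case the caller must finalize it.
*/
static bool
sketch_init_thread(void)
{
    if (sketch_engine_live) {
        return false;
    }
    sketch_engine_live = 1;
    sketch_contexts_allocated++;
    return true;
}

/* Release the context and tear down the engine. */
static void
sketch_finalize_thread(void)
{
    sketch_contexts_allocated--;
    sketch_engine_live = 0;
}

/*
** Shape of an exported C->Mercury wrapper: only the outermost call,
** the one that created the engine, cleans up afterwards.
*/
static int
sketch_exported_call(int (*body)(int), int arg)
{
    bool must_finalize = sketch_init_thread();
    int result = body(arg);

    if (must_finalize) {
        sketch_finalize_thread();
    }
    return result;
}

/* Two sample bodies; the second re-enters through the wrapper. */
static int
sketch_double(int x)
{
    return 2 * x;
}

static int
sketch_nested(int x)
{
    return sketch_exported_call(sketch_double, x) + 1;
}
```

Calling `sketch_exported_call(sketch_nested, 20)` enters the wrapper twice, but only the outer call finalizes, so no context is left allocated once it returns.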
2000-10-02 07:45:04 +00:00