mercury

mirror of https://github.com/Mercury-Language/mercury.git synced 2026-04-18 10:53:40 +00:00

Author	SHA1	Message	Date
Julien Fischer	f8d188fda8	Fix minor documentation problems. deep_profiler/display_report.m: deep_profiler/message.m: deep_profiler/recursion_patterns.m: deep_profiler/var_use_analysis.m: java/runtime/UnreachableDefault.java: runtime/mercury_engine.c: runtime/mercury_minimal_model.c: runtime/mercury_signal.h: runtime/mercury_stack_layout.h: runtime/mercury_wrapper.c: runtime/mercury_threadscope.c: trace/mercury_trace_external.c: HISTORY: As above.	2018-10-09 05:27:36 +00:00
Peter Wang	2be2e7468c	Do not use _snprintf functions directly in place of snprintf functions. The Windows _snprintf family of functions do not guarantee null termination when the output is truncated so cannot be used as direct replacements for the snprintf functions. Also, the _snprintf functions have different return values from the C99 snprintf functions when output is truncated (like some older snprintf implementations). Furthermore, on Windows snprintf/vsnprintf may be synonyms for _snprintf/_vsnprintf so cannot be relied upon to terminate their outputs either, even if the functions exist. runtime/mercury_string.c: runtime/mercury_string.h: Define MR_snprintf and MR_vsnprintf as macro synonyms for snprintf/vsnprintf ONLY if _snprintf/_vsnprintf do not exist. Otherwise, implement MR_snprintf and MR_vsnprintf functions that behave like the C99 functions, in terms of _vsnprintf. Require that either snprintf/vsnprintf or _snprintf/_vsnprintf are available. This should be true on all systems still in use. runtime/mercury_debug.c: runtime/mercury_ml_expand_body.h: runtime/mercury_runtime_util.c: runtime/mercury_stack_layout.c: runtime/mercury_stack_trace.c: runtime/mercury_stacks.c: runtime/mercury_tabling.c: runtime/mercury_threadscope.c: runtime/mercury_trace_base.c: runtime/mercury_wrapper.c: trace/mercury_trace_completion.c: trace/mercury_trace_internal.c: trace/mercury_trace_spy.c: trace/mercury_trace_vars.c: bytecode/mb_disasm.c: Use MR_snprintf instead of snprintf/_snprintf and MR_vsnprintf instead of vsnprintf/_vsnprintf. Drop code paths using sprintf as a fallback.	2018-07-23 10:26:29 +10:00
Mark Brown	d465fa53cb	Update the COPYING.LIB file and references to it. Discussion of these changes can be found on the Mercury developers mailing list archives from June 2018. COPYING.LIB: Add a special linking exception to the LGPL. *: Update references to COPYING.LIB. Clean up some minor errors that have accumulated in copyright messages.	2018-06-09 17:43:12 +10:00
Zoltan Somogyi	53b573692a	Convert C code to use // style comments. runtime/.[ch]: trace/.[chyl]: As above. In some places, improve comments, e.g. by expanding contractions such as "we've". Add #ifndef guards against double inclusion around the trace/.h files that did not already have them. tools/: Make the corresponding changes in shell scripts that generate .[ch] files in the runtime. tests/*: Conform to a slight change in the text of a message.	2016-07-14 13:57:35 +02:00
Zoltan Somogyi	67326f16e4	Fix style issues in the runtime. Move all .h and .c files to four-space indentation without tabs, if they weren't there already. Use the same vim line for all .h and .c files. Align all backslashes at the ends of lines in macro definitions. Align close comment signs. In some places, fix inconsistent indentation. Fix a bunch of comments. Add XXXs to a few of them.	2016-07-09 12:14:00 +02:00
Julien Fischer	787f8b2c6d	Fix spelling and grammar in runtime comments. runtime/*.[ch]: As above.	2015-09-03 15:43:35 +10:00
Julien Fischer	eb50bdf378	Improve runtime sanity checking and error messages. Always check the return value of calls to the function sem_init. This function is not implemented on OS X and always returns an error. Replace some direct calls to perror in the runtime with calls to MR_perror. The latter ensures that the error message mentions that the error originates from a call within the Mercury runtime. runtime/mercury_context.c runtime/mercury_prof_time.c: runtime/mercury_threadscope.c: As above.	2014-09-18 11:47:00 +10:00
Peter Wang	29f2dcf213	Support dynamic creation of Mercury engines in low-level C parallel grades. This change allows Mercury engines (each in a separate OS thread) to be created and destroyed dynamically in low-level C grades. We divide Mercury engines into two types: "Shared" engines may execute code from any Mercury thread. Shared engines may steal work from other shared engines, so are also called work-stealing engines; we do not have shared engines that refrain from work-stealing. "Exclusive" engines execute code only for a single Mercury thread. Only exclusive engines may be created and destroyed dynamically so far. This assumption could be lifted when and if the need should arise. Exclusive engines are a means for the user to map a Mercury thread directly to an OS thread. Calls to blocking procedures on that thread will not block progress in arbitrary other Mercury threads. Foreign code which depends on the OS thread-local state is usable when called from that thread. We do not yet allow shared engines to steal parallel work from exclusive engines. runtime/mercury_wrapper.c: runtime/mercury_wrapper.h: Rename MR_num_threads to MR_num_ws_engines. It counts only work-stealing engines. Move comment to the header file. Add MR_max_engines. The default value is arbitrary. Add MERCURY_OPTIONS `--max-engines' option. Define MR_num_ws_engines and MR_max_engines only with MR_LL_PARALLEL_CONJ. runtime/mercury_context.c: runtime/mercury_context.h: Rename MR_num_idle_engines to MR_num_idle_ws_engines. It only counts idle work-stealing engines. Extend MR_spark_deques to MR_max_engines length. Extend engine_sleep_sync_data to MR_max_engines length. Add function to index engine_sleep_sync_data with optional bounds checking. Replace instances of MR_num_threads by MR_num_ws_engines or MR_max_engines as appropriate. Add MR_ctxt_exclusive_engine field. Rename existing MR_Context fields to remove the implication that the engine "owns" the context. 
The new exclusive_engine field does imply a kind of ownership, hence potential confusion. Rename MR_SavedOwner, too. Make MR_find_ready_context respect MR_ctxt_exclusive_engine. Make MR_schedule_context respect MR_ctxt_exclusive_engine. Rename MR_try_wake_an_engine to MR_try_wake_ws_engine and restrict it to work-stealing engines. Rename MR_shutdown_all_engines to MR_shutdown_ws_engines and restrict it to work-stealing engines. Make try_wake_engine and try_notify_engine decrement MR_num_idle_ws_engines only for shared engines. In MR_do_idle, make exclusive engines bypass work-stealing and skip to the sleep state. In MR_do_sleep, make exclusive engines ignore work-stealing advice and abort the program if told to shut down. Assert that a context with an exclusive_engine really is only loaded by that engine. In MR_fork_new_child, make exclusive engines not attempt to wake work-stealing engines. Its sparks cannot be stolen anyway. Make do_work_steal fail the attempt for exclusive engines. There is one call where this might happen. Add notes to MR_attempt_steal_spark. Its behaviour is unchanged. Replace a call to MR_destroy_thread by MR_finalize_thread_engine. Delete MR_num_exited_engines. It was unused. runtime/mercury_thread.c: runtime/mercury_thread.h: Delete MR_next_engine_id and MR_next_engine_id_lock. We can no longer allocate engine ids by incrementing a counter. Engine ids need to be reused as they act as indices into fixed-sized arrays. Extend MR_all_engine_bases to MR_max_engines entries. Add MR_all_engine_bases_lock to protect MR_all_engine_bases. Add MR_highest_engine_id. Add MR_EngineType with the two options described. Split the main part of MR_init_engine into a new function which accepts an engine type. MR_init_engine is used by generated code so maintain the interface. Factor out setup/shutdown for thread support. Make MR_finalize_thread_engine call the shutdown function. Specialise MR_create_thread into MR_create_worksteal_thread. 
The generic form was unused. Move thread pinning into MR_create_worksteal_thread as other threads do not require it. Delete MR_destroy_thread. Its one caller can use MR_finalize_thread_engine. Delete declaration for non-existent variable MR_init_engine_array_lock. runtime/mercury_engine.c: runtime/mercury_engine.h: Add MR_eng_type field. Make MR_eng_spark_deque a pointer to separately-allocated memory. The reason is given in MR_attempt_steal_spark. Add MR_ENGINE_ID_NONE, a dummy value for MR_ctxt_exclusive_engine. Delete MR_eng_owner_thread which was obsoleted by engine ids before. Delete misplaced declaration of MR_all_engine_bases. runtime/mercury_memory_zones.c: Replace MR_num_threads by appropriate counters (I hope). runtime/mercury_memory_handlers.c: runtime/mercury_par_builtin.h: Conform to changes. runtime/mercury_threadscope.c: Conform to renaming (but it might be wrong). library/thread.m: Add hidden predicate `spawn_native' for testing. The interface is subject to change. Share much of the code with the high-level C backend. library/par_builtin.m: Delete `num_os_threads' as it is unused. doc/user_guide.texi: Document MERCURY_OPTIONS `--max-engines' option.	2014-07-10 14:57:48 +10:00
Paul Bone	a9f82d004b	On some systems the CPU's time stamp counter (TSC) cannot reliably be used. Mercury's ThreadScope support will now use gettimeofday() by default, but use of the TSC may be enabled. Note that in Linux, gettimeofday() does not always make a system call. runtime/mercury_threadscope.[ch]: Add support for measuring time with gettimeofday(). Use gettimeofday() to measure time by default. runtime/mercury_atomic_ops.[ch]: Add a new function MR_tsc_is_sensible(). It returns true if the TSC can (as far as the RTS can detect) be used. Fix trailing whitespace. runtime/mercury_wrapper.c: Add a new runtime option --threadscope-use-tsc. When specified this option allows threadscope to use the CPU's TSC to measure time. doc/userguide.texi: Document the --threadscope-use-tsc option. This documentation is commented out.	2012-06-20 13:13:34 +00:00
Paul Bone	e6577cfa5d	ThreadScope support improvements. Provide a new event for context re-use rather than creation. This event is true to Mercury's behaviour; the existing threadscope events were not. Bring Mercury's usage of the create context event into line with ThreadScope's expectations. mercury_threadscope.[ch]: Add a new event for when a context is re-used (and its ID is re-assigned). This is like the create context event except that the storage came from a previously used context. mercury_context.c: Post the reuse context event when a context is re-used from the free list. Post reuse context when a context that an engine already has is re-used for a stolen spark. XXX: Check locally allocated contexts. A result of these changes is that the create context message is used even when a context is created to evaluate sparks. This is deliberate: Some of ThreadScope's analyses require this. mercury_thread.c: mercury_context.c: Place the create context event in MR_create_context rather than after MR_create_context returns. mercury_par_builtin.h: Fixed the order of some type qualifiers. volatile was incorrectly referring to the pointer's target and not the pointer.	2012-06-19 11:08:16 +00:00
Paul Bone	af111c717e	Conform to latest ThreadScope expectations. ThreadScope compatibility is a moving target. This patch ensures that modern versions of ThreadScope can open eventlog files produced by Mercury. runtime/mercury_threadscope.c: An extra mandatory event has been added to ThreadScope, GcGlobalSync. Mercury now writes out this event at a suitable time so that ThreadScope can still open Mercury's eventlog files. Arguably this is a bug-compatibility patch because ThreadScope is supposed to be forwards and backwards compatible. The ThreadScope authors are considering removing the requirement for this event, so that they can open older eventlog files.	2012-06-16 06:30:57 +00:00
Zoltan Somogyi	9f55ffa28a	Fix typos in comments. Estimated hours taken: 0.1 Branches: main runtime/mercury_threadscope.c: Fix typos in comments.	2011-09-21 07:59:39 +00:00
Paul Bone	7c086e8dbe	ThreadScope updates. An event described in our ThreadScope paper had not been added to the runtime system. This event announces that an engine is attempting to find work in the form of a local spark. This change also introduces a hierarchy of events, where one event 'extends' another existing event. We use this for Mercury's spark events which contain spark IDs in their payloads. These extend GHC's spark events. Other changes have been made to ensure that Mercury conforms with the ghc-events library, which is used by the ThreadScope tool. runtime/mercury_threadscope.h: runtime/mercury_threadscope.c: Add support for the LOOKING_FOR_LOCAL_SPARK event. Re-number the CALLING_MAIN event to make a Mercury specific event. Re-number the STRING event. Rename the STRING event; it is now INTERN_STRING. No longer use the deprecated SPARK_RUN and SPARK_STEAL events, instead use the new events and create Mercury specific events that extend these events. The Mercury-specific SPARKING event has been renamed to SPARK_CREATE and now extends the base SPARK_CREATE event. Made a correction to a comment. runtime/mercury_context.c: Post the LOOKING_FOR_LOCAL_SPARK event.	2011-09-08 01:53:08 +00:00
Paul Bone	491b089085	Fix some ThreadScope issues. Firstly, this change allows the ThreadScope tool to read Mercury's .eventlog files without aborting. This is fixed by making THREAD_START and THREAD_STOP events consistent. Secondly, this change implements the missing EVENT_SLEEPING event. This ensures that the implementation matches the description in the ThreadScope paper. Thirdly, the idle engines try to run a suspended context before running a spark. runtime/mercury_threadscope.c: Don't post THREAD_START or THREAD_STOP events if it wouldn't make sense, ie: the thread is already stopped. We do this to make RTS code simpler since an engine may hang on to a context even when that context is stopped. The RTS uses this for caching. Create a new event ENGINE_SLEEPING to be used when an engine goes to sleep. runtime/mercury_context.c: Add some missing calls to threadscope, this ensures that Mercury's eventlog file maintains some invariants expected by the ThreadScope visualisation tool. Modify how idle engines look for new work: now, in all cases, an idle engine will attempt to resume a context first. Avoid taking the lock to the global run queue of contexts if the runqueue pointer is NULL indicating that the queue is empty.	2011-06-23 08:13:50 +00:00
Paul Bone	0365571027	In ThreadScope grades each context has a unique ID. Previously when a context was re-used (as opposed to created from scratch) we would re-assign its ID, so that it was clear to see when a new computation was started. This is no longer necessary and prevents anyone using ThreadScope from understanding how contexts are re-used. This change also adds a new ThreadScope event that marks when a context is released back to the free context pool. runtime/mercury_context.c: Only allocate new context IDs for new contexts (not re-used contexts). Use the new release_context event. Fixed spelling mistake. runtime/mercury_threadscope.h: runtime/mercury_threadscope.c: Add support for the release_context event.	2011-06-02 05:59:21 +00:00
Paul Bone	67f072901a	Include the name of futures in ThreadScope profiles. runtime/mercury_threadscope.h: runtime/mercury_threadscope.c: Add a second parameter for the NEW_FUTURE event. The parameter is the id of the string that holds the future's name. runtime/mercury_par_builtin.h: In threadscope grades use a two-args version of the new_future macro. library/par_builtin.m: Conform to changes in mercury_par_builtin.h, new_future now takes two arguments. compiler/dep_par_conj.m: Create a name variable for each future and pass it as a second parameter to calls to new_future. Thread a threadscope string table throughout this transformation so that strings for variables can be collected. compiler/hlds_module.m: Add a threadscope string table to the module_info structure. compiler/global_data.m: global_data_init now takes the threadscope string table and its size as parameters. This is necessary because the table may be non-empty before the LLDS transformation begins. compiler/mercury_compile_llds_back_end.m: Conform to changes in global_data.m mdbcomp/program_representation.m: Disable the polymorphism transformation for new_future/2 rather than the old new_future/1.	2011-05-31 03:14:21 +00:00
Paul Bone	987d2e31e3	Fix ThreadScope support since my recent work stealing changes. runtime/mercury_threadscope.h: runtime/mercury_threadscope.c: Fix some compilation problems. Rename stop conjunction and stop conjunct events to use the word "end" rather than "stop". The meaning is clearer and the name matches that used in the threadscope paper. runtime/mercury_context.h: runtime/mercury_context.c: Re-order some operations in the idle loop: try to resume an earlier context before working on a local spark; this may lead to less blocking. The RUN_CONTEXT event was posted from the load_context macro. Change this to post the RUN_CONTEXT event explicitly. Fix some over-long lines. Conform to changes in mercury_threadscope.h. runtime/mercury_thread.c: Add an explicit call to post the RUN_CONTEXT event. compiler/layout_out.m: Add a missing output_layout_array_name call when writing out the threadscope string table array. compiler/par_conj_gen.m: Conform to changes in runtime/mercury_threadscope.h	2011-05-24 04:16:48 +00:00
Zoltan Somogyi	f3389a7197	Remove unnecessary mechanism for managing a non-existent module Estimated hours taken: 0.5 runtime/mercury_threadscope.[ch]: Remove unnecessary mechanism for managing a non-existent module of hand-translated-to-C Mercury code. Fix deviations from our programming style.	2011-05-02 07:55:04 +00:00
Paul Bone	f1779bd1e8	Improve work stealing. Spark deques have been associated with contexts so far. This is a problem for the following reasons: The work stealing code must take a lock to access the resizeable array of work stealing dequeues. This adds global contention that can be avoided if this array has a fixed size. If a context is blocked on a future then that engine cannot execute the sparks from that context, instead it tries to find global work, this is more expensive than necessary. If there are a few dozen contexts then there may be just as many work stealing queues to take work from, the density of these queues will be higher if they are fewer. Therefore work stealing will be more successful on average. This change associates spark deques with Mercury Engines rather than Contexts to avoid these problems. This has invalidated some invariants that allowed the runtime system to make some worth-while optimisations. These optimisations have been maintained. Mercury's idle loop has been reimplemented to allow for this. This re-implementation has allowed for a number of other improvements: Polling was used to check for new global sparks. This has been removed and each engine now sleeps using it's own semaphore. Checks for work can be done in different orders depending on how an engine joins the idle loop. When global work becomes available a particular engine can be woken up rather than any arbitrary engine. We take advantage of this when making contexts runnable, we try to schedule them on the engine that last executed them. When an engine is woken up it can be instructed with what it should do upon waking up. When a engine looks for a context to run, it will try to pick a context that was last executed on it. This may avoid cache misses when the context begins to run. In the future we should consider: Experiment with telling engines which context to run. 
Improve the selection of which engine work should be scheduled on to be hardware and memory-hierarchy aware. Things that need doing next (probably next week): ./configure should check for POSIX semaphore support. Profiling times have been broken by this change, they will need fixing. The threadscope event log now breaks an invariant that the threadscope graphical tool requires. Semaphores are set up but never released, this is not a big problem but the manual page says that some implementations may leak resources. runtime/mercury_context.h: runtime/mercury_context.c: Remove the spark deque field from the MR_Context structure. Export the new array of spark deques so that other modules may fill in elements as engines are set up. Modify the resume_owner_thread field of the MR_Context structure, this was used to ensure that a context returning through C code would be resumed on the engine with the correct C stack and depth. This field is now an engine id and has been renamed to resume_owner_engine, it is advisory unless resume_engine_required is also set. This way it is used to advise which engine most recently executed this context and therefore may have a warm cache. Remove code that dynamically resized the array of spark deques, including the lock that protected against updating this array while it was being read from another thread. Introduce code that initialises the statically sized array of spark deques. Reimplement the idle loop. This replaces MR_runnext and MR_do_runnext with MR_idle and MR_do_idle respectively. There are also two new entry points into the idle loop. Which one to use depends on the state of the engine. Introduce new mechanisms for waking a particular engine. For example the engine that last executed a context that is now runnable. Change the algorithm for selecting which context to run, try to select contexts that were last used on the current engine to avoid cache misses. 
Use an engine's victim counter rather than a global victim counter when trying to steal work. Introduce some conditionally-compiled code that can be used to profile how quickly new contexts can be created. Rename MR_init_thread_stuff and MR_finalize_thread_stuff. The term thread has been replaced with context since they're in mercury_context.c. This allows the creation of a new function MR_init_thread_stuff() in mercury_thread.c I also found the mismatch between the function names and file name confusing. Move some of the code from MR_init_context_stuff to the new MR_init_thread_stuff function where it belongs. Refactor the thread pinning code so that even when thread pinning is disabled it can be used to allocate each thread to a CPU but not actually pin them. Fix some whitespace errors. runtime/mercury_thread.h: runtime/mercury_thread.c: In MR_init_engine(): Allocate an engine id for each engine. A number of arrays had one slot per engine and where setup using a lock. Now engine ids are used to index each array and setup is done without a lock, each engine simply sets up its own slot. Setup the new per-engine work stealing deques. The MR_all_engine_bases array has been moved to this file. Implement a new MR_init_thread_stuff function which initialises some global variables and locks. Some of MR_init_thread_stuff has been moved from mercury_context.c Pin threads as part of MR_init_thread, excluding the primordial thread which must be pinned before threadscope is initialised. Add functions for debugging the use of semaphores. Add corresponding macros that can be used to redirect semaphore calls to debugging functions as above. Improved thread debugging code, ensured that stderr is flushed after every use, and that logging is done after calls return as well as before they're called. Conform to changes in mercury_context.h runtime/mercury_engine.h: runtime/mercury_engine.c: Add spark deque and victim counter fields to the MercuryEngine structure. 
Make the MR_eng_id field of the MercuryEngine structure available in all thread safe grades, formerly it was used in only threadscope grades. Move the MR_all_engine_bases variable to mercury_thread.[ch] Put a reference to the engine's spark queue into the global array. This is done here, so that it is after thread pinning because the original plan was to have this array sorted by CPU rather then engine - we may yet do this in the future. Initialise an engine's spark deque when an engine is initialised. Setup the engine specific threadscope data in mercury_thread.c Conform to changes in mercury_context.h runtime/mercury_wrapper.c: The engine base array is no longer setup here, that code has been moved to mercury_thread.c Conform to changes in mercury_context.h and mercury_thread.h runtime/mercury_wsdeque.h: runtime/mercury_wsdeque.c: The original implementation allocated an array for a spark queue only if one wasn't already allocated, which could happen when a context was reused. Now that spark queues are associated with engines arrays are always allocated. Replaced two macros with a single macro since there's no-longer a distinction between global and local work queues, all work queues are local. runtime/mercury_wsdeque.c: runtime/mercury_wsdeque.h: Remove the --worksteal-max-attempts and --worksteal-sleep-msecs options as they are no-longer used. runtime/mercury_threadscope.h: runtime/mercury_threadscope.c: The MR_EngineId type has been moved to mercury_types.h Engine IDs are no-longer allocated here, this is done in mercury_thread.c The run spark and steal spark messages now write 0xFFFFFFFF for the context id if there is no current context. Previously this would dereference a null pointer. runtime/mercury_memory_zones.c: When checking for an existing memory zone check the free_zones_list variable before taking a lock. This can prevent taking the lock in cases where there are no free zones. 
Introduce some conditionally-compiled code that can be used to profile how quickly new contexts can be created. runtime/mercury_bootstrap.h: Remove macros that no longer resolve to functions due to changes in the runtime system. runtime/mercury_types.h: Move the MR_EngineId type from mercury_threadscope.h to mercury_types.h runtime/mercury_grade.h: Introduce a parallel grade version number, this change breaks binary compatibility with existing parallel code. runtime/mercury_backjump.c: runtime/mercury_par_builtin.c: runtime/mercury_mm_own_stacks.c: library/stm_builtin.m: library/thread.m: library/thread.semaphore.m: Conform to changes in mercury_context.h. library/io.m: Make this module compatible with MR_debug_threads. doc/user_guide.texi: Remove the documentation for the --worksteal-max-attempts and --worksteal-sleep-msecs options. The documentation was already commented out.	2011-04-13 13:19:42 +00:00
Paul Bone	3f336dfac5	ThreadScope updates. Introduce some new threadscope events for profiling the use of futures. Update mercury's threadscope runtime so that it conforms with changes to ghc-events (the threadscope event library). runtime/mercury_threadscope.h: Add some new typedefs. runtime/mercury_threadscope.c: Conform to changes in the threadscope eventlog format, in particular which event IDs belong to which events. Support the capset[3] events created by Duncan Coutts. Within Mercury we refer to capsets as "engine sets". Remove our runtime type event and use the capset runtime identifier event which takes a string rather than an integer. Use #define'd constants rather than magic numbers for the sizes of event attributes. Make the sparking event a Mercury specific event. It appears that GHC won't use this, even in the future[1, 2]. runtime/mercury_threadscope.h: runtime/mercury_threadscope.c: Add support for new threadscope events: New Future - A future is being created, the context of this event tells us which parallel conjunction the future belongs to. Wait on Future - Attempt to wait on the production of a future, both with and without suspending the current thread because the future may already be available. Signal future - After producing a value for a future signal that that value is now available. Note that waking up after suspending on a future is available already via THREAD_RUNNABLE when the thread is added to the run queue, and THREAD_RUNNING when an engine begins executing that thread. runtime/mercury_par_builtin.h: Use the new threadscope events for instrumenting the use of futures. References: 1. There are commented out event ids for creating sparks that are marked as deprecated in the ghc-events tool. 2. Online conversation in #ghc on irc.freenode.net, Simon said that Sparks will be deprecated in favor of: https://github.com/simonmar/monad-par. Which, I believe, is covered in a paper submitted a paper to ICFP. 3. 
http://tinyurl.com/67gf2om	2011-04-02 05:41:11 +00:00
Paul Bone	322feaf217	Add more threadscope instrumentation. This change introduces instrumentation that tracks sparks as well as parallel conjunctions and their conjuncts. This should hopefully give us more information to diagnose runtime performance issues. As of this date the ThreadScope program hasn't been updated to read or understand these new events. runtime/mercury_threadscope.[ch]: Added a function and types to register all the threadscope strings from an array. Add functions to post the new events (see below). runtime/mercury_threadscope.c: Added support for 5 new threadscope events. Registering a string so that other messages may refer to a constant string. Marking the beginning and ends of parallel conjunctions. Creating a spark for a parallel conjunct. Finishing a parallel conjunct. Re-arranged event IDs, I've started allocating IDs from 38 onwards for general purposes and 100 onwards for mercury specific events after talking with Duncan Coutts. Trimmed excess whitespace from the end of lines. runtime/mercury_context.h: Post a beginning parallel conjunction message when the sync term for the parallel conjunction is initialized. Post an event when creating a spark for a parallel conjunction. Add a MR_spark_id field to the MR_Spark structure, these identify sparks to threadscope. runtime/mercury_context.c: Post threadscope messages when a spark is about to be executed. Post a threadscope event when a parallel conjunct is completed. Add a missing memory barrier. runtime/mercury_wrapper.[ch]: Create a global function pointer for the code that registers strings in the threadscope string table, this is filled in by mkinit. Call this function pointer immediatly after setting up threadscope. runtime/mercury_wsdeque.[ch]: Modify MR_wsdeque_pop_bottom to return the spark pointer (which points onto the queue) rather then returning a result through a pointer and bool if the operation was successful. 
This pointer is safe to dereference until MR_wsdeque_push_bottom is used. runtime/mercury_wsdeque.c: Corrected a code comment. runtime/mercury_engine.h: Documented some of the fields of the engine structure that hadn't been documented. Add a next spark ID field to the engine structure. Change the type of the engine ID field to MR_uint_least16_t compiler/llds.m: Add a third field to the init_sync_term instruction that stores the index into the threadscope string table of the static conjunction ID. Add a field to the c_file structure containing the threadscope string table. compiler/layout.m: Added a new layout array name for the threadscope string table. compiler/layout_out.m: Implement code to write out the threadscope string table. compiler/llds_out_file.m: Write out the threadscope string table when writing out the c_file. compiler/par_conj_gen.m: Create strings that statically identify parallel conjunctions for each init_sync_term LLDS instruction. These strings are added to a table in the !CodeInfo and the index of the string is added to the init_sync_term instruction. Add an extra instruction after a parallel conjunction to post the message that the parallel conjunction has completed. compiler/global_data.m: Add fields to the global data structure to represent the threadscope string table and its current size. Add predicates to update and retrieve the table. Handle merging of threadscope string tables in global data by allowing the references to the strings to be remapped. Refactored remapping code so that a caller such as proc_gen only needs to call one remapping predicate after merging global data.. compiler/code_info.m: Add a table of strings for use with threadscope to the code_info_persistent type. Modify the code_info_init to initialise the threadscope string table fields. Add a predicate to get the string table and another to update it. 
compiler/proc_gen.m: Build the containing goal map before code generation for procedures with parallel conjunctions in a parallel grade. par_conj_gen.m depends on this. Conform to changes in code_info.m and global_data.m compiler/llds_out_instr.m: Write out the extra parameter in the init_sync_term instruction. compiler/dupelim.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/jumpopt.m: compiler/livemap.m: compiler/llds_to_x86_64.m: compiler/mercury_compile_llds_back_end.m: compiler/middle_rec.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/peephole.m: compiler/reassign.m: compiler/use_local_vars.m: Conform to changes in llds.m compiler/opt_debug.m: Conform to changes in layout.m compiler/mercury_compile_llds_back_end.m: Fix some trailing whitespace. util/mkinit.c: Build an initialisation function that registers all the strings in threadscope string tables. Correct the layout of a comment.	2011-03-25 03:13:42 +00:00
Paul Bone	bf6a35f5ec	Fix bugs 144 and 171. Bug 144 is a pathological case where right-recursion is used in a parallel conjunction and the conjuncts cannot be re-ordered. This can cause excess stack allocation and abysmal performance. The --max-contexts-per-thread runtime option is used to reduce the impact of these cases by reducing the amount of parallelism gained at runtime. Bug 171 is a simple case where the threadscope grade could not be compiled without enabling the Boehm garbage collector. runtime/mercury_threadscope.c: Enclose boehm GC specific code within #ifdef MR_BOEHM_GC runtime/mercury_context.[ch]: Record the number of contexts running or suspended at any time in a new variable, MR_num_outstanding_contexts Remove counts of other in-use objects such as the sum of outstanding contexts and sparks. Remove two granularity control macros that haven't been used for some time. compiler/granularity.m: Ensure that the runtime granularity decision is updated for when it is available. library/par_builtin.m: Remove granularity decisions for which support has been removed in the runtime. tests/par_conj/Mmakefile: tests/par_conj/pathological_right_recursion.{m,exp}: Add a test case for bug 144.	2011-02-08 03:48:11 +00:00
Paul Bone	9c3d650921	Fix a bug in the Mercury runtime's threadscope support that could cause a buffer overrun. runtime/mercury_threadscope.c: As above.	2010-05-24 06:40:11 +00:00
Paul Bone	edc230406e	Fix a number of errors and warnings in the runtime picked up by GCC 4.x in parallel and threadscope grades. We had been using types with the wrong signedness while calling atomic operations. GCC 4.x also picked up an error where #elif was used instead of #else. While testing these changes on a 32bit system more bugs were found on the i386 architecture and on AMD brand processors. runtime/mercury_atomic_ops.h: runtime/mercury_atomic_ops.c: Add unsigned variants of the following atomic operations: increment, add, add_and_fetch, dec_and_is_zero. Add a signed variant for compare and swap. Rename the MR_atomic_dec_<type>_and_is_zero operation to move the type to the end of the name. Use volatile storage in the MR_Stats structure. A 32bit machine cannot do atomic operations on 64bit values and MR_Stats must use 64bit values. Therefore 64bit values in the MR_Stats structure are now protected by a lock on 32bit machines. runtime/mercury_atomic_ops.h: Fix a typo in the i386 version of MR_atomic_dec_and_is_zero_uint(). runtime/mercury_atomic_ops.c: AMD CPUs do not conform to Intel's specification for being able to extract the CPU clock speed from the brand string. When we cannot determine the CPU's clock speed then we write out threadscope timestamps in raw clock cycles rather than nanoseconds. On i386 machines the ebx register is used to implement PIC code, however the CPUID instruction uses it to output information. Save this register on C's stack while we issue CPUID and retrieve the result in ebx. We now pass native machine sized values to the inline assembler code that implements RDTSC and RDTSCP. Fix commenting style in some places. runtime/mercury_atomic_ops.c: Fix some incorrect C preprocessor code for conditional compilation. runtime/mercury_grade.h: Increment binary compatibility number. This should have been done in a prior change when the MR_runnext macro changed which broke binary compatibility in the parallel low-level C grades. 
runtime/mercury_context.h: In MR_SyncTerm_Struct use an unsigned value for the number of conjuncts remaining before the conjunction is complete. runtime/mercury_threadscope.c: Record raw cpu clock ticks rather than milliseconds when we don't know the processor's clock speed. runtime/mercury_context.c: runtime/mercury_wsdeque.h: runtime/mercury_wsdeque.c: Conform to changes in mercury_atomic_ops.h	2010-03-20 10:15:51 +00:00
Paul Bone	4bb2d83d91	Avoid some C99/GNUC specific code in threadscope. runtime/mercury_threadscope.c: Use MR_STATIC_INLINE rather than 'static __inline__' (A GCC-ism) Do not initialise an array by providing values for only some indexes (A C99-ism).	2010-02-17 03:44:13 +00:00
Paul Bone	1c4251c9b8	Support user-specified message events in threadscope. log_threadscope_message/3 allows programmers to post message events to threadscope. This can be used to help them identify parts of their program when looking at the events generated by threadscope. library/benchmarking.m: Create new log_threadscope_message/3 predicate. runtime/mercury_threadscope.c: runtime/mercury_threadscope.h: Create MR_threadscope_post_log_msg() to support the logging of arbitrary messages.	2010-02-16 03:00:34 +00:00
Paul Bone	f97267d5f7	Fix threadscope profiling bugs. 1. The garbage collector uses callbacks to notify threadscope when it stops and resumes the world. Threadscope uses this to post messages that it stops and starts executing the current thread for each engine. In the case where an engine was not currently executing a thread, threadscope would post an erroneous 'starting thread' message. This has been fixed by keeping a flag that is true if the context associated with an engine is stopped and therefore threadscope shouldn't send a 'starting thread' message when the garbage collector resumes the world. 2. In low-level C parallel grades Mercury keeps a pointer to the MR_Engine structure in a machine register by using gcc's register pinning feature. When the Boehm garbage collector calls into Mercury's runtime (via a callback) this register is not set correctly because Boehm is not compiled with register pinning. This is a problem when one of these callbacks is used before a mercury engine posts any threadscope events as the incorrect engine ID is placed into the event stream. This is fixed by using the MR_thread_engine_base pointer rather than the machine pinned address in any context where the threadscope code could have been called via a garbage collector callback. This changeset also introduces a new threadscope event type, 'looking for global work' which is posted when an engine has no work of its own and is about to look for some other work. runtime/mercury_threadscope.c: As above (bugs 1 and 2). runtime/mercury_threadscope.h: runtime/mercury_threadscope.c: runtime/mercury_context.c: Create a new threadscope event 'looking for global work' and post it in MR_do_runnext().	2010-02-12 03:26:31 +00:00
Paul Bone	83a6f14708	Create a threadscope grade component. Threadscope grades are enabled by using the grade component 'threadscope'. They are supported only with low-level C parallel grades. Support for threadscope in high level C grades is intended in the future but does not work now. runtime/mercury_conf_param.h: Create the MR_THREADSCOPE macro that is defined if the grade is a threadscope grade. Define MR_PROFILE_FOR_PARALLEL_EXECUTION if MR_THREADSCOPE is defined. Emit an error if MR_LL_PARALLEL_CONJ is defined before it is implied by MR_THREADSAFE and ! MR_HIGHLEVEL_CODE runtime/mercury_grade.h: Update the grade symbol for the threadscope grade component. runtime/mercury_atomic_ops.c: runtime/mercury_atomic_ops.h: runtime/mercury_context.c: runtime/mercury_context.h: runtime/mercury_engine.c: runtime/mercury_engine.h: runtime/mercury_thread.c: runtime/mercury_threadscope.c: runtime/mercury_threadscope.h: runtime/mercury_wrapper.c: Now that MR_PROFILE_FOR_IMPLICIT_PARALLELISM is implied by MR_THREADSAFE we don't need to test for MR_THREADSAFE when we test for MR_PROFILE_FOR_IMPLICIT_PARALLELISM. The same is true for MR_LL_PARALLEL_CONJ which is implied by MR_THREADSAFE && !MR_HIGHLEVEL_CODE. Replace some occurrences of MR_PROFILE_FOR_IMPLICIT_PARALLELISM with MR_THREADSCOPE where the conditionally compiled code is used to support threadscope profiling. scripts/init_grade_options.sh-subr: scripts/canonical_grade.sh-subr: scripts/parse_grade_options.sh-subr: scripts/final_grade_options.sh-subr: scripts/mgnuc.in: compiler/handle_options.m: compiler/options.m: compiler/compile_target_code.m: configure.in: Add support for the new grade component. Pass -DMR_THREADSCOPE to the C compiler when using a threadscope grade. Add assertions to ensure that the 'threadscope' grade component is used only with the 'par' grade component. doc/user_guide.texi: Added commented-out documentation for the threadscope grade component. 
Adjusted documentation of the --profile-parallel-execution runtime option to describe the correct prerequisite compile time options. Added my name to the authors list. runtime/mercury_context.c: Corrected grammar and prose in comments in the MR_do_join_and_continue code.	2010-01-10 04:53:40 +00:00
Paul Bone	92afa23af5	Support for threadscope profiling of the parallel runtime. This change adds support for threadscope profiling of the parallel runtime in low level C grades. It can be enabled by compiling _all_ code with the MR_PROFILE_PARALLEL_EXECUTION_SUPPORT C macro defined. The runtime, libraries and applications must all have this flag defined as it alters the MercuryEngine and MR_Context structures. See Don Jones Jr, Simon Marlow, Satnam Singh - Parallel Performance Tuning for Haskell. This change also includes: Smarter thread pinning (the primordial thread is pinned to the thread that it is currently running on). The addition of callbacks from the Boehm GC to notify the runtime of stop the world garbage collections. Implement some userspace spin loops and conditions. These are cheaper than their POSIX equivalents, do not support sleeping, and are signal handler safe. boehm_gc/alloc.h: boehm_gc/alloc.c: Declare and define the new callback functions. boehm_gc/alloc.c: Call the start and stop collect callbacks when we start and stop a stop-the-world collection. Correct how we record the time spent collecting, it now includes collections that stop prematurely. boehm_gc/pthread_stop_world.c: Call the pause and resume thread callbacks in each thread where the GC arranges for that thread to be stopped during a stop-the-world collection. runtime/mercury_threadscope.c: runtime/mercury_threadscope.h: New files implementing the threadscope support. runtime/mercury_atomic_ops.c: runtime/mercury_atomic_ops.h: Rename MR_configure_profiling_timers to MR_do_cpu_feature_detection. Add a new function MR_read_cpu_tsc() to read the TSC register from the CPU, this simply abstracts the static MR_rdtsc function. runtime/mercury_atomic_ops.h: Modify the C inline assembler to ensure we tell the C compiler that the value in the register mapped to the 'old' parameter is also an output from the instructions. 
That is, the C compiler must not depend on the value of 'old' being the same before and after the instruction is executed. This has never been a problem in practice though. Implement some cheap userspace mutual exclusion locks and condition variables. These will be faster than pthread's mutexes when critical sections are short and threads are pinned to separate CPUs. runtime/mercury_context.c: runtime/mercury_context.h: Add a new function for pinning the primordial thread. If the OS supports sched_getcpu we use it to determine which CPU the primordial thread should use. No other thread will be pinned to this CPU. Add a numeric id field to each context, this id is uniquely assigned and identifies each context for threadscope. MR_schedule_context posts the 'context runnable' threadscope event. MR_do_runnext has been modified to destroy engines differently, it ensures they cleanup properly so that their threadscope events are flushed properly and then calls pthread_exit(0) MR_do_runnext posts events for threadscope. MR_do_join_and_continue posts events for threadscope. runtime/mercury_engine.h: Add new fields to the MercuryEngine structure including a buffer of threadscope events, a clock offset (used to synchronize the TSC clocks) and a unique identifier for the engine, runtime/mercury_engine.c: Call MR_threadscope_setup_engine() and MR_threadscope_finalize_engine for newly created and about-to-be-destroyed engines. When the main context finishes on a thread that's not the primordial thread post a 'context is yielding' message before re-scheduling the context on the primordial thread. runtime/mercury_thread.c: Added an XXX comment about a potential problem, it's only relevant for programs using thread.spawn. Added calls to the TSC synchronisation code used for threadscope profiling. It appears that this is not necessary on modern x86 machines, it has been commented out. Post a threadscope event when we create a new context. 
Don't call pthread_exit in MR_destroy_thread, we now do this in MR_do_runnext so that we can unlock the runqueue mutex after cleaning up. runtime/mercury_wrapper.c: Conform to changes in mercury_atomic_ops.[ch] Post an event immediately before calling main to mark the beginning of the program in the threadscope profile. Post a "context finished" event at the end of the program. Wait until all engines have exited before cleaning up global data, this is important for finishing writing the threadscope data file. configure.in: runtime/mercury_conf.h.in: Test for the sched_getcpu C function and utmpx.h header file, these are used for thread pinning. runtime/Mmakefile: Include the mercury_threadscope.[hc] files in the list of runtime headers and sources respectively.	2009-12-03 05:28:00 +00:00

29 Commits