2536 Commits

Author SHA1 Message Date
Peter Wang
f3e5419499 Wait for work-stealing engine threads to terminate with pthread_join.
Previously, we created _detached_ threads to run work-stealing engines.
The only reason for using detached threads instead of joinable threads
was because the code for thread creation was originally designed for
creating Mercury threads (the interface exported by the thread.m module
expects detached threads).

When the program is about to end, the main thread notifies the engines
to shut down, then waits on a semaphore that is incremented when an
engine is shut down. But an engine can only increment the semaphore
BEFORE its thread terminates. That is, while the semaphore indicates
that the engine has shut down (no longer responding), the thread that
it was running on may continue for an indeterminate amount of time
before it is terminated. The main thread may think that it is safe
to proceed, even while some of the engine threads are still running.

I found that on a Linux/glibc system, with a statically linked
binary, this setup could sometimes cause an "Aborted" error message
at program exit (after Mercury main/2). From backtraces, I believe the
problem is as described: the main thread is already in an exit() call
while engine threads are still performing their own cleanup, leading to
an abort() call.

The solution is to do what we should have done to begin with: run
work-stealing engines in non-detached threads, and call pthread_join()
to wait for engine threads to terminate before allowing the main thread
to continue with program termination.

runtime/mercury_context.c:
    Delete references to shutdown_ws_semaphore.

runtime/mercury_thread.c:
runtime/mercury_thread.h:
    Make MR_create_worksteal_thread create a non-detached thread.

runtime/mercury_wrapper.c:
    In mercury_runtime_init, record the IDs of the threads created for
    running work-stealing engines in an array.

    In mercury_runtime_terminate, after notifying each work-stealing
    engine to shut down, wait for the engine threads to terminate
    by calling pthread_join().
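
The shutdown sequence described above can be sketched roughly as follows.
This is a minimal illustration, not the Mercury runtime code: the names
NUM_ENGINES, engine_main, start_engines and stop_engines are hypothetical,
and the "engine" is just a loop that waits for a shutdown flag.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_ENGINES 4               /* hypothetical engine count */

atomic_int shutdown_requested;      /* set by the main thread */
atomic_int engines_exited;          /* incremented by each engine on exit */

/* Stand-in for a work-stealing engine: idle until told to shut down. */
void *engine_main(void *arg)
{
    (void) arg;
    while (!atomic_load(&shutdown_requested)) {
        sched_yield();              /* a real engine would look for work */
    }
    atomic_fetch_add(&engines_exited, 1);
    return NULL;
}

/*
 * Create the engine threads as joinable threads and record their IDs.
 * Returns the number of threads actually started.
 */
int start_engines(pthread_t ids[], int n)
{
    int started = 0;
    assert(ids != NULL);
    for (int i = 0; i < n; i++) {
        /* NULL attributes give the default, joinable (non-detached) state. */
        if (pthread_create(&ids[i], NULL, engine_main, NULL) != 0) {
            break;
        }
        started++;
    }
    return started;
}

/*
 * Notify the engines, then block until each thread has really terminated.
 * Returns the number of threads successfully joined.
 */
int stop_engines(pthread_t ids[], int n)
{
    int joined = 0;
    assert(ids != NULL);
    atomic_store(&shutdown_requested, 1);
    for (int i = 0; i < n; i++) {
        if (pthread_join(ids[i], NULL) == 0) {
            joined++;
        }
    }
    return joined;
}
```

Unlike a counter or semaphore bumped by the engine itself, pthread_join()
returns only once the thread has finished, so the caller can safely go on
to exit() afterwards.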

Sample backtrace:

    Thread 1 (Thread 0x7f6dcafb46c0 (LWP 11122) (Exiting)):
    #0  0x000000000093c40c in pthread_kill ()
    #1  0x00000000009219ce in raise ()
    #2  0x00000000004013b2 in abort ()
    #3  0x0000000000401bd4 in uw_init_context_1[cold] ()
    #4  0x00000000009cebda in _Unwind_ForcedUnwind ()
    #5  0x0000000000940044 in __pthread_unwind ()
    #6  0x000000000093b9e0 in pthread_exit ()
    #7  0x00000000009022ad in GC_pthread_exit ()
    #8  0x00000000008bd062 in action_shutdown_ws_engine ()
    #9  0x00000000008be0ea in scheduler_module_idle_sleep ()
    #10 0x00000000008ca58d in MR_call_engine ()
    #11 0x00000000008e5d05 in MR_init_thread_inner ()
    #12 0x00000000008e5e8c in MR_create_worksteal_thread_2 ()
    #13 0x00000000009048d9 in GC_inner_start_routine ()
    #14 0x00000000008f46fe in GC_call_with_stack_base ()
    #15 0x000000000093a85c in start_thread ()
    #16 0x000000000096875c in clone3 ()

    Thread 2 (Thread 0x1c533c0 (LWP 11092)):
    #0  0x000000000096656d in write ()
    #1  0x0000000000933f35 in _IO_new_file_write ()
    #2  0x00000000009321c1 in _IO_new_do_write ()
    #3  0x0000000000936906 in _IO_flush_all ()
    #4  0x0000000000936f2d in _IO_cleanup ()
    #5  0x000000000092236e in __run_exit_handlers ()
    #6  0x00000000009223be in exit ()
    #7  0x0000000000917d0f in __libc_start_call_main ()
    #8  0x0000000000919ed0 in __libc_start_main_impl ()
    #9  0x0000000000401d85 in _start ()

    Thread 3 (Thread 0x7f6dd97d16c0 (LWP 11093) (Exiting)):
    #0  0x0000000000941d19 in alloc_new_heap ()
    #1  0x00000000009421f2 in arena_get2.part ()
    #2  0x0000000000943fb9 in tcache_init.part ()
    #3  0x00000000009448c4 in malloc ()
    #4  0x00000000009d2141 in _Unwind_Find_FDE ()
    #5  0x00000000009cda7d in uw_frame_state_for ()
    #6  0x00000000009ce0d3 in uw_init_context_1 ()
    #7  0x00000000009cebda in _Unwind_ForcedUnwind ()
    #8  0x0000000000940044 in __pthread_unwind ()
    #9  0x000000000093b9e0 in pthread_exit ()
    #10 0x00000000009022ad in GC_pthread_exit ()
    #11 0x00000000008bd062 in action_shutdown_ws_engine ()
    #12 0x00000000008be0ea in scheduler_module_idle_sleep ()
    #13 0x00000000008ca58d in MR_call_engine ()
    #14 0x00000000008e5d05 in MR_init_thread_inner ()
    #15 0x00000000008e5e8c in MR_create_worksteal_thread_2 ()
    #16 0x00000000009048d9 in GC_inner_start_routine ()
    #17 0x00000000008f46fe in GC_call_with_stack_base ()
    #18 0x000000000093a85c in start_thread ()
    #19 0x000000000096875c in clone3 ()
2026-04-14 12:25:26 +10:00
Julien Fischer
15b54f5377 Use C99 conversion specifiers in the runtime.
C99 introduced additional conversion specifiers for use with the printf and
scanf family of functions. Specifically, the 'z' size specifier for size_t
values and the 't' size specifier for ptrdiff_t values. Historically, we have
not used them because the C99 support in a certain C compiler was lacking.
Since this is no longer the case, use them. This removes the need for a bunch
of casts, some of which were incorrect on 64-bit Windows.
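
As a small illustration of the two specifiers (the function name
format_size_and_offset is made up for this sketch and does not come from
the runtime):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a size_t and a ptrdiff_t without casting either value.
 * %zu takes a size_t and %td takes a ptrdiff_t, whatever their widths
 * are on the target platform.
 * Returns the value of snprintf(): the length of the formatted text.
 */
int format_size_and_offset(char *buf, size_t buflen,
    size_t size, ptrdiff_t offset)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "size=%zu offset=%td", size, offset);
}
```

The pre-C99 alternative is typically a cast to a fixed type such as
unsigned long, which is too narrow for size_t on 64-bit Windows.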

runtime/mercury_accurate_gc.c:
runtime/mercury_memory_zones.c:
runtime/mercury_stacks.c:
runtime/mercury_wrapper.c:
    As above.
2026-03-29 03:29:34 +11:00
Julien Fischer
0e7baba9d6 Delete an unused variable.
runtime/mercury_wrapper.c:
    As above.
2026-03-27 20:46:24 +11:00
Julien Fischer
0d9b0c970a Fix a runtime error message.
runtime/mercury_wrapper.c:
    Add a missing space to an error message.
2026-03-27 20:41:29 +11:00
Julien Fischer
7966692d87 Fix copy-and-paste error.
runtime/mercury_hash_table.c:
     As above.
2026-03-27 20:31:33 +11:00
Zoltan Somogyi
953ea7667f Fix deconstructing direct args.
The crash that this diff fixes occurred when giving a command such as
"print Var^1" to mdb, where the first argument of Var is a direct arg.

runtime/mercury_ml_expand_body.h:
    When deconstructing a term with a direct arg, return NULL
    as the value of expand_info->chosen_arg_word_sized_ptr.
    The crash occurred when we returned a non-null pointer,
    which violated the expectations of trace/mercury_trace_vars.c
    and its callers. (Not surprising, since that function and
    its callers were written long before the direct_arg optimization
    was added to the system.)

runtime/mercury_deconstruct.h:
    Document the rationale behind the above changes. (The contents of
    mercury_ml_expand_body.h are #included in mercury_deconstruct.c.)

trace/mercury_trace_vars.c:
    Add the debugging code I used to track down this issue, in disabled form.

    Fix missing copyright year.

trace/mercury_trace_browse.c:
    Delete obsolete comment.

    Fix missing copyright years.

tests/debugger/direct_arg_test.{m,inp,exp}:
    A test case for this bug.

tests/debugger/Mmakefile:
    Enable the new test case.

compiler/hlds_out_type_table.m:
    When dumping out the data constructors in the type table,
    if a constructor has names for some of its fields,
    put the name and the type of each field on different lines.
    In the original test case for this bug, of which direct_arg_test.m
    is an extreme simplification, pretty much every line overflows
    without this.

    Also, factor out some duplicated code, and replace bools with values
    of a bespoke type.
2026-03-13 14:50:15 +11:00
dallinjdahl
9820244337 fixing strrchr for C23 (#140)
Thank you.
2026-03-12 07:37:54 +11:00
Zoltan Somogyi
8e841bc63e Improve style, and fix some typos. 2026-03-09 09:13:52 +11:00
Zoltan Somogyi
821e03d8be Fix annoying unneeded recompilations.
runtime/Mmakefile:
    Some time ago, autoconf seems to have changed the text it puts
    into the files it generates, including mercury_conf.h.
    This broke the old code we had in this file that looked for
    the old text. Fix this by updating the pattern we look for.

    Document updates to autoconf as possible reasons for any future
    reoccurrence of this bug.

    Document the reason why we "standardize" the autoconfigured contents
    of mercury_conf.h.

    Fix programming style.

configure.ac:
    The changes in this file are only stylistic; they made it easier
    to track down the above bug.

    Put a comment about "order matters" *before* the lists whose order
    matters. Define the first of two lists being concatenated first
    (since order matters not just for code correctness, but also for
    reading comprehension :-().

    Fix misleading indentation.

    Add an XXX about some (seemingly) unneeded code.
2026-03-09 04:01:31 +11:00
Zoltan Somogyi
1bd50fb8a8 Add --boehm-write-size-map as a runtime option.
When specified (usually in the MERCURY_OPTIONS environment variable),
it tells the runtime to print how many words it allocates for each
request size.
2026-02-27 20:32:38 +11:00
Julien Fischer
45c0b183e4 Delete an ancient workaround for SunOS 4
configure.ac:
runtime/mercury_conf.h.in:
runtime/mercury_goto.h:
    Delete workaround for systems where the assembler did not
    support the .type directive.
2026-02-01 02:53:04 +11:00
Zoltan Somogyi
41b96317c1 Pass the CSD arg to profiling builtins explicitly.
runtime/mercury_deep_rec_depth_body.h:
    Instead of assuming that the code #including this file has a variable
    named CSD in scope, get the caller to pass that variable via a macro.

library/profiling_builtin.m:
    Pass that variable using that macro. Since the macro definition is
    not a comment, it won't get a singleton warning. Mark the CSD argument
    of the relevant predicates as not-a-singleton.
2026-01-29 00:11:37 +11:00
Julien Fischer
521d62ee12 Fix a typo.
runtime/mercury_memory_handler.c:
    As above.
2026-01-26 17:58:53 +11:00
Zoltan Somogyi
fe1d779b25 Speed up {int,uint}{,8,16,32,64}_to_string ...
... when targeting C.

library/string.m:
    On 27 Oct 2021, Julien sped up one of these operations
    by avoiding the use of sprintf. Apply the same technique,
    suitably generalized, to all other similar operations.
    Use macros to reduce the amount of code duplication needed.

runtime/mercury_conf.h.in:
    Fix an old bug: spell MR_MERCURY_IS_{32,64}_BITS correctly,
    to match the name that is (conditionally) defined by configure.

configure.ac:
    Delete references to MERCURY_IS_{32,64}_BITS, which turn out
    to be totally unused.
2025-10-03 19:23:37 +10:00
Zoltan Somogyi
cd8187ae2a Fix compilation of mmsc grades. 2025-08-25 15:48:28 +02:00
Zoltan Somogyi
32006a1c7c Rename and generalize .c_debug to .target_debug.
runtime/mercury_grade.h:
    Rename the grade modifier, and the C macro that represents it.

compiler/options.m:
    Rename the --c-debug-grade option to --target-debug-grade.

compiler/compute_grade.m:
    Rename the grade modifier, and the option that represents it.

    Restrict the .target_debug grade modifier to MLDS grades.

compiler/handle_options.m:
    Implement --target-debug-grade by having it imply --target-debug.

compiler/compile_target_code.m:
compiler/link_target_code.m:
    Pay attention either to --target-debug-grade (for purposes related
    to the grade itself) or to --target-debug (for all other purposes).

scripts/canonical_grade.in:
scripts/canonical_grade.sh-subr:
scripts/final_grade_options.sh-subr:
scripts/init_grade_options.sh-subr:
scripts/parse_grade_options.sh-subr:
    Parse target_debug grade modifiers and --target-debug-grade options
    instead of c_debug grade modifiers and --c-debug-grade options.

    Add (normally commented-out) infrastructure to make it easier
    to debug changes.

    Restrict the .target_debug grade modifier to MLDS grades.

scripts/mgnuc.in:
scripts/mgnuc_file_opts.sh-subr:
    Rename some variables to clarify the distinction between the
    --target-debug option (which, like -g, enables debugging of only one file)
    and the --target-debug-grade option (which enables it for the whole
    program).

configure.ac:
    Make it easier to debug grade-related changes by recording
    both autoconfigured and user-supplied grades that are rejected by
    the canonical_grade script.

    Conform to the changes above.

README.sanitizers:
doc/user_guide.texi:
grade_lib/grade_spec.m:
grade_lib/grade_string.m:
scripts/ml.in:
tests/warnings/help_text.err_exp:
tools/lmc.in:
tools/test_mercury:
    Conform to the changes above.

scripts/Mmake.vars.in:
    Add some XXXs about style.
2025-08-09 21:48:23 +02:00
Zoltan Somogyi
6c1f9a2d09 Get printing optdb-based help texts mostly working.
compiler/print_help.m:
    Switch to a coding scheme that allows us to add a prefix to the
    current optdb struct's help text *after* we generate that help text.
    Use this capability to mark private options as such.

    Do not wrap argument "names" that are not actually names.

compiler/options.m:
    Fix mismatches between option's default values and their help structures
    (such as value being maybe_string, implying an argument, but the
    help structure lacking an argument).

    Add some XXXs.

compiler/globals.m:
    Add some XXXs.

compiler/use_local_vars.m:
    Fix indentation.

compiler/handle_options.m:
runtime/mercury_region.h:
    Fix comments.
2025-06-19 15:32:14 +02:00
Zoltan Somogyi
8ecb023ce6 Document some things about conservative gc.
runtime/mercury_memory.c:
    Document why the implementation of the MR_GC_malloc family of functions
    does not really affect performance.

runtime/mercury_conf_param.h:
    Fix a comment.
2025-05-20 08:30:52 +10:00
Zoltan Somogyi
89fbafc053 Fix indentation. 2025-05-16 20:06:52 +10:00
Zoltan Somogyi
6fb4a322b9 Avoid redundant NULL check on Boehm alloc results.
runtime/mercury_memory.c:
    In the functions that call GC_MALLOC, GC_MALLOC_ATOMIC,
    GC_MALLOC_UNCOLLECTABLE or GC_REALLOC, do not check whether
    the pointers they return are NULL, since the oom (out-of-memory)
    handler we specify for Boehm gc will have already checked for NULL,
    and it will abort the program instead of returning NULL.

runtime/mercury_memory.h:
    Make some comments more readable.

runtime/mercury_wrapper.c:
    Document mercury_memory.c's reliance on the oom handler.

    Check for MR_BOEHM_GC explicitly, not via an inference.
2025-05-05 19:04:56 +10:00
Zoltan Somogyi
a816a5b571 Fix passive voice in a comment. 2025-05-03 17:30:04 +10:00
Peter Wang
02094aa6fe Handle aliased errno names with a secondary switch.
tools/generate_errno_name:
    Use a secondary switch to handle errno names that may be defined to
    the same value as other names.

runtime/mercury_errno_name.c:
    Regenerate this file.
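
The idea of a secondary switch can be sketched as below. This is a
hand-written illustration with only a few names; the function name
errno_name_sketch is hypothetical, and the output of
tools/generate_errno_name may well be organized differently.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Return the name of an errno value, or NULL if it is not known.
 * Two case labels with the same value in one switch are a compile error,
 * so names that may alias other names on some platform go into a second
 * switch, which is reached only if the primary switch did not match.
 */
const char *errno_name_sketch(int errnum)
{
    /* Primary switch: names assumed to have distinct values everywhere. */
    switch (errnum) {
        case EAGAIN:    return "EAGAIN";
        case EEXIST:    return "EEXIST";
        case ENOENT:    return "ENOENT";
        default:        break;
    }

    /*
     * Secondary switch: names that may be aliases of names above,
     * e.g. EWOULDBLOCK == EAGAIN on Linux, ENOTEMPTY == EEXIST on AIX.
     */
    switch (errnum) {
        case ENOTEMPTY:     return "ENOTEMPTY";
        case EWOULDBLOCK:   return "EWOULDBLOCK";
        default:            return NULL;
    }
}
```

When a name is an alias on the current platform, the primary switch wins
and the arm in the secondary switch is simply never reached.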
2025-04-30 13:08:12 +10:00
Peter Wang
67512fdb43 Handle ENOTEMPTY == EEXIST in MR_errno_name().
On AIX, both ENOTEMPTY and EEXIST are defined to the same value,
which caused the code generated for MR_errno_name() to fail to compile.

tools/generate_errno_name:
    Handle the case that ENOTEMPTY == EEXIST in the generated function.

    Likewise for other errno name aliases.

runtime/mercury_errno_name.c:
    Regenerate this file.
2025-04-30 11:20:21 +10:00
Peter Wang
cc01a3f67a Fix incorrect uses of strncat.
The size argument to strncat() gives the number of characters to copy
from the input string, *not* the size of the destination buffer.

runtime/mercury_deep_profiling.c:
    Add a safe string concatenation function, MR_safe_strcat.

    Use MR_safe_strcat instead of strncat.
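
For illustration, a concatenation function that takes the size of the
destination buffer might look like the sketch below. The name
bounded_strcat and its signature are assumptions made for this example;
they are not the definition of MR_safe_strcat.

```c
#include <assert.h>
#include <string.h>

/*
 * Append src to the NUL-terminated string in dest, never writing more
 * than dest_size bytes in total (including the terminating NUL).
 * Returns 0 if all of src was appended, and -1 if it was truncated
 * or if dest holds no terminator within dest_size bytes.
 */
int bounded_strcat(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL);

    /* Find the end of the existing string, looking only inside the buffer. */
    const char *end = memchr(dest, '\0', dest_size);
    if (end == NULL) {
        return -1;
    }

    size_t used = (size_t) (end - dest);
    size_t room = dest_size - used - 1;     /* space left before the NUL */
    size_t len = strlen(src);
    int truncated = len > room;

    if (truncated) {
        len = room;
    }
    memcpy(dest + used, src, len);
    dest[used + len] = '\0';
    return truncated ? -1 : 0;
}
```

By contrast, strncat(dest, src, sizeof(dest)) can overflow dest, because
the limit applies to the bytes taken from src, not to the space remaining.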
2025-04-28 12:58:40 +10:00
Zoltan Somogyi
0bf54e14ed Specify what happens at memory exhaustion.
runtime/mercury_memory.h:
    State explicitly that the C functions in this module all abort
    when they cannot allocate the memory they were asked to allocate.

    Put the documentation of some functions in the same order as their
    definitions.
2025-04-20 20:50:07 +10:00
Zoltan Somogyi
b4193413f4 Generate shorter code for filling trace slots.
compiler/trace_gen.m:
    Given that the standard trace slots are (by design) always the same,
    generate code that does not redundantly specify them.

runtime/mercury_trace_base.h:
    Add the three macros that trace_gen.m can now generate references to.
2025-03-29 05:27:40 +11:00
Zoltan Somogyi
a56fdd5b86 Delete some leftover relics of the .rt grade. 2025-03-28 17:35:32 +11:00
Zoltan Somogyi
1eb4f55834 Include file names in call site static structures.
Fix a problem that arises in the deep profiler if the program being profiled
was using both intermodule optimization and inlining. The issue was that
even though runtime/mercury_deep_profiling.c had access, for every call site,
to the full context of that call site, containing both the file name and
the line number, it wrote out *only* the line number. The deep profiler
then got the file name from the file name stored in the proc_static structure
of the procedure containing the call site.

This works very close to 100% of the time, because

- user-written programs just about never use ":- pragma source_file", and
- in the absence of such pragmas, all the goals in a procedure will be
  from the same file as the procedure's context.

However, if

- a call site calls a procedure in another module,
- the compiler has access to the code of that procedure from a .opt file, and
- the compiler decides to inline that call,

then the call, whose context is in the original source file, will be replaced
by the code of the procedure from the .opt file, whose context will NOT have
the same file name. Any description of this call site will list

- the code from the .opt file (such as the callee's callee),
- the file name from the original source file, and
- the line number from the .opt file.

This mismatch is very confusing, which is why this diff fixes it.

runtime/mercury_deep_profiling.c:
    Fix this by writing out the file name part, as well as the line number
    part, of each call site. The space impact is not as big as one might
    expect, because compiler/deep_profiling.m already had an optimization
    that set the filename part of each call site context to the empty string
    if it was identical to the context of the procedure. Therefore for all
    call sites that do NOT exhibit the bug that this diff fixes, the space
    cost is only a single NUL character in the profiling data file.

    Since this IS a change in the file format, bump the format version number
    from 8 to 9.

deep_profiler/profile.m:
deep_profiler/read_profile.m:
    Handle reading in both version 8 and version 9 profiling data files.

deep_profiler/create_report.m:
    When creating descriptions of call sites, use the call site's filename
    if it is not the empty string; if it is the empty string, then use
    the containing procedure's file name, as we have done all along.

deep_profiler/display_report.m:
deep_profiler/dump.m:
deep_profiler/report.m:
    Dump the new field in commands intended only for implementors.

deep_profiler/startup.m:
    Conform to the changes above.
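
The fallback rule used when describing a call site can be written down
in a few lines. The sketch is in C only for illustration (the deep
profiler itself is written in Mercury), and the function name is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pick the file name to report for a call site: the call site's own
 * file name if one was recorded, otherwise the file name of the
 * procedure containing the call site. An empty string means that
 * the call site's file name is the same as the procedure's.
 */
const char *call_site_file_name(const char *site_file, const char *proc_file)
{
    assert(site_file != NULL && proc_file != NULL);
    return (site_file[0] != '\0') ? site_file : proc_file;
}
```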
2025-03-20 20:00:15 +11:00
Julien Fischer
e1c7cd2fee Fix typo.
runtime/mercury_engine.h:
    As above.
2024-10-14 00:04:09 +11:00
Zoltan Somogyi
9fd51ae7af Add --deep-std-name option to control deep prof file names.
When this new runtime option is specified, the runtime system will use
Deep.{data,procrep} as the names of the files it writes out.

runtime/mercury_engine.h:
    Add a flag to the engine that records whether this option has been
    specified or not.

runtime/mercury_wrapper.c:
    Set the flag if/when we see the --deep-std-name option.

runtime/mercury_deep_profiling.c:
    If the new flag is set, use Deep.{data,procrep} as filenames.

doc/user_guide.texi:
    Document the new option.

tools/bootcheck:
    Specify the new option for bootchecks.

tests/debugger/Mmakefile:
tests/declarative_debugger/Mmakefile:
tests/hard_coded/Mmakefile:
tests/par_conj/Mmakefile:
tests/stm/Mmakefile:
    When specifying a value of MERCURY_OPTIONS that overrides the value
    set by tools/bootcheck, include --deep-std-name in that value.
2024-10-13 19:47:06 +11:00
Julien Fischer
15bd6da750 Fix XXX WINDOWS in deep profiling runtime.
runtime/mercury_deep_profiling.c:
    Use the wide character versions of some file operations on
    Windows.
2024-10-06 16:07:59 +11:00
Zoltan Somogyi
74559b7f4f Use better names than Deep.data for profiling data.
runtime/mercury_deep_profiling.c:
    Replace the "Deep" in "Deep.data" and "Deep.procrep" with

    - the name of the executable program, and
    - the date and time of the profiling run,

    yielding names such as

    mercury_compile_on_2024-10-05_at_08-43-34.data
    mercury_compile_on_2024-10-05_at_08-43-34.procrep

doc/user_guide.texi:
    Document the new filenames.

NEWS.md:
    Announce the change.
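
A name of this shape could be built as in the sketch below, which
assumes strftime and snprintf are used; the function name
make_profile_name is made up, and the runtime's own code may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/*
 * Build "<prog>_on_<YYYY-MM-DD>_at_<HH-MM-SS>.<ext>" in buf.
 * Returns the length of the name, or -1 if it does not fit.
 */
int make_profile_name(char *buf, size_t buflen, const char *prog,
    const struct tm *when, const char *ext)
{
    char stamp[64];
    int len;

    assert(buf != NULL && prog != NULL && when != NULL && ext != NULL);

    /* Use '-' rather than ':' in the time so the name is valid everywhere. */
    if (strftime(stamp, sizeof stamp, "%Y-%m-%d_at_%H-%M-%S", when) == 0) {
        return -1;
    }
    len = snprintf(buf, buflen, "%s_on_%s.%s", prog, stamp, ext);
    if (len < 0 || (size_t) len >= buflen) {
        return -1;
    }
    return len;
}
```

With a struct tm obtained from localtime(), calling this once with "data"
and once with "procrep" gives the two file names for one profiling run.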
2024-10-05 08:53:50 +02:00
Zoltan Somogyi
958ba814bc Show cliques in mdb stack traces less obtrusively.
like this:

mdb> stack
   0         pred mutrec.q1/3-0 (det) (mutrec.m:133)
   1 ┌    2* pred mutrec.p2/3-0 (det) (mutrec.m:106 and others)
   3 │       pred mutrec.p3/3-0 (det) (mutrec.m:122)
   4 │    2* pred mutrec.p2/3-0 (det) (mutrec.m:102 and others)
   6 └       pred mutrec.p3/3-0 (det) (mutrec.m:122)
   7      2* pred mutrec.p1/3-0 (det) (mutrec.m:82 and others)
   9         pred mutrec.test/2-0 (det) (mutrec.m:42)
  10         pred mutrec.main/2-0 (det) (mutrec.m:33)

This implements a suggestion by Peter Wang from *April 2012*.

runtime/mercury_stack_trace.h:
    Add a new field to the data structure that represents each line
    in a stack trace. This field represents the string, if any,
    that implements the boxes around the calls in the clique.
    This will be two characters, a box character and a space,
    e.g. for the call at level 6, it will be "└ ".

runtime/mercury_stack_trace.c:
    Use the new capability to add boxes to calls in cliques,
    and equal-length padding to the calls that are not in cliques
    but which share stack traces with calls that are.

    Initialize some variables closer to their first use.
    Group some related statements together. Add some comments.
    Delete some already-acted-upon TODOs.

tests/debugger/mutrec.exp:
tests/debugger/mutrec_higher_order.exp:
    Expect the updated clique markers.

NEWS.md:
doc/user_guide.texi:
    Document the change.
2024-10-02 10:38:02 +02:00
Peter Wang
dffa94f8fd Don't disable Boehm GC during startup.
Don't call GC_disable() before GC_INIT(), which we did to avoid GC
during early runtime initialisation. This once had a small benefit,
but will not work after we upgrade to Boehm GC with commit 52a538ff
"Fix handling of GC_gc_no counter wrap in GC_clear_stack".

runtime/mercury_wrapper.c:
    As above.
2024-09-24 16:06:27 +10:00
Zoltan Somogyi
c148ce54fe Update style in a bunch of Mmakefiles.
Mmake.common.in:
bindist/Mmakefile:
compiler/Mmakefile:
compiler/notes/Mmakefile:
doc/Mmakefile:
extras/align_right/Mmakefile:
extras/base64/Mmakefile:
extras/dynamic_linking/Mmakefile:
extras/error/Mmakefile:
extras/fixed/Mmakefile:
extras/graphics/samples/gears/Mmakefile.MacOSX:
extras/graphics/samples/maze/Mmakefile.MacOSX:
extras/lex/Mmakefile:
extras/monte/Mmakefile:
extras/posix/Mmakefile:
extras/references/Mmakefile:
extras/references/samples/Mmakefile:
extras/split_file/Mmakefile:
library/Mmakefile:
mdbcomp/Mmakefile:
runtime/Mmakefile:
scripts/Mmakefile:
ssdb/Mmakefile:
tests/Mmake.common:
tests/mmc_make/Mmakefile:
trace/Mmakefile:
util/Mmakefile:
    Invoke the sh builtin "test" as "test", not as "[".

    Make some target names more descriptive.

    Fix indentation.
2024-09-17 11:09:18 +02:00
Peter Wang
a6606497e4 Use clang compiler intrinsics for more atomic ops.
runtime/mercury_atomic_ops.h:
    Use compiler intrinsic for MR_ATOMIC_ADD_AND_FETCH_WORD_BODY
    when compiling with clang, instead of falling back to the
    compare and swap implementation.

    Use compiler intrinsic for MR_ATOMIC_SUB_INT_BODY
    when compiling with clang but not targeting x86/x86-64.
    In those cases, MR_atomic_sub_int() was not being defined,
    and causing a link error.

    Use compiler intrinsic for MR_ATOMIC_DEC_AND_IS_ZERO_WORD_BODY
    when compiling with clang but not targeting x86/x86-64.
2024-09-02 10:27:50 +10:00
Peter Wang
cbd8bf182e Use (void) for functions taking no arguments.
runtime/mercury_deep_profiling.c:
runtime/mercury_memory_zones.c:
runtime/mercury_trace_base.c:
    As above. clang -Wstrict-prototypes warns about functions with a
    "()" argument list, even if there was a prototype declared with
    "(void)".
2024-08-29 11:46:44 +10:00
Zoltan Somogyi
4ec5c99312 Handle \e in runtime/mercury_trace_base.c.
runtime/mercury_trace_base.c:
    Add code to print escape characters as \e.

    Reorder the existing escape sequences.

    Fix a bug: a CR was printed as \b, not as \r.

library/mercury_term_lexer.m:
    Add the two affected functions in mercury_trace_base.c
    to the list of places that handle escape sequences.
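
A mapping from characters to escape sequences, in the spirit of this
change, might look like the following sketch. The function name is
hypothetical, and the set of characters is an assumption rather than a
copy of what mercury_trace_base.c handles.

```c
#include <stddef.h>

/*
 * Return the escape sequence used to print c, or NULL if c can be
 * printed as is. The cases are kept in alphabetical order of the
 * escape letter, which makes a wrong letter easier to spot.
 */
const char *escape_sequence_for(char c)
{
    switch (c) {
        case '\a':      return "\\a";
        case '\b':      return "\\b";
        case '\033':    return "\\e";   /* ESC; '\e' is not standard C */
        case '\f':      return "\\f";
        case '\n':      return "\\n";
        case '\r':      return "\\r";   /* carriage return: \r, not \b */
        case '\t':      return "\\t";
        case '\v':      return "\\v";
        case '\\':      return "\\\\";
        default:        return NULL;
    }
}
```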
2024-04-28 23:39:56 +10:00
Zoltan Somogyi
f4c2b62176 Add support for \e as the escape char, stage 1.
library/mercury_term_lexer.m:
    Convert any occurrences of the \e escape sequence to the escape character.

    List all the places in the library, compiler and runtime that also handle
    escape sequences, some of which handle all these sequences,
    and some of which handle only subsets.

    Sort the letters in recognized escape sequences.

compiler/parse_tree_out_pragma.m:
library/rtti_implementation.m:
library/term_io.m:
    Add comments to all the other places that handle escape sequences
    that direct readers to mercury_term_lexer.m as containing the master list
    of such sequences.

    Add commented-out code that, after stage 1 has been installed,
    stage 2 should enable.

runtime/mercury_ml_expand_body.h:
runtime/mercury_string.c:
    Turn escape characters back into their escape sequence form
    for characters and strings.

tests/valid_seq/char_escape_opt_helper_1.m:
    Test whether the compiler accepts \e as an escape sequence.

compiler/options.m:
    Add a mechanism for detecting the presence of this diff in the
    installed compiler.
2024-04-28 16:49:18 +10:00
Peter Wang
5e517abcad Fix comment.
runtime/mercury_string.h:
    MR_utf8_get_next may advance pos past the NUL terminator of a
    string.
2024-04-02 12:44:41 +11:00
Zoltan Somogyi
fd8c495e4e Fix trie string switches, speed them up, ...
... and turn them back on.

compiler/c_util.m:
    Provide a version of output_quoted_string_c which accepts strings
    that may not be well formed, and prints out any code_units that
    do not belong to a well-formed UTF-8 code point using an octal
    escape sequence.

    Ask for some predicates with trivial bodies to be inlined.

compiler/llds_out_data.m:
    Use the new predicate in c_util to output string constants.
    This fixes the bug that could cause trie string switches to fail
    in the presence of strings containing multi-byte code points.
    In such cases, given the string constants that trie switches can generate,
    which contain only part of such a multi-byte code unit sequence,
    llds_out_data.m used to write out a C string constant that contained
    the unicode replacement character instead of those code units.

compiler/string_switch.m:
    If a trie node has four or more code units that can lead to matches, then
    use binary search instead of linear search to find out what to do next.

    To make this possible, separate the action of generating such search code
    from the action of finding out what to do for each possible code unit
    value.

    Generate comments that can be helpful in tracking down bugs such as
    the one described above.

compiler/switch_gen.m:
    Allow trie string switches once again.

runtime/mercury_ml_expand_body.h:
    Add an XXX for new behavior that I found the need for while
    debugging this change.

tests/hard_coded/string_switch*.{m,exp}:
    Add some new alternatives to each switch to create nodes
    at which the LLDS code generator will now use binary search.
    Add a query string to test one of these alternatives.
    (All four test cases are identical, apart from their module names.)

    Expect the changes for the existing keys caused by the new switch arms,
    as well as the extra outputs for the new query string.
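
The choice between linear and binary search at a trie node can be
illustrated with a runtime analogue. The compiler generates the
comparisons as code rather than searching an array, so this sketch,
with its hypothetical function find_code_unit, shows only the strategy.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Find key among the n code units in units, which must be sorted in
 * ascending order. Returns its index, or -1 if it is not present.
 * With fewer than four candidates a linear scan is used; with four
 * or more, binary search.
 */
int find_code_unit(const unsigned char *units, size_t n, unsigned char key)
{
    assert(units != NULL || n == 0);

    if (n < 4) {
        for (size_t i = 0; i < n; i++) {
            if (units[i] == key) {
                return (int) i;
            }
        }
        return -1;
    }

    size_t lo = 0;
    size_t hi = n;                  /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (units[mid] == key) {
            return (int) mid;
        } else if (units[mid] < key) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return -1;
}
```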
2024-03-31 03:27:09 +11:00
Julien Fischer
9f966be807 Fix some comments.
runtime/mercury_string.h:
    As above.
2024-03-31 02:41:33 +11:00
Zoltan Somogyi
20a4c81dbd Clarify what functions do for non-well-formed strings.
runtime/mercury_string.h:
    Expand the comments on the functions that work with UTF-8 strings
    by describing their behavior in the presence of non-well-formed
    code unit sequences in more detail.

runtime/mercury_string.c:
    Improve the style of the code, and the text of some comments.
2024-03-30 10:00:48 +11:00
Zoltan Somogyi
9dbee8bdb4 Implement trie string switches for the LLDS backend.
For now, the implementation covers only non-lookup switches.

compiler/builtin_ops.m:
    Generalize the existing offset_str_eq binary op by adding an optional
    size parameter, which, if present, restricts the equality test to look at
    the given number of code units at most.

compiler/llds_out_data.m:
compiler/mlds_to_c_data.m:
    Generalize the output of binop rvals whose operation is offset_str_eq.
    In llds_out_data.m, fix a bug in the original code. (This bug did not
    lead to problems because before this diff, we never generated this op.)

compiler/string_switch_util.m:
    Add a predicate that recognizes when a trie node that is NOT a leaf
    nevertheless represents the top of a stick, which means that it has
    only one possible next code unit, which itself may have only one
    possible next code unit, and so on, until we reach a node that *does*
    have two or more next code units. (One of those may be the code unit
    of the string-ending NUL character.)

compiler/ml_string_switch.m:
    Use the new predicate in string_switch_util.m to generate better code
    for sticks. Instead of comparing each character in the stick individually
    against the relevant code unit of the string being switched on, compare
    them all at once using the new binary op.

compiler/ml_switch_gen.m:
    Insist on both the host machine and the target machine
    using the C backend.

compiler/string_switch.m:
    Implement non-lookup trie switches. The code follows the approach used
    in ml_string_switch.m as much as possible, but there are plenty of
    differences caused by targeting the LLDS.

    Rename some predicates to specify which switch implementation method
    they belong to.

    Write a comment just once, and refer to it from elsewhere instead of
    duplicating it at each reference site.

compiler/switch_gen.m:
    Enable the use of trie switches when the option values call for it,
    and when the switch is not a lookup switch.

compiler/cse_detection.m:
    Do not flood the output of mmc -V with messages that have nothing to do
    with the module being compiled.

compiler/options.m:
    Add a way to specify --no-allow-inlining on the command line.
    This can help debug code generator changes like this, by disallowing
    a transform that can modify the Mercury code whose compilation process
    you are trying to debug. (The documentation of the --inlining option
    implies that --no-inlining should do the same job, but it does not.)
    The option is not documented for users.

compiler/string_encoding.m:
    Provide a version of from_code_unit_list_in_encoding that allows
    non-well-formed code unit sequences as input, and provide det versions
    of both versions. This is for use by both string_switch.m and
    ml_string_switch.m.

compiler/hlds_goal.m:
    Document the properties of case_ids.

compiler/llds.m:
    Document the possibility that string constants are not well formed.

compiler/bytecode.m:
compiler/code_util.m:
compiler/mlds_dump.m:
compiler/ml_global_data.m:
compiler/mlds_to_cs_data.m:
compiler/mlds_to_java_data.m:
compiler/opt_debug.m:
    Conform to the changes above.

library/string.m:
    Replace the non-exported test predicate internal_encoding_is_utf8 with
    an exported function that returns an enum specifying the string encoding.

NEWS.md:
    Announce the new function.

runtime/mercury_string.h:
    Add the C macro that implements the new form of the offset_str_eq
    binary op.

tests/hard_coded/string_switch4.{m,exp}:
    We have long had three copies of the exact same code, in string_switch.m,
    string_switch2.m and string_switch3.m, which were compiled with

    - no smart switch implementation
    - smart switch implementation forced to use the hash table method
    - smart switch implementation forced to use binary search method

    Add this new copy, which is compiled with

    - smart switch implementation forced to use the new trie method

tests/hard_coded/Mmakefile:
    Add the new test case.

tests/hard_coded/Mercury.options:
    Update the options of the test cases, and specify them for the new one.

tests/hard_coded/string_switch.m:
tests/hard_coded/string_switch2.m:
tests/hard_coded/string_switch3.m:
    Update the top-of-module comment block to be identical in all four copies
    of this module.
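
The size-limited form of the offset_str_eq operation described above
amounts to a bounded comparison starting at an offset. The sketch below
assumes it can be expressed with strncmp; the function name
offset_str_eq_n is hypothetical and is not the macro added to
mercury_string.h.

```c
#include <assert.h>
#include <string.h>

/*
 * Test whether the n code units of str starting at offset are equal
 * to the first n code units of stick. The caller must ensure that
 * offset is no greater than the length of str. Because strncmp stops
 * at a NUL, a str that ends early simply compares unequal.
 */
int offset_str_eq_n(const char *str, size_t offset, const char *stick,
    size_t n)
{
    assert(str != NULL && stick != NULL);
    return strncmp(str + offset, stick, n) == 0;
}
```

This lets all the code units of a stick be compared with one call
instead of one test per trie node.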
2024-03-26 21:17:31 +11:00
Zoltan Somogyi
9828700994 Check for surrogate chars in conversions to utf8.
library/char.m:
    Export a new predicate, char_int_is_surrogate, that duplicates
    the job of the MR_is_surrogate macro in the runtime, for use by
    the new check in string.m. Add a comment about the code duplication.
    The new predicate is not documented for users.

runtime/mercury_string.h:
    Add a comment about the code duplication.

library/string.m:
    Use the new predicate in char.m to check for surrogates when converting
    a code unit list to a utf8 string.
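
The kind of check involved can be sketched in C as follows. The function
names are hypothetical; the runtime's own test is the MR_is_surrogate
macro, whose definition is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Surrogates are the code points from U+D800 to U+DFFF inclusive. */
int is_surrogate_code_point(unsigned long c)
{
    return c >= 0xD800UL && c <= 0xDFFFUL;
}

/*
 * Encode code point c as UTF-8 into out. Returns the number of bytes
 * written, or 0 if c is a surrogate or is beyond U+10FFFF, in which
 * case it has no well-formed UTF-8 encoding.
 */
size_t encode_utf8_checked(unsigned long c, unsigned char out[4])
{
    assert(out != NULL);
    if (is_surrogate_code_point(c) || c > 0x10FFFFUL) {
        return 0;
    }
    if (c < 0x80UL) {
        out[0] = (unsigned char) c;
        return 1;
    }
    if (c < 0x800UL) {
        out[0] = (unsigned char) (0xC0 | (c >> 6));
        out[1] = (unsigned char) (0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000UL) {
        out[0] = (unsigned char) (0xE0 | (c >> 12));
        out[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (c & 0x3F));
        return 3;
    }
    out[0] = (unsigned char) (0xF0 | (c >> 18));
    out[1] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
    out[2] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
    out[3] = (unsigned char) (0x80 | (c & 0x3F));
    return 4;
}
```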
2024-03-26 18:36:25 +11:00
Julien Fischer
f5e71b1e90 Fix copyright notices in recently modified files.
compiler/*.m:
library/*.m:
mdbcomp/*.m:
runtime/*.[ch]:
    As above.

    Fix spelling in some spots.
2024-02-20 15:09:17 +11:00
Zoltan Somogyi
3f30497acb Add and use full_memory_stats_are_available.
library/benchmarking.m:
    Add this new predicate.

runtime/mercury_report_stats.[ch]:
    Implement this predicate for C. The implementation is next to
    the implementation of report_full_memory_stats.

NEWS.md:
    Announce the new predicate.

compiler/mercury_compile_main.m:
    Use the new predicate to avoid nuisance output from
    report_full_memory_stats in grades where those stats are not available.
2024-01-31 06:32:16 +11:00
Julien Fischer
3c82149d00 Fix a typo.
runtime/mercury_context.c:
    As above.
2024-01-26 15:52:35 +11:00
Julien Fischer
f256369bbb Attempt to preserve file permissions when copying.
Make the Mercury implementation of copying attempt to preserve file permissions
if we are on a system that supports stat() and chmod().

compiler/copy_util.m:
    As above.

configure.ac:
runtime/mercury_conf.h.in:
    Check for the presence of the chmod() function.
2024-01-07 17:09:05 +11:00
Peter Wang
e797cd338f Update error messages for renamed functions.
runtime/mercury_string.c:
    Update error messages in MR_utf8_to_wide and MR_wide_to_utf8.
2023-12-20 15:32:02 +11:00