Commit Graph

60 Commits

Author SHA1 Message Date
Zoltan Somogyi
2a12732260 Improve well-formedness checks on UTF strings.
library/io.m:
    Add the predicates

        read_named_file_as_string_wf
        read_named_file_as_lines_wf
        read_file_as_string_wf
        read_file_as_string_and_num_code_units_wf

    which extend their base predicates (without the _wf suffix) by checking
    whether the string read from the file is well formed, and returning
    an error if it is not.

library/io.stream_op.m:
    Fix and expand a comment.

library/io.text_read.m:
    Add a comment.

library/string.m:
    Add check_well_formedness, a predicate that checks whether a string
    is well formed, and (if relevant) specifies the offset of the first
    non-well-formed character.

    Make the documentation of index_next and index_next_repl more detailed.

runtime/mercury_string.[ch]:
    Define MR_utf8_find_ill_formed_char, a version of MR_utf8_verify
    that returns the offset of the first ill-formed UTF-8 char in the
    given string, if there is one. This is used in the implementation of
    check_well_formedness for C.

NEWS.md:
    Announce the new library predicates.

    Sort some lists of pred names.
2023-04-21 15:55:23 +10:00
Zoltan Somogyi
4f9eb3e450 Update programming style. 2023-04-16 15:27:22 +10:00
Adrian Wong
5e827201f3 Fix a bug where valid code points were treated like invalid surrogates.
runtime/mercury_string.h:
    As above. Valid code points in the ranges [1D800, 1DFFF] and
    [10D800, 10DFFF] were treated like invalid surrogates because the
    high bits were masked away.

tests/hard_coded/Mmakefile:
tests/hard_coded/char_not_surrogate.exp:
tests/hard_coded/char_not_surrogate.m:
tests/hard_coded/string_not_surrogate.exp:
tests/hard_coded/string_not_surrogate.m:
    Add test cases.
2021-11-04 00:46:05 +11:00
Peter Wang
025bee0549 Check for surrogates when converting list of char to string.
library/string.m:
    Make from_char_list, from_rev_char_list, to_char_list throw an
    exception if the list of chars includes a surrogate code point that
    cannot be encoded in a UTF-8 string.

    Make semidet_from_char_list, semidet_from_rev_char_list,
    to_char_list fail if the list of chars includes a surrogate code
    point that cannot be encoded in a UTF-8 string.

runtime/mercury_string.h:
    Document return value of MR_utf8_width.

tests/hard_coded/Mmakefile:
tests/hard_coded/string_from_char_list_ilseq.exp:
tests/hard_coded/string_from_char_list_ilseq.exp2:
tests/hard_coded/string_from_char_list_ilseq.m:
    Add test case.

tests/hard_coded/null_char.exp:
    Expect new message in exceptions thrown by from_char_list,
    from_rev_char_list.

tests/hard_coded/string_hash.m:
    Don't generate surrogate code points in random strings.
2019-10-30 11:21:02 +11:00
Peter Wang
c08db9e5f7 Make MR_utf8_get return error for more ill-formed sequences.
runtime/mercury_string.c:
    Make MR_utf8_get, MR_utf8_get_mb return an error for code unit
    sequences that encode a surrogate code point, or a code point
    greater than U+10FFFF.

runtime/mercury_string.h:
    Adjust some comments.
2019-09-13 15:51:02 +10:00
Peter Wang
ed1b3fe19c Note some future work for handling ill-formed sequences in strings.
library/io.m:
library/string.m:
    Add warnings that Mercury strings are not limited to valid UTF-8 or
    UTF-16.

library/string.m
library/string.format.m
library/term_io.m
    Note some current behaviours of predicates/functions when passed
    strings containing invalid UTF-8 or UTF-16, and what to do about it.

runtime/mercury_string.h:
    Note behaviour of MR_utf8_encode.
2019-09-09 10:47:48 +10:00
Adrian Wong
a589fd1164 Fix some comments related to hashing.
library/hash_table.m:
library/string.m:
runtime/mercury_string.h:
    As above.
2019-07-29 11:35:53 +10:00
Peter Wang
2be2e7468c Do not use _snprintf functions directly in place of snprintf functions.
The Windows _snprintf family of functions do not guarantee null
termination when the output is truncated so cannot be used as direct
replacements for the snprintf functions. Also, the _snprintf functions
have different return values from the C99 snprintf functions when output
is truncated (like some older snprintf implementations).

Furthermore, on Windows snprintf/vsnprintf may be synonyms for
_snprintf/_vsnprintf so cannot be relied upon to terminate their outputs
either, even if the functions exist.

runtime/mercury_string.c:
runtime/mercury_string.h:
    Define MR_snprintf and MR_vsnprintf as macro synonyms for
    snprintf/vsnprintf ONLY if _snprintf/_vsnprintf do not exist.

    Otherwise, implement MR_snprintf and MR_vsnprintf functions
    that behave like the C99 functions, in terms of _vsnprintf.

    Require that either snprintf/vsnprintf or _snprintf/_vsnprintf
    are available. This should be true on all systems still in use.

runtime/mercury_debug.c:
runtime/mercury_ml_expand_body.h:
runtime/mercury_runtime_util.c:
runtime/mercury_stack_layout.c:
runtime/mercury_stack_trace.c:
runtime/mercury_stacks.c:
runtime/mercury_tabling.c:
runtime/mercury_threadscope.c:
runtime/mercury_trace_base.c:
runtime/mercury_wrapper.c:
trace/mercury_trace_completion.c:
trace/mercury_trace_internal.c:
trace/mercury_trace_spy.c:
trace/mercury_trace_vars.c:
bytecode/mb_disasm.c:
    Use MR_snprintf instead of snprintf/_snprintf
    and MR_vsnprintf instead of vsnprintf/_vsnprintf.

    Drop code paths using sprintf as a fallback.
2018-07-23 10:26:29 +10:00
Julien Fischer
4455d3601e Escape characters in strings returned by deconstruct.functor/4.
runtime/mercury_string.[ch]:
    Add a function that returns a copy of a string inside double
    quotes and with escapes for any control characters inserted.

    Add macro that tests if a character is a control character.

runtime/mercury_ml_expand_body.h:
    Use the above function to escape characters in the representation
    of string functors.

library/rtti_implementation.m:
    Do the same for the C# and Java backends.

library/term_io.m:
    Update a comment.

NEWS:
    Announce the above change to deconstruct.functor/4, as well
    the earlier change to its handling of characters.

tests/hard_coded/deconstruct_arg.m:
tests/hard_coded/deconstruct_arg.exp:
    Extend this test case to cover strings.
2018-07-11 01:42:33 +10:00
Mark Brown
d465fa53cb Update the COPYING.LIB file and references to it.
Discussion of these changes can be found on the Mercury developers
mailing list archives from June 2018.

COPYING.LIB:
    Add a special linking exception to the LGPL.

*:
    Update references to COPYING.LIB.

    Clean up some minor errors that have accumulated in copyright
    messages.
2018-06-09 17:43:12 +10:00
Zoltan Somogyi
53b573692a Convert C code to use // style comments.
runtime/*.[ch]:
trace/*.[chyl]:
    As above. In some places, improve comments, e.g. by expanding contractions
    such as "we've". Add #ifndef guards against double inclusion around
    the trace/*.h files that did not already have them.

tools/*:
    Make the corresponding changes in shell scripts that generate .[ch] files
    in the runtime.

tests/*:
    Conform to a slight change in the text of a message.
2016-07-14 13:57:35 +02:00
Zoltan Somogyi
67326f16e4 Fix style issues in the runtime.
Move all .h and .c files to four-space indentation without tabs,
if they weren't there already.

Use the same vim line for all .h and .c files.

Align all backslashes at the ends of lines in macro definitions.
Align close comment signs.

In some places, fix inconsistent indentation.

Fix a bunch of comments. Add XXXs to a few of them.
2016-07-09 12:14:00 +02:00
Zoltan Somogyi
7c8eeadabf Fix comments. 2015-12-17 00:11:18 +11:00
Zoltan Somogyi
d041b83943 Implement string switches via tries for the MLDS backend.
The code we emit to decide which arm of the switch is selected looks like this:

    case_num = -1;
    switch (MR_nth_code_unit(switchvar, 0)) {
        case '98':
            switch (MR_nth_code_unit(switchvar, 1)) {
                case '99':
                    if (MR_offset_streq(2, switchvar, "abc"))
                        case_num = 0;
                    break;
                case '100':
                    if (MR_offset_streq(2, switchvar, "aceg"))
                        case_num = 1;
                    break;
            }
            break;
        case '99':
            if (MR_offset_streq(2, switchvar, "bbb"))
                case_num = 2;
            break;
    }

The part that acts on this will look like this for lookup switches:

    if (case_num < 0)
        succeeded = MR_FALSE;
    else {
        outvar1 = vector_common[case_num].f1;
        ...
        outvarn = vector_common[case_num].fn;
        succeeded = MR_TRUE;
    }

and like this for non-lookup switches:

    switch (case_num) {
    case 0:
        <code for case 0>
        break;
    ...
    case n:
        <code for case n>
        break;
    default:                    /* if the switch is can_fail */
        <code for failure>
        break;
    }

compiler/ml_string_switch.m:
    Implement both non-lookup and lookup string switches via tries,
    along the lines shown above.

compiler/ml_switch_gen.m:
    Invoke the predicates that implement string switches via tries
    in the circumstances in which option values call for them.

    For now, we generate tries only for the C backend. Once the
    problems identified for mlds_to_{cs,java,managed} below are fixed,
    we can enable them on those backends as well.

compiler/options.m:
doc/user_guide.texi:
    Add an option that governs the minimum size of trie switches.

compiler/ml_lookup_switch.m:
    Factor out the code common to the implementation of all model-non
    lookup switches, both in ml_lookup_switch.m and ml_string_switch.m,
    and put it all into a new exported predicate.

    The previously existing MLDS implementation methods for lookup switches
    all build their lookup tables from maps that maps each cons_id
    in the switch cases to the values of the output arguments of those cases.
    For switch cases that apply to more than one cons_id, this map had
    one entry for each of those cons_ids. For tries, we need a map
    from *case ids*, not *cons ids* to the outputs. Since it is
    easier to convert the one-to-one case_id->outputs map to the
    many-to-one cons_id->outputs map than vice versa, change the
    main data structure from which lookup tables are built to store data
    in a case_id->outputs format, and provide predicates for its conversion
    to the other (previously the only) format.

    Rename ml_gen_lookup_switch to ml_gen_atomic_lookup_swith to distinguish
    it from other predicates that also generate (other kinds of) lookup
    switches.

compiler/switch_util.m:
    Have the types representating lookup tables represent their contents
    as a map, not as the assoc list derived from the map. Previously,
    we didn't do anything with the map other than flatten it to the assoc list,
    but for the MLDS backend, we may now also need to convert it to another
    form of map (see immediately above).

compiler/builtin_ops.m:
    Add two new builtin ops. The first, string_unsafe_index_code_unit,
    returns the nth code unit in a string; the second, offset_str_eq,
    does a string equality test on the nth and later code units of
    two strings. They are used in the implementation of tries.

compiler/c_util.m:
    Add a new binop category for each new binop, since they are not like
    existing binops.

    Put some existing binops into their own categories as well, since
    bundling them with the other ops they were bundled with seems like
    a bad idea.

compiler/hlds_goal.m:
    Make the identifier of switch arms in tagged_cases a separate type
    from int.

compiler/mlds_to_c.m:
compiler/llds_out_data.m:
    Handle the new kinds of binops.

    When writing out binop expressions, we used to do a switch on the binop
    to get its category, and then another switch on the category. We now
    switch on the binop directory, since this much harder to write out
    code using new binops badly, and should be faster to boot.

    In mlds_to_c.m, also make some cosmetic changes to the output to make it
    easier to read, and thus to debug.

compiler/mlds_to_il.m:
    Handle the new kinds of binops.

compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_managed.m:
    Do not handle the new kinds of binops, since doing so would require
    changing the whole approach of how these modules handle binops.

    Clean up some predicates.

compiler/bytecode.m:
compiler/erl_call_gen.m:
compiler/lookup_switch.m:
compiler/ml_global_data.m:
compiler/ml_optimize.m:
compiler/ml_tag_switch.m:
compiler/opt_debug.m:
compiler/string_switch.m:
    Conform to the changes above.

compiler/ml_code_gen.m:
    Put the predicates of this module into a consistent order.

library/string.m:
    Fix white space.

runtime/mercury_string.h:
    Add a macro for each of the two new builtin operations.
2015-02-24 16:03:30 +11:00
Zoltan Somogyi
93ffbc4375 Add string hashing ops that are independent of word width.
We currently have three string hashing functions,

    string.hash
    string.hash2
    string.hash3

This diff adds

    string.hash4
    string.hash5
    string.hash6

each of which is the same as the corresponding previously existing hash
function, but cuts off any bits beyond the bottom 30. This should make these
functions return the same results on both 32 and 64 bit platforms. (We could
possibly keep one more bit, but it would require working on unsigned
immediates, and that is too hard to do in the Mercury version.)

library/string.m:
runtime/mercury_string.h:
    Add Mercury and C implementations

compiler/builtin_ops.m:
    Add hash[456] as new builtin ops.

compiler/c_util.m:
compiler/java_util.m:
    Map the new builtin ops to their implementations.

compiler/switch_util.m:
    Decide between string.hash{4,5,6}, not string.hash{,2,3}.

    Avoid using numeric suffixes on predicate names.

compiler/*.m:
    Conform to the changes above.
2015-02-11 16:17:54 +11:00
Zoltan Somogyi
4776d05cbc Fix the I/O section of the transition guide. 2015-02-11 09:31:43 +11:00
Zoltan Somogyi
1c318fe0c8 Convert mercury_string.h to four-space indentation. 2015-02-09 10:32:35 +11:00
Peter Wang
dcd969f61e Optimise some UTF-8 routines in C grades and fix a few bugs.
Branches: main, 11.07

Optimise some UTF-8 routines in C grades and fix a few bugs.

library/string.m:
	Avoid function calls in unsafe_index, unsafe_index_next, and
	unsafe_prev_index in the ASCII case.

	Handle illegal code unit at start of string in first_char(in, uo, in)
	and first_char(in, uo, uo) modes.

runtime/mercury_string.c:
runtime/mercury_string.h:
	Fix a bug where MR_utf8_next would not advance from pos 0.  Fortunately
	MR_utf8_next is only rarely called, to skip past illegal code units.

	Delete redundant initial test in MR_utf8_prev.

	Add MR_utf8_get_mb to extract multibyte code points only.
	Unroll a loop.

	Add MR_utf8_get_next_mb to extract multibyte code points only.

	Make MR_utf8_prev_get avoid an extra function call in the ASCII case.

	Use MR_Integer consistently for string offsets instead of int.
2012-03-26 06:57:34 +00:00
Julien Fischer
6db56dea42 Use MR_GNUC in place of __GNUC__ in some spots.
Branches: main, 11.07

runtime/mercury_string.h:
runtime/mercury_types.h:
	Use MR_GNUC in place of __GNUC__ in some spots.
2011-08-02 08:28:19 +00:00
Zoltan Somogyi
b4092d2e4e Further improvements in the implementation of string switches, along with
Estimated hours taken: 12
Branches: main

Further improvements in the implementation of string switches, along with
some bug fixes.

If the chosen hash function does not yield any collisions for the strings
in the switch arms, then we can optimize away the table column that we would
otherwise need for open addressing. This was implemented in a previous diff.

For an ordinary (non-lookup) string switch, the hash table has two columns
in the presence of collisions and one column in their absence. Therefore if
doubling the size of the table allows us to eliminate collisions, the table
size is unaffected, though the corresponding array of labels we have to put
into the computed_goto instruction we generate has to double as well.
Thus the only cost of such doubling is an increase in "code" size, and
for small tables, the elimination of the open addressing loop may compensate
for this, at least partially.

For lookup string switches, doubling the table size this way has a bigger
space cost, but the elimination of the open addressing loop still brings
a useful speed boost.

We therefore now DO double the table size if this eliminates collisions.
In the library, compiler etc directories, this eliminates collisions in
19 out of 47 switch switches that had collisions with the standard table size.

compiler/switch_util.m:
	Replace the separate sets of predicates we used to have for computing
	hash maps (one for lookup switches and one for non-lookup switches)
	with a single set that works for both.

	Change this set to double the table size if this eliminates collisions.
	This requires it to decide the table size, a task previously done
	separately by each of its callers.

	One version of this set had an old bug, which caused it to effectively
	ignore the second and third string hash functions. This diff fixes it.

	There were two bugs in my previous diff: the unneeded table column
	was not being optimized away from several_soln lookup switches, and the
	lookup code for one_soln lookup switches used the wrong column offset.
	This diff fixes these too.

	Since doubling the table size requires recalculating all the hash
	values, decouple the computation of the hash values from generating
	code for each switch arm, since the latter shouldn't be done more than
	once.

	Add a note on an old problem.

compiler/ml_string_switch.m:
compiler/string_switch.m:
	Bring the code for generating code for the arms of string switches
	here from switch_util.m.

tests/hard_coded/Mmakefile:
	Fix the reason why the bugs mentioned above were not detected:
	the relevant test cases weren't enabled.

tests/hard_coded/string_hash.m:
	Update this test case to test the correspondence of the compiler's
	and the runtime's versions of not just the first hash function,
	but also the second and third.

runtime/mercury_string.h:
	Fix a typo in a comment.
2011-08-02 00:05:44 +00:00
Peter Wang
7e26b55e74 Implement a new form of memory profiling, which tells the user what memory
Branches: main

Implement a new form of memory profiling, which tells the user what memory
is being retained during a program run.  This is done by allocating an extra
word before each cell, which is used to "attribute" the cell to an
allocation site.  The attribution, or "allocation id", is an address to an
MR_AllocSiteInfo structure generated by the Mercury compiler, giving the
procedure, filename and line number of the allocation, and the type
constructor and arity of the cell that it allocates.

The user must manually instrument the program with calls to
`benchmarking.report_memory_attribution', which forces a GC and summarises
the live objects on the heap using the attributions.  The mprof tool is
extended with a new mode to parse and present that data.

Objects which are unattributed (e.g. by hand-written C code which hasn't
been updated) are still accounted for, but show up in profiles as "unknown".

Currently this profiling mode only works in conjunction with the Boehm
garbage collector, though in principle it can work with any memory allocator
for which we can access a list of the live objects.  Since term size
profiling relies on the same technique of using an extra word per memory
cell, the two profiling modes are incompatible.

The output from `mprof -s' looks like this:

------ [1] some label ------
   cells            words         cumul  procedure / type (location)
   14150            38872                total

*   1949/ 13.8%      4872/ 12.5%  12.5%  <predicate `parser.parse_rest/7' mode 0>
     975/  6.9%      1950/  5.0%         list.list/1 (parser.m:502)
     487/  3.4%      1948/  5.0%         term.term/1 (parser.m:501)
     487/  3.4%       974/  2.5%         term.const/0 (parser.m:501)

*   1424/ 10.1%      4272/ 11.0%  23.5%  <predicate `parser.parse_simple_term_2/6' mode 0>
     708/  5.0%      2832/  7.3%         term.term/1 (parser.m:643)
     708/  5.0%      1416/  3.6%         term.const/0 (parser.m:643)
...


boehm_gc/alloc.c:
boehm_gc/include/gc.h:
boehm_gc/misc.c:
boehm_gc/reclaim.c:
	Add a callback function to be called for every live object after a GC.

	Add a function to write out the GC_size_map array.

compiler/layout.m:
	Define the alloc_site_info type which is equivalent to the
	MR_AllocSiteInfo C structure.

	Add alloc_site_array as a kind of "layout" array.

compiler/llds.m:
	Add allocation sites to `cfile' structure.

	Replace TypeMsg argument (which was also for profiling) on `incr_hp'
	instructions by an allocation site identifier.

	Add a new foreign_proc_component for allocation site ids.

compiler/code_info.m:
compiler/global_data.m:
compiler/proc_gen.m:
	Keep the set of allocation sites in the code_info and global_data
	structures.

compiler/unify_gen.m:
	Add allocation sites to LLDS allocation instructions.

compiler/layout_out.m:
compiler/llds_out_file.m:
compiler/llds_out_instr.m:
	Output MR_AllocSiteInfo arrays in generated C files.

	Output code to register the MR_AllocSiteInfo array with the Mercury
	runtime.

	Output allocation site ids for memory allocation instructions.

compiler/llds_out_util.m:
	Add allocation sites to llds_out_info.

compiler/pragma_c_gen.m:
compiler/ml_foreign_proc_gen.m:
	Generate a macro MR_ALLOC_ID which resolves to an allocation site
	structure, for every foreign_proc whose C code contains the string
	"MR_ALLOC_ID".  This is to be used by hand-written C code which
	allocates memory.

	MR_PROC_LABELs are retained for backwards compatibility.  Though
	they were introduced for profiling, they seem to have been co-opted
	for printf-debugging since then.

compiler/ml_global_data.m:
	Add allocation site structures to the MLDS global data.

compiler/mlds.m:
compiler/ml_unify_gen.m:
	Add allocation site id to `new_object' instruction.

compiler/mlds_to_c.m:
	Output allocation site arrays and allocation ids in high-level C code.

	Output a call to register the allocation site array with the Mercury
	runtime.

	Delete an unused predicate.

compiler/exprn_aux.m:
compiler/jumpopt.m:
compiler/livemap.m:
compiler/mercury_compile_llds_back_end.m:
compiler/middle_rec.m:
compiler/ml_accurate_gc.m:
compiler/ml_elim_nested.m:
compiler/ml_optimize.m:
compiler/ml_util.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
compiler/mlds_to_managed.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/use_local_vars.m:
compiler/var_locn.m:
	Conform to changes.

compiler/pickle.m:
compiler/prog_event.m:
compiler/timestamp.m:
	Conform to changes in memory allocation macros.

library/benchmarking.m:
	Add the `report_memory_attribution' instrumentation predicates.

	Conform to changes to MR_memprof_record.

library/array.m:
library/bit_buffer.m:
library/bitmap.m:
library/construct.m:
library/deconstruct.m:
library/dir.m:
library/io.m:
library/mutvar.m:
library/store.m:
library/string.m:
library/thread.semaphore.m:
library/version_array.m:
	Use attributed memory allocation throughout the standard library so
	that objects don't show up in the memory profile as "unknown".

	Replace MR_PROC_LABEL by MR_ALLOC_ID.

mdbcomp/program_representation.m:
mdbcomp/rtti_access.m:
	Replace MR_PROC_LABEL by MR_ALLOC_ID.

profiler/Mercury.options:
profiler/globals.m:
profiler/mercury_profile.m:
profiler/options.m:
profiler/output.m:
profiler/snapshots.m:
	Add a new mode to `mprof' to parse and present the data from
	`Prof.Snapshots' files.

	Add options for the new profiling mode.

profiler/process_file.m:
	Fix a typo.

runtime/mercury_conf_param.h:
	#define MR_MPROF_PROFILE_MEMORY_ATTRIBUTION if memory profiling
	is enabled and we are using Boehm GC.

runtime/mercury.h:
	Make MR_new_object take an allocation id argument.

	Conform to changes in memory allocation macros.

runtime/mercury_memory.c:
runtime/mercury_memory.h:
runtime/mercury_types.h:
	Define MR_AllocSiteInfo.

	Add memory allocation functions and macros which take into the
	account the additional word necessary for the new profiling mode.
	These should be used in preferences to the raw memory allocation
	functions wherever possible so that objects do not show up in the
	profile as "unknown".

	Add analogues of realloc/free which take into account the offset
	introduced by the attribution word.

	Add function versions of the MR_new_object macros, which can't be
	written in standard C.  They are only used when necessary.

	Add built-in allocation site ids, to be used in the runtime and
	other hand-written code when context-specific ids are unavailable.

runtime/mercury_heap.h:
	Make MR_tag_offset_incr_hp_msg and MR_tag_offset_incr_hp_atomic_msg
	allocate an extra word when memory attribution is desired, and store
	the allocation id there.

	Similarly for MR_create{1,2,3}_msg.

	Replace proclabel arguments in allocation macros by alloc_id
	arguments.

	Replace MR_hp_alloc_atomic by MR_hp_alloc_atomic_msg.  It was only
	used for boxing floats.

	Conform to change to MR_new_object macro.

runtime/mercury_bootstrap.h:
	Delete obsolete macro hp_alloc_atomic.

runtime/mercury_heap_profile.c:
runtime/mercury_heap_profile.h:
	Add the code to summarise the live objects on the Boehm GC heap and
	writes out the data to `Prof.Snapshots', for display by mprof.

	Don't store the procedure name in MR_memprof_record: the procedure
	address is enough and faster to compare.

runtime/mercury_prof.c:
	Finish and close the `Prof.Snapshots' file when the program
	terminates.

	Conform to changes in MR_memprof_record.

runtime/mercury_misc.h:
	Add a macro to expand to the name of the allocation sites array
	in LLDS grades.

runtime/mercury_bitmap.c:
runtime/mercury_bitmap.h:
	Pass allocation id through bitmap allocation functions.

	Delete unused function MR_string_to_bitmap.

runtime/mercury_string.h:
	Add MR_make_aligned_string_copy_msg.

	Make string allocation macros take allocation id arguments.

runtime/mercury.c:
runtime/mercury_array_macros.h:
runtime/mercury_context.c:
runtime/mercury_deconstruct.c:
runtime/mercury_deconstruct_macros.h:
runtime/mercury_dlist.c:
runtime/mercury_engine.c:
runtime/mercury_float.h:
runtime/mercury_hash_table.c:
runtime/mercury_ho_call.c:
runtime/mercury_label.c:
runtime/mercury_prof_mem.c:
runtime/mercury_stacks.c:
runtime/mercury_stm.c:
runtime/mercury_string.c:
runtime/mercury_thread.c:
runtime/mercury_trace_base.c:
runtime/mercury_trail.c:
runtime/mercury_type_desc.c:
runtime/mercury_type_info.c:
runtime/mercury_wsdeque.c:
	Use attributed memory allocation throughout the runtime so that
	objects don't show up in the profile as "unknown".

runtime/mercury_memory_zones.c:
	Attribute memory zones to the Mercury runtime.

runtime/mercury_tabling.c:
runtime/mercury_tabling.h:
	Use attributed memory allocation macros for tabling structures.

	Delete unused MR_table_realloc_* and MR_table_copy_bytes macros.

runtime/mercury_deep_copy_body.h:
	Try to retain the original attribution word when copying values.

runtime/mercury_ml_expand_body.h:
	Conform to changes in memory allocation macros.

runtime/mercury_tags.h:
	Replace proclabel arguments by alloc_id arguments in allocation macros.

runtime/mercury_wrapper.c:
	If memory attribution is enabled, tell Boehm GC that pointers may be
	displaced by an extra word.

trace/mercury_trace.c:
trace/mercury_trace_tables.c:
	Conform to changes in memory allocation macros.

extras/net/tcp.m:
extras/solver_types/library/any_array.m:
extras/trailed_update/tr_array.m:
	Conform to changes in memory allocation macros.

doc/user_guide.texi:
	Document the new profiling mode.

doc/reference_manual.texi:
	Update a commented out example.
2011-05-20 04:16:58 +00:00
Peter Wang
3788a9d6fb Improve Unicode support.
Branches: main

Improve Unicode support.

Declare that we use the Unicode character set, and UTF-8 or UTF-16 for the
internal string representation (depending on the backend).  User code may be
written to those assumptions.  Other external encodings can be supported in
the future by translating to/from Unicode internally.

The `char' type now represents a Unicode code point.

NOTE: questions about how to handle unpaired surrogate code points, etc. have
been left for later.


library/char.m:
        Define a `char' to be a Unicode code point and extend ranges
        appropriately.

        Add predicates: to_utf8, to_utf16, is_surrogate, is_noncharacter.

	Update some documentation.

library/io.m:
	Declare I/O predicates on text streams to read/write code points, not
	ambiguous "characters".  Text files are expected to use UTF-8 encoding.
	Supporting other encodings is for future work.

        Update the C and Erlang implementations to understand UTF-8 encoding.

	Update Java and C# implementations to read/write code points (Mercury
	char) instead of UTF-16 code units.

	Add `may_not_duplicate' attributes to some foreign_procs.

	Improve Erlang implementations of seeking and getting the stream size.

library/string.m:
	Declare the string representations, as described earlier.

        Distinguish between code units and code points everywhere.
	Existing functions and predicates which take offset and length
	arguments continue to take them in terms of code units.

        Add procedures: count_code_units, count_codepoints, codepoint_offset,
	to_code_unit_list, from_code_unit_list, index_next, unsafe_index_next,
	unsafe_prev_index, unsafe_index_code_unit, split_by_codepoint,
	left_by_codepoint, right_by_codepoint, substring_by_codepoint.

	Make index, index_det call error/1 if an illegal sequence is detected,
	as they already do for invalid offsets.

	Clarify that is_all_alpha, is_all_alnum_or_underscore,
	is_alnum_or_underscore only succeed for the ASCII characters under each
	of those categories.

        Clarify that whitespace stripping functions only strip whitespace
        characters in the ASCII range.

	Add comments about the future treatment of surrogate code points
	(not yet implemented).

	Use Mercury format implementation when necessary instead of `sprintf'.
	The %c specifier does not work for code points which require multi-byte
	representation.  The field width modifier for %s only works if the
	string contains only single-byte code points.

library/lexer.m:
        Conform to string encoding changes.

        Simplify code dealing with \uNNNN escapes now that encoding/decoding
        is handled by the string module.

library/term_io.m:
        Allow code points above 126 directly in Mercury source.

        NOTE: \x and \o codes are treated as code points by this change.

runtime/mercury_types.h:
        Redefine `MR_Char' to be `int' to hold a Unicode code point.

	`MR_String' has to be defined as a pointer to `char' instead of a
	pointer to `MR_Char'.  Some C foreign code will be affected by this
	change.

runtime/mercury_string.c:
runtime/mercury_string.h:
        Add UTF-8 helper routines and macros.

        Make hash routines conform to type changes.

compiler/c_util.m:
        Fix output_quoted_string_lang so that it correctly outputs non-ASCII
        characters for each of the target languages.

        Fix quote_char for non-ASCII characters.

compiler/elds_to_erlang.m:
        Write out code points above 126 normally instead of using escape
        syntax.

        Conform to string encoding changes.

compiler/mlds_to_cs.m:
        Change Mercury `char' to be represented by C# `int'.

compiler/mlds_to_java.m:
        Change Mercury `char' to be represented by Java `int'.

doc/reference_manual.texi:
        Uncomment description of \u and \U escapes in string literals.

        Update description of C# and Java representations for Mercury `char'
	which are now `int'.

tests/debugger/tailrec1.m:
        Conform to renaming.

tests/general/string_replace.exp:
tests/general/string_replace.m:
	Test non-ASCII characters to string.replace.

tests/general/string_test.exp:
tests/general/string_test.m:
	Test non-ASCII characters to string.duplicate_char,
	string.pad_right, string.pad_left, string.format_table.

tests/hard_coded/char_unicode.exp:
tests/hard_coded/char_unicode.m:
	Add test for new procedures in `char' module.

tests/hard_coded/contains_char_2.m:
	Test non-ASCII characters to string.contains_char.

tests/hard_coded/nonascii.exp:
tests/hard_coded/nonascii.m:
tests/hard_coded/nonascii_gen.c:
        Add code points above 255 to this test case.

	Change test data encoding to UTF-8.

tests/hard_coded/string_class.exp:
tests/hard_coded/string_class.m:
	Add test case for string.is_alpha, etc.

tests/hard_coded/string_codepoint.exp:
tests/hard_coded/string_codepoint.exp2:
tests/hard_coded/string_codepoint.m:
	Add test case for new string procedures dealing with code points.

tests/hard_coded/string_first_char.exp:
tests/hard_coded/string_first_char.m:
	Add test case for all modes of string.first_char.

tests/hard_coded/string_hash.m:
	Don't use buggy random.random/5 predicate which can overflow on
	a large range (such as the range of code points).

tests/hard_coded/string_presuffix.exp:
tests/hard_coded/string_presuffix.m:
	Add test case for string.prefix, string.suffix, etc.

tests/hard_coded/string_set_char.m:
	Test non-ASCII characters to string.set_char.

tests/hard_coded/string_strip.exp:
tests/hard_coded/string_strip.m:
	Test non-ASCII characters to string stripping procedures.

tests/hard_coded/string_sub_string_search.m:
	Test non-ASCII characters to string.sub_string_search.

tests/hard_coded/unicode_test.exp:
        Update expected output due to change of behaviour of
        `string.to_char_list'.

tests/hard_coded/unicode_test.m:
	Test non-ASCII character in separator string argument to
	string.join_list.

tests/hard_coded/utf8_io.exp:
tests/hard_coded/utf8_io.m:
	Add tests for UTF-8 I/O.

tests/hard_coded/words_separator.exp:
tests/hard_coded/words_separator.m:
        Add test case for `string.words_separator'.

tests/hard_coded/Mmakefile:
	Add new test cases.

	Make special_char test case run on all backends.

tests/hard_coded/special_char.exp:
tests/valid/mercury_java_parser_follow_code_bug.m:
	Reencode these files in UTF-8.

NEWS:
	Add a news entry.
2011-04-04 07:10:42 +00:00
Peter Wang
382013edd8 Add missing non-macro definitions of MR_hash_string2, MR_hash_string3.
Branches: main, 11.01

runtime/mercury_string.c:
runtime/mercury_string.h:
        Add missing non-macro definitions of MR_hash_string2, MR_hash_string3.
        These are required for non-gcc compilers.
2011-02-09 00:36:01 +00:00
Zoltan Somogyi
bf5c9a79a5 Until now, all hash tables on strings used a single standard hash function.
Estimated hours taken: 3
Branches: main

Until now, all hash tables on strings used a single standard hash function.
However, any single hash function has a pretty good probability of generating
collisions on small hash tables.

This diff adds two new hash functions on strings. When generating hash tables,
we now try out all three hash functions, and use the one that generates
the fewest collisions.

runtime/mercury_string.h:
	Add C implementations of the new hash functions.

library/string.m:
	Add Mercury implementations of the new hash functions.

compiler/builtin_ops.m:
	Add the new hash functions as builtin operations.

compiler/switch_util.m:
	Select the best hash function for each hash switch on strings.

compiler/ml_string_switch.m:
compiler/string_switch.m:
	Use the selected hash function for each hash table.

compiler/bytecode.m:
compiler/c_util.m:
compiler/java_util.m:
compiler/llds.m:
compiler/llds_tox86_64.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/opt_debumlds_to_ilg.m:
	Conform to the presence of the new hash functions.
2010-11-12 02:22:06 +00:00
Zoltan Somogyi
4db9b2adbf Until now, the only indexing we did for switches on strings was using a hash
Estimated hours taken: 40
Branches: main

Until now, the only indexing we did for switches on strings was using a hash
table containing jump targets (represented as indices into a list of labels).
This diff supplements this with

- binary searches of tables containing jump targets,
- binary searches of tables containing values (lookup tables), and
- hash searches of tables containing values (lookup tables).

For now, the new methods exist in the LLDS backend only.

NEWS:
	Mention the new capability.

compiler/string_switch.m:
	Add predicates that implement the new indexing methods on strings.
	Factor out code from existing predicates as required for this.

compiler/switch_gen.m:
	Invoke the new predicates in string_switch.m when relevant.

	Avoid passing the constant "no" as the initial value of !MaybeEnd
	to predicates where we know this will ALWAYS happen.

compiler/options.m:
doc/user_guide.texi:
	Add an option to control when we use binary searches for switches
	on strings.

compiler/lookup_switch.m:
	This module previously handled lookup switches on integers.
	Generalize it so that pieces of it are now also usable to help
	implement lookup switches on strings. Rename the predicates specific
	to switches on integers to make clear this specificity, and separate
	them from the predicates that help implement lookup switches on
	variables of all the supported types.

	Export some types, predicates and functions for use in string_switch.m.

	Fix the code so that it correctly handles det switches, which
	can happen e.g. if we know the possible set of values of the
	switched-on variable.

	Use tail-recursive code to handle the list of switch arms, to allow us
	to handle very large switches.

	Remove an obsolete comment from the top about a previously implemented
	optimization.

compiler/lookup_util.m:
	Make set_liveness_and_end_branch update MaybeEnd, to account for the
	reservation of stack slots for holding the current and last rows in
	later solutions tables for model_non lookup switches.

compiler/switch_util.m:
	Make the exported predicates of this module more general, making them
	usable for switches on strings as well as ints. Also make them easier
	to use. In one case this meant bundling two predicates that were always
	used together into one predicate. In another, it meant splitting one
	predicate into two, since some of its callers needed an intermediate
	result. In the case of a type, it means reordering its fields
	to make the order match the order of their use in the implementation.

	Add some predicates specifically for switches on strings.

compiler/ml_lookup_switch.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
	Conform to the changes to switch_util.m.

compiler/jumpopt.m:
	If the comment associated with a label ends with "nofulljump", then
	inhibit fulljump optimization of jumps to that label. That
	optimization would replace the jumps with the code starting at that
	label. This is avoids the overhead of jump instructions, and it is a
	good idea in the usual case of forward jumps. However, for the few
	backward jumps we generate, the block that replaces the jump
	instruction can actually END with the same jump instruction (which may
	be conditionally executed), which means that our usual repeated
	invocation of jumpopt can replace the original jump instruction
	with MANY copies of the block it jumps to. In some cases, such as those
	in hash switches, you get more copies than can ever be executed in any
	actual execution. Lookup switches therefore now mark the labels that
	are targets of backward jumps with this marker.

compiler/llds.m:
	Document the new behavior of jumpopt.

compiler/code_info.m:
	Export a predicate for use in improving the code we generate for lookup
	switches.

	Make some other predicates simpler and/or more efficient.

compiler/builtin_ops.m:
	Add a builtin op for doing string comparisons by calling strcmp.
	This is to prevent the need for two traversals of the strings being
	compared in each iteration of binary search.

compiler/bytecode.m:
compiler/c_util.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/llds.m:
compiler/llds_to_x86_64.m:
	Conform to the change in builtin_ops.m.

compiler/disj_gen.m:
	Conform to the change in lookup_util.m

compiler/frameopt.m:
compiler/proc_gen.m:
compiler/unify_gen.m:
	Take advantage of the change in fulljump optimization.

compiler/opt_debug.m:
	Improve the string representation of rvals by recording the types of
	the operands of binary operations, and making the output a bit more
	consistent looking.

compiler/dupproc.m:
compiler/var_locn.m:
	Minor style fixes.

runtime/mercury_string.h:
	Add a version of strcmp for use by our code generator. This version
	casts the arguments before calling the real strcmp. We need it since we
	usually specify the arguments as r1, r2 etc, which are declared as
	MR_Word, not char *.

tests/hard_coded/lookup_disj.{m,exp}:
tests/hard_coded/string_switch.{m,exp}:
	Make these existing tests significantly tougher by making them exercise
	a wider range of use scenarios.

tests/hard_coded/string_switch{2,3}.{m,exp}:
tests/hard_coded/Mercury.options
	While the string_switch test case tests the handling of jump switches,
	these two new test cases test the handling of binary search tables and
	hash tables respectively. Their code is identical to the code of
	the string_switch test case, but Mercury.options causes them to be
	compiled with different options.

tests/hard_coded/int_switch.{m,exp}:
	A new test case, equivalent in structure to the string switch test
	cases, to test the handling of lookup switches on atomic values.
2010-11-01 04:03:06 +00:00
Zoltan Somogyi
2dc982cfe4 Make a representation of the program available to the deep profiler.
Estimated hours taken: 50
Branches: main

Make a representation of the program available to the deep profiler. We do
this by letting the user request, via the option "--deep-procrep-file"
in MERCURY_OPTIONS, that when the Deep.data file is written, a Deep.procrep
file should be written alongside it.

The intended use of this information is the discovery of profitable
parallelism. When a conjunction contains two expensive calls, e.g. p(...) and
q(...) connected by some shared variables, the potential gain from executing
them in parallel is limited by how early p produces those variables and how
late q consumes them, and knowing this requires access to the code of p and q.

Since the debugger and the deep profiler both need access to program
representations, put the relevant data structures and the operations on them
in mdbcomp. The data structures are significantly expanded, since the deep
profiler deals with the whole program, while the debugger was interested only
in one procedure at a time.

The layout structures have to change as well. In a previous change, I changed
proc layout structures to make room for the procedure representation even in
non-debugging grades, but this isn't enough, since the procedure representation
refers to the module's string table. This diff therefore makes some parts of
the module layout structure, including of course the string table, also
available in non-debugging grades.

configure.in:
	Check whether the installed compiler can process switches on foreign
	enums correctly, since this diff depends on that.

runtime/mercury_stack_layout.[ch]:
runtime/mercury_types.h:
	Add a new structure, MR_ModuleCommonLayout, that holds the part of
	the module layout that is common to deep profiling and debugging.

runtime/mercury_deep_profiling.[ch]:
	The old "deep profiling token" enum type was error prone, since at
	each point in the data file, only a subset was applicable. This diff
	breaks up the this enum into several enums, each consisting of the
	choice applicable at a given point.

	This also allows some of the resulting enums to be used in procrep
	files.

	Rename some enums and functions to avoid ambiguities, and in one case
	to conform to our naming scheme.

	Make write_out_proc_statics take a second argument. This is a FILE *
	that (if not NULL) asks write_out_proc_statics to write the
	representation of the current module to specified stream.

	These module representations go into the middle part of the program
	representation file. Add functions to write out the prologue and
	epilogue of this file.

	Write out procedure representations if this is requested.

	Factor out some code that is now used in more than one place.

runtime/mercury_deep_profiling_hand.h:
	Conform to the changes to mercury_deep_profiling.h.

runtime/mercury_builtin_types.c:
	Pass the extra argument in the argument lists of invocations of
	write_out_proc_statics.

runtime/mercury_trace_base.[ch]:
	Conform to the name change from proc_rep to proc_defn_rep in mdbcomp.

runtime/mercury_grade.h:
	Due to the change to layout structures, increment the binary
	compatibility version numbers for both debug and deep profiling grades.

runtime/mercury_wrapper.[ch]:
	Provide two new MERCURY_OPTION options. The first --deep-procrep-file,
	allows the user to ask for the program representation to be generated.
	The second, --deep-random-write, allows tools/bootcheck to request that
	only a fraction of all program invocations should generate any deep
	profiling output.

	The first option will be documented once it is tested much more fully.
	The second option is deliberately not documented.

	Update the type of the variable that holds the address of the
	(mkinit-generated) write_out_proc_statics function to accept the second
	argument.

util/mkinit.c:
	Pass the extra argument in the argument list of write_out_proc_statics.

mdbcomp/program_representation.m:
	Extend the existing data structures for representing a procedure body
	to represent a procedure (complete with name), a module and a program.
	The name is implemented as string_proc_label, a form of proc_label that
	can be written out to files. This replaces the old proc_id type the
	deep profiler.

	Extend the representation of switches to record the identity of the
	variable being switched on, and the cons_ids of the arms. Without the
	former, we cannot be sure when a variable is first used, and the latter
	is needed for meaningful prettyprinting of procedure bodies.

	Add code for reading in files of bytecodes, and for making sense of the
	bytecodes themselves. (It is this code that uses foreign enums.)

mdbcomp/prim_data.m:
	Note the relationship of proc_label with string_proc_label.

mdbcomp/rtti_access.m:
	Add the access operations needed to find module string tables with the
	new organization of layout structures.

	Provide operations on bytecodes and string tables generally.

trace/mercury_trace_cmd_browsing.c:
	Conform to the change to mdbcomp/program_representation.m.

compiler/layout.m:
	Add support for a MR_ModuleCommonLayout.

	Rename some function symbols to avoid ambiguities.

compiler/layout_out.m:
	Handle the new structure.

compiler/stack_layout.m:
	Generate the new structure and the procedure representation bytecode
	in deep profiling grades.

compiler/llds_out.m:
	Generate the code required to write out the prologue and epilogue
	of program representation files.

	Pass the extra argument in the argument lists of invocations of
	write_out_proc_statics that tells those invocations to write out
	the module representations between the prologue and the epilogue.

compiler/prog_rep.m:
	When generating bytecodes, include the new information for switches.

compiler/continuation_info.m:
	Replace a bool with a more expressive type.

compiler/proc_rep.m:
	Conform to the change to continuation_info.m.

compiler/opt_debug.m:
	Conform to the change to layout.m.

deep_profiler/mdprof_procrep.m:
	A new test program to test the reading of program representations.

deep_profiler/DEEP_FLAGS.in:
deep_profiler/Mmakefile:
	Copy the contents of the mdbcomp module to this directory on demand,
	instead of linking to it. This is necessary now that the deep profiler
	depends directly on mdbcomp even if it is compiled in a non-debugging
	grade.

	The arrangements for doing this were copied from the slice directory,
	which has long done the same.

	Avoid a duplicate include of Mmake.deep.params.

	Add the new test program to the list of programs in this directory.

Mmakefile:
	Go through deep_profiler/Mmakefile when deciding whether to do "mmake
	depend" in the deep_profiler directory. The old actions won't work
	correctly now that we need to copy some files from mdbcomp before we
	can run "mmake depend".

deep_profiler/profile.m:
	Remove the code that was moved (in cleaned-up form) to mdbcomp.

deep_profiler/dump.m:
deep_profiler/profile.m:
	Conform to the changes above.

browser/declarative_execution.m:
browser/declarative_tree.m:
	Conform to the changes in mdbcomp.

doc/user_guide.texi:
	Add commented out documentation of the two new options.

slice/Mmakefile:
	Fix formatting, and a bug.

library/exception.m:
library/par_builtin.m:
library/thread.m:
library/thread.semaphore.m:
	Update all the handwritten modules to pass the extra argument now
	required by write_out_proc_statics.

tests/debugger/declarative/dependency.exp:
	Conform to the change from proc_rep to proc_defn_rep.

tools/bootcheck:
	Write out deep profiling data only from every 25th invocation, since
	otherwise the time for a bootcheck takes six times as long in deep
	profiling grades than in asm_fast.gc.

	However, do test the ability to write out program representations.

	Use the mkinit from the workspace, not the installed one.

	Don't disable line wrapping.
2007-09-12 06:21:20 +00:00
Simon Taylor
5647714667 Make all functions which create strings from characters throw an exception
Estimated hours taken: 15
Branches: main

Make all functions which create strings from characters throw an exception
or fail if the list of characters contains a null character.

This removes a potential source of security vulnerabilities where one
part of the program performs checks against the whole of a string passed
in by an attacker (processing the string as a list of characters or using
`unsafe_index' to look past the null character), but then passes the string
to another part of the program or an operating system call that only sees
up to the first null character.  Even if Mercury stored the length with
the string, allowing the creation of strings containing nulls would be a
bad idea because it would be too easy to pass a string to foreign code
without checking.

For examples see:
<http://insecure.org/news/P55-07.txt>
<http://www.securiteam.com/securitynews/5WP0B1FKKQ.html>
<http://www.securityfocus.com/archive/1/445788>
<http://www.securityfocus.com/archive/82/368750>
<http://secunia.com/advisories/16420/>

NEWS:
	Document the change.

library/string.m:
	Throw an exception if null characters are found in
	string.from_char_list and string.from_rev_char_list.

	Add string.from_char_list_semidet and string.from_rev_char_list_semidet
	which fail rather throwing an exception.  This doesn't match the
	normal naming convention, but string.from_{,rev_}char_list are widely
	used, so changing their determinism would be a bit too disruptive.

	Don't allocate an unnecessary extra word for each string created by
	from_char_list and from_rev_char_list.

	Explain that to_upper and to_lower only work on un-accented
	Latin letters.

library/lexer.m:
	Check for invalid characters when reading Mercury strings and
	quoted names.

	Improve error messages by skipping to the end of any string
	or quoted name containing an error.  Previously we just stopped
	processing at the error leaving an unmatched quote.

library/io.m:
	Make io.read_line_as_string and io.read_file_as_string return
	an error code if the input file contains a null character.

	Fix an XXX: '\0\' is not recognised as a character constant,
	but char.det_from_int can be used to make a null character.

library/char.m:
	Explain the workaround for '\0\' not being accepted as a char
	constant.

	Explain that to_upper and to_lower only work on un-accented
	Latin letters.

compiler/layout.m:
compiler/layout_out.m:
compiler/c_util.m:
compiler/stack_layout.m:
compiler/llds.m:
compiler/mlds.m:
compiler/ll_backend.*.m:
compiler/ml_backend.*.m:
	Don't pass around strings containing null characters (the string
	tables for the debugger).  This doesn't cause any problems now,
	but won't work with the accurate garbage collector.  Use lists
	of strings instead, and add the null characters when writing the
	strings out.

tests/hard_coded/null_char.{m,exp}:
	Change an existing test case to test that creation of a string
	containing a null throws an exception.

tests/hard_coded/null_char.exp2:
	Deleted because alternative output is no longer needed.

tests/invalid/Mmakefile:
tests/invalid/null_char.m:
tests/invalid/null_char.err_exp:
	Test error messages for construction of strings containing null
	characters by the lexer.

tests/invalid/unicode{1,2}.err_exp:
	Update the expected output after the change to the handling of
	invalid quoted names and strings.
2007-03-18 23:35:04 +00:00
Simon Taylor
9c650e1d83 Improvements for bitmap.m, to make it useable as a general container
Estimated hours taken: 80
Branches: main

Improvements for bitmap.m, to make it useable as a general container
for binary data.

library/bitmap.m:
runtime/mercury_bitmap.c:
runtime/mercury_bitmap.h:
	Specialize the representation of bitmaps to an array of unsigned
	bytes defined as a foreign type.

	This is better than building on top of array(int) because it:
	- is better for interfacing with foreign code
	- has a more sensible machine-independent comparison order
	  (same as array(bool))
	- avoids storing the size twice
	- has more efficient copying, unification, comparison and tabling
	  (although we should probably specialize the handling of array(int)
	  and isomorphic types as well)
	- uses GC_MALLOC_ATOMIC to avoid problems with bit patterns that look
	  like pointers (although we should do that for array(int) as well)

	XXX The code for the Java and IL backends is untested.
	Building the library in grade Java with Sun JDK 1.6 failed (but
	at least passed error checking), and I don't have access to a
	copy of MSVS.NET.  The foreign code that needs to be tested is
	trivial.

	Add fields `bit', `bits' and `byte' to get/set a single bit,
	multiple bits (from an int) or an 8 bit byte.

	Add functions for converting bitmaps to hex strings and back,
	for use by stream.string_writer.write and deconstruct.functor/4.

	bitmap.intersect was buggy in the case where the input bitmaps
	had a different size.  Given that bitmaps are implemented with
	a fixed domain (lookups out of range throw an exception), it
	makes more sense to throw an exception in that case anyway,
	so all of the set operations do that now.

	The difference operation actually performed xor.  Fix it and
	add an xor function.

library/version_bitmap.m:
	This hasn't been fully updated to be the same as bitmap.m.
	The payoff would be much less because foreign code can't
	really do anything with version_bitmaps.

	Add a `bit' field.

	Deprecate the `get/2' function in favour of the `bit' field.

	Fix the union, difference, intersection and xor functions
	as for bitmap.m.

	Fix comparison of version_arrays so that it uses the same
	method as array.m: compare size then elements in order.
	The old code found version_arrays to be equal if one was
	a suffix of the other.

library/char.m:
	Add predicates for converting between hex digits and integers.

library/io.m:
library/stream.string_writer.m:
library/term.m:
	Read and write bitmaps.

runtime/mercury_type_info.h:
runtime/mercury_deep_copy_body.h:
runtime/mercury_mcpp.h:
runtime/mercury_table_type_body.h:
runtime/mercury_tabling_macros.h:
runtime/mercury_unify_compare_body.h:
runtime/mercury_construct.c:
runtime/mercury_deconstruct.c:
runtime/mercury_term_size.c:
runtime/mercury_string.h:
library/construct.m:
library/deconstruct.m
compiler/prog_type.m:
compiler/mlds_to_gcc.m:
compiler/rtti.m:
	Add a MR_TypeCtorRep for bitmaps, and handle it in the library
	and runtinme.

library/Mercury.options:
	Compile bitmap.m with `--no-warn-insts-without-matching-type'.

runtime/mercury_type_info.h:
	Bump MR_RTTI_VERSION.

NEWS:
	Document the changes.

tests/hard_coded/Mmakefile:
tests/hard_coded/bitmap_test.m:
tests/hard_coded/bitmap_simple.m:
tests/hard_coded/bitmap_tester.m:
tests/hard_coded/bitmap_test.exp:
tests/tabling/Mmakefile:
tests/tabling/expand_bitmap.m:
tests/tabling/expand_bitmap.exp:
tests/hard_coded/version_array_test.m:
tests/hard_coded/version_array_test.exp:
	Test cases.
2007-02-13 01:59:04 +00:00
Zoltan Somogyi
0023d13f18 Fix some layout issues in these files. There are no algorithmic
Estimated hours taken: 0.2
Branches: main

runtime/mercury_calls.h:
runtime/mercury_prof.h:
runtime/mercury_signal.h:
runtime/mercury_string.h:
runtime/mercury_thread.c:
runtime/mercury_thread.h:
	Fix some layout issues in these files. There are no algorithmic
	changes.
2005-06-20 02:16:44 +00:00
Zoltan Somogyi
8190c16181 Get Mercury to work with gcc 3.4. This required fixing several problems.
Estimated hours taken: 16
Branches: main

Get Mercury to work with gcc 3.4. This required fixing several problems.

One problem that caused errors is that gcc 3.4 is smart enough to figure out
that in LLDS grades with gcc gotos, the C functions containing our code are
not referred to, and so it optimizes them away. The fix is to ensure that
mercury_<module>_init is defined always to call those functions, even if
the macro that usually controls this, MR_MAY_NEED_INITIALIZATION, is not
defined. The mercury_<module>_init won't be called from the init_modules
function in the program's _init.c file, so there is no impact on initialization
time, but gcc doesn't know this when compiling a module's .c file, so
it doesn't optimize away the code we need. The cost of this change is thus
only a small amount of code space. It is worth paying this cost even with
compilers other than gcc 3.4 for simplicity. Actually, this size increase seems
to be slightly smaller than the size reduction due to the better optimization
capabilities of gcc 3.4 compared to gcc 3.2.2.

A second problem is that gcc 3.4 warns about casts in lvalues being a
deprecated feature. This gave lots of warnings, since we used to define
several Mercury abstract machine registers, including MR_succip, MR_hp, MR_sp,
MR_maxfr and MR_curfr using lvalue casts. The fix is to have two macros
for each of these abstract machine registers, one of type MR_Word that you can
assign to (e.g. MR_sp_word), and one of the original type that is of the right
type but not an lvalue (e.g. MR_sp). The lvalue itself can't be made the right
type, because MR_sp isn't a variable in its own right, but possibly defined
to be a machine register. The machine register could made the right type,
but only at the cost of a lot of complexity.

This problem doesn't apply to the special-purpose Mercury abstract machine
registers that can't be allocated to machine registers. Instead of #defining
these to slots in MR_fake_reg, we make them global variables of the natural
type. This should also make it easier to debug code using these registers.
We treat these global variables as if they were machine registers in that
MR_save_registers copies values from these global variables to slots reserved
for them in the MR_fake_reg array, to allow code to loop over all Mercury
abstract machine registers. These saved slots must of course be of type
MR_Word, so we again need two macros to refer to them, a lvalue of type
MR_Word and an rvalue with the right type.

A third problem is that gcc 3.4 warns about conditionals in lvalues being a
deprecated feature. This gave a few warnings, since we used to define
MR_virtual_reg and MR_saved_reg using lvalues using conditionals. The fix
is to have one macro (MR_virtual_reg_value) for use in rvalues and a
separate macro which uses an if-then-else instead of a conditional
expression (MR_virtual_reg_assign), for assignments.

A fourth problem is that gcc 3.4 warns about comma operators in lvalues
being a deprecated feature. This gave warnings in the few places where
we refer to MR_r(N) for values of N that can map to fake registers directly,
since in those cases we preceded the reference to the fake_reg array with
a range check of the array index. The fix to this is to move the test to
compile time for compiler-generated code. Hand-written code never refers
to MR_r(N) for these values, and is very unlikely to do so in the future;
instead, it refers to the underlying fake_reg array directly, since that way
it doesn't have to worry about which fake registers have their own MR_rN macro
and which don't. Therefore no check mechanism for hand-written code is
necessary. This change mean that changing the number of MR_rN registers
now requires change to the compiler as well as to the runtime system.

A fifth problem is that gcc 3.4 by default assumes -fstrict-aliasing at -O2.
Since we cast between integers and pointers of different types all the time,
and changing that is not practical, at least in the short term, we need to
disable -fstrict-aliasing when we enable -O2.

NEWS:
	Note that Mercury now works with gcc 3.4.

configure.in:
scripts/mgnuc.in:
	Detect whether the compiler supports -fstrict-aliasing, and if so,
	whether it assumes it by default with -O2. If the answer is yes to
	both, make mgnuc specify -fno-strict-aliasing when it specifies -O2.
	By including it in CFLAGS_FOR_OPT, which gets put into Mercury.config,
	we also get -f-no-strict-aliasing when mmc invokes the C compiler
	directly.

compiler/llds_out.m:
	Don't generate #ifdef MR_MAY_NEED_INITIALIZATION around the definitions
	and calls to the bunch functions, which call the functions we don't
	want the C compiler to optimize away.

	Generate the newly required lvalues on the left sides of assignments.

	We still have code to generate LVALUE_CASTs in some cases, but I don't
	think those cases ever arise.

	Add a compile-time check of register numbers. Ideally, the code
	generator should use stack slots instead of registers beyond the max
	number, but I don't recall us ever bumping into this limit by accident.

compiler/fact_table.m:
	Use the newly required lvalues on the left sides of assignments
	in some hand-written C code included in generated .c files.

runtime/mercury_regs.h:
	Make the changes described above to fix the second, third and fourth
	problems. We still use comma operators in lvalues when counting
	references to registers, but it is OK to require anyone who wants
	to enable this feature to use a compiler version that supports comma
	operators in lvalues or to ignore the warnings.

	Use the same mapping from Mercury abstract machine registers to
	the register count array as to the MR_fake_reg array.

	Have this mapping depend as little as possible on whether we need a
	real machine register to store MR_engine base, even if it costs a
	wasted slot in MR_fake_reg.

	Fix an old inconsistency: treat the Mercury abstract machine registers
	used for trailing the same way as the other Mercury abstract machine
	registers, by making MR_save_registers/MR_restore_registers copy them
	to and from their global variable homes.

	Document the requirement for the match between the runtime's and the
	compiler's notions of the maximum MR_rN register number. This
	requirement makes it harder for users to increase the number of
	virtual registers, but as far as I know noone has wanted to do this.

	Change the names of some of the macros to make them clearer.

	Reorder some parts of this file, and add some documentation, also
	in the interest of clarity.

runtime/mercury_regorder.h:
	Delete this file after moving its contents, in much modified form,
	to mercury_regs.h. mercury_regorder.h was always logically part of
	mercury_regs.h, but was separated out to make it easier to change
	the mapping from Mercury abstract machine registers to machine
	registers. However, the cost of incompatibility caused by any such
	changes would be much greater that any likely performance benefit.

runtime/Mmakefile:
	Remove the reference to mercury_regorder.h.

runtime/mercury_regs.[ch]:
runtime/mercury_memory_zones.[ch]:
	Move some functionality dealing with registers from
	mercury_memory_zones to mercury_regs, since it belongs there.

runtime/mercury_regs.[ch]:
	Add a function to make it easiler to debug changes to map from
	Mercury abstract machine to MR_fake_reg slots.

runtime/mercury_regs.[ch]:
runtime/mercury_wrapper.c:
	Move the code to print counts of register uses from mercury_wrapper.c
	to mercury_regs.c.

	Make mercury_wrapper.c call the debugging function in mercury_regs.c
	if -X is specified in MERCURY_OPTIONS.

runtime/mercury_bootstrap.h:
	Move the old MR_saved_reg and MR_virtual_reg macros from mercury_regs.h
	to mercury_bootstrap.h to prevent their accidental use. Since
	they shouldn't be used by user code, move them to the section
	that is not enabled by default.

runtime/mercury_stacks.[ch]:
	Add _word versions of the macros for stack slots, for the same reason
	why we need them for Mercury abstract machine registers, and use them.

	Add global variables for the Mercury abstract machine registers
	for the gen, cut and pneg stacks.

runtime/mercury_heap.h:
	Change the macros for allocating memory to assign to MR_hp_word instead
	of MR_hp.

runtime/mercury_string.h:
	Change the macros for allocating strings to accomodate the updates to
	mercury_heap.h. Also change the expected type of the target to make it
	MR_String instead of MR_ConstString, since the latter requires casts in
	the caller.

runtime/mercury_trail.h:
runtime/mercury_types.h:
	Move the definition of the type MR_TrailEntry from mercury_trail.h
	to mercury_types.h, since it is now used in mercury_regs.h.

runtime/mercury_accurate_gc.c:
runtime/mercury_agc_debug.c:
runtime/mercury_calls.h:
runtime/mercury_context.[ch]:
runtime/mercury_deconstruct_macros.h:
runtime/mercury_deep_copy_body.h:
runtime/mercury_engine.[ch]:
runtime/mercury_hand_compare_body.h:
runtime/mercury_hand_unify_body.h:
runtime/mercury_ho_call.c:
runtime/mercury_layout_util.c:
runtime/mercury_make_type_info_body.h:
runtime/mercury_minimal_model.c:
runtime/mercury_ml_deconstruct_body.h:
runtime/mercury_ml_functor_body.h:
runtime/mercury_stack_layout.h:
runtime/mercury_type_desc.c:
runtime/mercury_type_info.c:
runtime/mercury_unify_compare_body.h:
runtime/mercury_wrapper.c:
	Conform to the changes in the rest of the runtime.

	In some cases, fix inconsistencies in indentation.

runtime/mercury_stack_trace.c:
	Add some conditionally compiled debugging code controlled by the macro
	MR_ADDR_DEBUG, to help debug some problems with stored stack pointers.

runtime/mercury_grade.h:
	Increment the binary compatibility version number. This is needed to
	avoid potential problems when a Mercury module and the debugger are
	compiled with different versions of the macros in mercury_regs.h.

library/exception.m:
	Update the code that assigns to abstract machine registers.

library/array.m:
library/construct.m:
library/dir.m:
library/io.m:
library/string.m:
	Conform to the new definitions of allocation macros.

library/time.m:
	Delete an unnecessary #include.

trace/mercury_trace.c:
trace/mercury_trace_declarative.c:
trace/mercury_trace_util.c:
	Conform to the changes in the rest of the runtime.

tests/hard_coded/qual_test_is_imported.m:
tests/hard_coded/aditi_private_builtin.m:
	Remove an unnecessary import to avoid a warning.

tools/makebatch:
	Add an option --save-stage2-on-error, that saves the stage2 directory
	if a bootcheck fails.

scripts/ml.in:
	Make ml more robust in the face of garbage files.
2004-07-07 07:11:22 +00:00
Zoltan Somogyi
f007b45df8 Implement the infrastructure for term size profiling.
Estimated hours taken: 400
Branches: main

Implement the infrastructure for term size profiling. This means adding two
new grade components, tsw and tsc, and implementing them in the LLDS code
generator. In grades including tsw (term size words), each term is augmented
with an extra word giving the number of heap words it contains; in grades
including tsc (term size cells), each term is augmented with an extra word
giving the number of heap cells it contains. The extra word is at the start,
at offset -1, to leave almost all of the machinery for accessing the heap
unchanged.

For now, the only way to access term sizes is with a new mdb command,
"term_size <varspec>". Later, we will use term sizes in conjunction with
deep profiling to do experimental complexity analysis, but that requires
a lot more research. This diff is a necessary first step.

The implementation of term size profiling consists of three main parts:

- a source-to-source transform that computes the size of each heap cell
  when it is constructed (and increments it in the rare cases when a free
  argument of an existing heap cell is bound),

- a relatively small change to the code generator that reserves the extra
  slot in new heap cells, and

- extensions to the facilities for creating cells from C code to record
  the extra information we now need.

The diff overhauls polymorphism.m to make the source-to-source transform
possible. This overhaul includes separating type_ctor_infos and type_infos
as strictly as possible from each other, converting type_ctor_infos into
type_infos only as necessary. It also includes separating type_ctor_infos,
type_infos, base_typeclass_infos and typeclass_infos (as well as voids,
for clarity) from plain user-defined type constructors in type categorizations.
This change needs this separation because values of those four types do not
have size slots, but they ought to be treated specially in other situations
as well (e.g. by tabling).

The diff adds a new mdb command, term_size. It also replaces the proc_body
mdb command with new ways of using the existing print and browse commands
("print proc_body" and "browse proc_body") in order to make looking at
procedure bodies more controllable. This was useful in debugging the effect
of term size profiling on some test case outputs. It is not strictly tied
to term size profiling, but turns out to be difficult to disentangle.

compiler/size_prof.m:
	A new module implementing the source-to-source transform.

compiler/notes/compiler_design.html:
	Mention the new module.

compiler/transform_hlds.m:
	Include size_prof as a submodule of transform_hlds.

compiler/mercury_compile.m:
	If term size profiling is enabled, invoke its source-to-source
	transform.

compiler/hlds_goal.m:
	Extend construction unifications with an optional slot for recording
	the size of the term if the size is a constant, or the identity of the
	variable holding the size, if the size is not constant. This is
	needed by the source-to-source transform.

compiler/quantification.m:
	Treat the variable reference that may be in this slot as a nonlocal
	variable of construction unifications, since the code generator needs
	this.

compiler/compile_target_code.m:
	Handle the new grade components.

compiler/options.m:
	Implement the options that control term size profiling.

doc/user_guide.texi:
	Document the options and grade components that control term size
	profiling, and the term_size mdb command. The documentation is
	commented out for now.

	Modify the wording of the 'u' HLDS dump flag to include other details
	of unifications (e.g. term size info) rather than just unification
	categories.

	Document the new alternatives of the print and browse commands. Since
	they are for developers only, the documentation is commented out.

compiler/handle_options.m:
	Handle the implications of term size profiling grades.

	Add a -D flag value to print HLDS components relevant to HLDS
	transformations.

compiler/modules.m:
	Import the new builtin library module that implements the operations
	needed by term size profiling automatically in term size profiling
	grades.

	Switch the predicate involved to use state var syntax.

compiler/prog_util.m:
	Add predicates and functions that return the sym_names of the modules
	needed by term size profiling.

compiler/code_info.m:
compiler/unify_gen.m:
compiler/var_locn.m:
 	Reserve an extra slot in heap cells and fill them in in unifications
	marked by size_prof.

compiler/builtin_ops.m:
	Add term_size_prof_builtin.term_size_plus as a builtin, with the same
	implementation as int.+.

compiler/make_hlds.m:
	Disable warnings about clauses for builtins while the change to
	builtin_ops is bootstrapped.

compiler/polymorphism.m:
	Export predicates that generate goals to create type_infos and
	type_ctor_infos to add_to_construct.m. Rewrite their documentation
	to make it more detailed.

	Make orders of arguments amenable to the use of state variable syntax.

	Consolidate knowledge of which type categories have builtin unify and
	compare predicates in one place.

	Add code to leave the types of type_ctor_infos alone: instead of
	changing their types to type_info when used as arguments of other
	type_infos, create a new variable of type type_info instead, and
	use an unsafe_cast. This would make the HLDS closer to being type
	correct, but this new code is currently commented out, for two
	reasons. First, common.m is currently not smart enough to figure out
	that if X and Y are equal, then similar unsafe_casts of X and Y
	are also equal, and this causes the compiler do not detect some
	duplicate calls it used to detect. Second, the code generators
	are also not smart enough to know that if Z is an unsafe_cast of X,
	then X and Z do not need separate stack slots, but can use the same
	slot.

compiler/type_util.m:
	Add utility predicates for returning the types of type_infos and
	type_ctor_infos, for use by new code in polymorphism.m.

	Move some utility predicates here from other modules, since they
	are now used by more than one module.

	Rename the type `builtin_type' as `type_category', to better reflect
	what it does. Extend it to put the type_info, type_ctor_info,
	typeclass_info, base_typeclass_info and void types into categories
	of their own: treating these types as if they were a user-defined
	type (which is how they used to be classified) is not always correct.
	Rename the functor polymorphic_type to variable_type, since types
	such as list(T) are polymorphic, but they fall into the user-defined
	category. Rename user_type as user_ctor_type, since list(int) is not
	wholly user-defined but falls into this category. Rename pred_type
	as higher_order_type, since it also encompasses functions.

	Replace code that used to check for a few of the alternatives
	of this type with code that does a full switch on the type,
	to ensure that they are updated if the type definition ever
	changes again.

compiler/pseudo_type_info.m:
	Delete a predicate whose updated implementation is now in type_util.m.

compiler/mlds_to_c.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
	Still treat type_infos, type_ctor_infos, typeclass_infos and
	base_typeclass_infos as user-defined types, but prepare for when
	they won't be.

compiler/hlds_pred.m:
	Require interface typeinfo liveness when term size profiling is
	enabled.

	Add term_size_profiling_builtin.increase_size as a
	no_type_info_builtin.

compiler/hlds_out.m:
	Print the size annotations on unifications if HLDS dump flags call
	for unification details. (The flag test is in the caller of the
	modified predicate.)

compiler/llds.m:
	Extend incr_hp instructions and data_addr_consts with optional fields
	that allow the code generator to refer to N words past the start of
	a static or dynamic cell. Term size profiling uses this with N=1.

compiler/llds_out.m:
	When allocating memory on the heap, use the macro variants that
	specify an optional offset, and specify the offset when required.

compiler/bytecode_gen.m:
compiler/dense_switch.m:
compiler/dupelim.m:
compiler/exprn_aux.m:
compiler/goal_form.m:
compiler/goal_util.m:
compiler/higher_order.m:
compiler/inst_match.m:
compiler/intermod.m:
compiler/jumpopt.m:
compiler/lambda.m:
compiler/livemap.m:
compiler/ll_pseudo_type_info.m:
compiler/lookup_switch.m:
compiler/magic_util.m:
compiler/middle_rec.m:
compiler/ml_code_util.m:
compiler/ml_switch_gen.m:
compiler/ml_unify_gen.m:
compiler/mlds.m:
compiler/mlds_to_c.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
compiler/modecheck_unify.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/par_conj_gen.m:
compiler/post_typecheck.m:
compiler/reassign.m:
compiler/rl.m:
compiler/rl_key.m:
compiler/special_pred.m:
compiler/stack_layout.m:
compiler/static_term.m:
compiler/string_switch.m:
compiler/switch_gen.m:
compiler/switch_util.m:
compiler/table_gen.m:
compiler/term_util.m:
compiler/type_ctor_info.m:
compiler/unused_args.m:
compiler/use_local_vars.m:
	Minor updates to conform to the changes above.

library/term_size_prof_builtin.m:
	New module containing helper predicates for term size profiling.
	size_prof.m generates call to these predicates.

library/library.m:
	Include the new module in the library.

doc/Mmakefile:
	Do not include the term_size_prof_builtin module in the library
	documentation.

library/array.m:
library/benchmarking.m:
library/construct.m:
library/deconstruct.m:
library/io.m:
library/sparse_bitset.m:
library/store.m:
library/string.m:
	Replace all uses of MR_incr_hp with MR_offset_incr_hp, to ensure
	that we haven't overlooked any places where offsets may need to be
	specified.

	Fix formatting of foreign_procs.

	Use new macros defined by the runtime system when constructing
	terms (which all happen to be lists) in C code. These new macros
	specify the types of the cell arguments, allowing the implementation
	to figure out the size of the new cell based on the sizes of its
	fields.

library/private_builtin.m:
	Define some constant type_info structures for use by these macros.
	They cannot be defined in the runtime, since they refer to types
	defined in the library (list.list and std_util.univ).

util/mkinit.c:
	Make the addresses of these type_info structures available to the
	runtime.

runtime/mercury_init.h:
	Declare these type_info structures, for use in mkinit-generated
	*_init.c files.

runtime/mercury_wrapper.[ch]:
	Declare and define the variables that hold these addresses, for use
	in the new macros for constructing typed lists.

	Since term size profiling can refer to a memory cell by a pointer
	that is offset by one word, register the extra offsets with the Boehm
	collector if is being used.

	Document the incompatibility of MR_HIGHTAGS and the Boehm collector.

runtime/mercury_tags.h:
	Define new macros for constructing typed lists.

	Provide macros for preserving the old interface presented by this file
	to the extent possible. Uses of the old MR_list_cons macro will
	continue to work in grades without term size profiling. In term
	size profiling grades, their use will get a C compiler error.

	Fix a bug caused by a missing backslash.

runtime/mercury_heap.h:
	Change the basic macros for allocating new heap cells to take
	an optional offset argument. If this is nonzero, the macros
	increment the returned address by the given number of words.
	Term size profiling specifies offset=1, reserving the extra
	word at the start (which is ignored by all components of the
	system except term size profiling) for holding the size of the term.

	Provide macros for preserving the old interface presented by this file
	to the extent possible. Since the old MR_create[123] and MR_list_cons
	macros did not specify type information, they had to be changed
	to take additional arguments. This affects only hand-written C code.

	Call new diagnostic macros that can help debug heap allocations.

	Document why the macros in this files must expand to expressions
	instead of statements, evn though the latter would be preferable
	(e.g. by allowing them to declare and use local variables without
	depending on gcc extensions).

runtime/mercury_debug.[ch]:
	Add diagnostic macros to debug heap allocations, and the functions
	behind them if MR_DEBUG_HEAP_ALLOC is defined.

	Update the debugging routines for hand-allocated cells to print the
	values of the term size slot as well as the other slots in the relevant
	grades.

runtime/mercury_string.h:
	Provide some needed variants of the macro for copying strings.

runtime/mercury_deconstruct_macros.h:
runtime/mercury_type_info.c:
	Supply type information when constructing terms.

runtime/mercury_deep_copy_body.h:
	Preserve the term size slot when copying terms.

runtime/mercury_deep_copy_body.h:
runtime/mercury_ho_call.c:
runtime/mercury_ml_expand_body.h:
	Use MR_offset_incr_hp instead of MR_incr_hp to ensure that all places
	that allocate cells also allocate space for the term size slot if
	necessary.

	Reduce code duplication by using a now standard macro for copying
	strings.

runtime/mercury_grade.h:
	Handle the two new grade components.

runtime/mercury_conf_param.h:
	Document the C macros used to control the two new grade components,
	as well as MR_DEBUG_HEAP_ALLOC.

	Detect incompatibilities between high level code and profiling.

runtime/mercury_term_size.[ch]:
	A new module to house a function to find and return term sizes
	stored in heap cells.

runtime/mercury_proc_id.h:
runtime/mercury_univ.h:
	New header files. mercury_proc_id.h contains the (unchanged)
	definition of MR_Proc_Id, while mercury_univ.h contains the
	definitions of the macros for manipulating univs that used to be
	in mercury_type_info.h, updated to use the new macros for allocating
	memory.

	In the absence of these header files, the following circularity
	would ensue:

	mercury_deep_profiling.h includes mercury_stack_layout.h
		- needs definition of MR_Proc_Id
	mercury_stack_layout.h needs mercury_type_info.h
		- needs definition of MR_PseudoTypeInfo
	mercury_type_info.h needs mercury_heap.h
		- needs heap allocation macros for MR_new_univ_on_hp
	mercury_heap.h includes mercury_deep_profiling.h
		- needs MR_current_call_site_dynamic for recording allocations

	Breaking the circular dependency in two places, not just one, is to
	minimize similar problems in the future.

runtime/mercury_stack_layout.h:
	Delete the definition of MR_Proc_Id, which is now in mercury_proc_id.h.

runtime/mercury_type_info.h:
	Delete the macros for manipulating univs, which are now in
	mercury_univ.h.

runtime/Mmakefile:
	Mention the new files.

runtime/mercury_imp.h:
runtime/mercury.h:
runtime/mercury_construct.c:
runtime/mercury_deep_profiling.h:
	Include the new files at appropriate points.

runtime/mercury.c:
	Change the names of the functions that create heap cells for
	hand-written code, since the interface to hand-written code has
	changed to include type information.

runtime/mercury_tabling.h:
	Delete some unused macros.

runtime/mercury_trace_base.c:
runtime/mercury_type_info.c:
	Use the new macros supplying type information when constructing lists.

scripts/canonical_grade_options.sh-subr:
	Fix an undefined sh variable bug that could cause error messages
	to come out without identifying the program they were from.

scripts/init_grade_options.sh-subr:
scripts/parse_grade_options.sh-subr:
scripts/canonical_grade_options.sh-subr:
scripts/mgnuc.in:
	Handle the new grade components and the options controlling them.

trace/mercury_trace_internal.c:
	Implement the mdb command "term_size <varspec>", which is like
	"print <varspec>", but prints the size of a term instead of its value.
	In non-term-size-profiling grades, it prints an error message.

	Replace the "proc_body" command with optional arguments to the "print"
	and "browse" commands.

doc/user_guide.tex:
	Add documentation of the term_size mdb command. Since the command is
	for implementors only, and works only in grades that are not yet ready
	for public consumption, the documentation is commented out.

	Add documentation of the new arguments of the print and browse mdb
	commands. Since they are for implementors only, the documentation
	is commented out.

trace/mercury_trace_vars.[ch]:
	Add the functions needed to implement the term_size command, and
	factor out the code common to the "size" and "print"/"browse" commands.

	Decide whether to print the name of a variable before invoking the
	supplied print or browse predicate on it based on a flag design for
	this purpose, instead of overloading the meaning of the output FILE *
	variable. This arrangement is much clearer.

trace/mercury_trace_browse.c:
trace/mercury_trace_external.c:
trace/mercury_trace_help.c:
	Supply type information when constructing terms.

browser/program_representation.m:
	Since the new library module term_size_prof_builtin never generates
	any events, mark it as such, so that the declarative debugger doesn't
	expect it to generate any.

	Do the same for the deep profiling builtin module.

tests/debugger/term_size_words.{m,inp,exp}:
tests/debugger/term_size_cells.{m,inp,exp}:
	Two new test cases, each testing one of the new grades.

tests/debugger/Mmakefile:
	Enable the two new test cases in their grades.

	Disable the tests sensitive to stack frame sizes in term size profiling
	grades.

tests/debugger/completion.exp:
	Add the new "term_size" mdb command to the list of command completions,
	and delete "proc_body".

tests/debugger/declarative/dependency.{inp,exp}:
	Use "print proc_body" instead of "proc_body".

tests/hard_coded/nondet_c.m:
tests/hard_coded/pragma_inline.m:
	Use MR_offset_incr_hp instead of MR_incr_hp to ensure that all places
	that allocate cells also allocate space for the term size slot if
	necessary.

tests/valid/Mmakefile:
	Disable the IL tests in term size profiling grades, since the term size
	profiling primitives haven't been (and probably won't be) implemented
	for the MLDS backends, and handle_options causes a compiler abort
	for grades that combine term size profiling and any one of IL, Java
	and high level C.
2003-10-20 07:29:59 +00:00
Simon Taylor
40e515eb31 Fix bug which caused the results of MR_hash_string() and
Estimated hours taken: 0.5
Branches: main, release

runtime/mercury_string.h:
runtime/mercury_string.c:
	Fix bug which caused the results of MR_hash_string() and
	string__hash to differ when `sizeof(int) != sizeof(MR_Integer)'.

runtime/mercury_misc.c:
	MR_hash_string() was defined here for non-GCC compilers
	for historical reasons. Move it to mercury_string.c.
2002-11-22 15:01:10 +00:00
Simon Taylor
2b822930c3 Fix a bug which caused the results of MR_hash_string()
Estimated hours taken: 1
Branches: main, release

runtime/mercury_string.h:
	Fix a bug which caused the results of MR_hash_string()
	and string__hash to differ -- cast each character to
	MR_UnsignedChar before combining it with the hash value.

tests/hard_coded/Mmakefile:
tests/hard_coded/string_hash.{m,exp}:
	Test case.
2002-11-21 09:00:28 +00:00
Fergus Henderson
df05f27750 Fix a bug where the C code that we generated contained a type error
Estimated hours taken: 2
Branches: main, version-0_10_y

Fix a bug where the C code that we generated contained a type error
which showed up when using non-GCC C compilers.

runtime/mercury_string.h:
runtime/mercury_misc.c:
	Change the argument type for MR_hash_string() from MR_Word to
	MR_ConstString, to match the code that the MLDS back-end generates
	(and common sense).

runtime/mercury_tabling.c:
compiler/llds.m:
	Update to match the change to runtime/mercury_string.h.
2002-09-30 06:08:22 +00:00
Fergus Henderson
824861d232 Add an `MR_' prefix to a name referred to in a comment.
Estimated hours taken: 0.05
Branches: main

runtime/mercury_string.h:
	Add an `MR_' prefix to a name referred to in a comment.
2002-03-06 04:34:31 +00:00
Simon Taylor
b7c4a317e9 Add MR_ prefixes to the remaining non-prefixed symbols.
Estimated hours taken: 4
Branches: main

Add MR_ prefixes to the remaining non-prefixed symbols.

This change will require all workspaces to be updated
The compiler will start generating references to MR_TRUE,
MR_bool, etc., which are not defined in the old runtime
header files.

runtime/mercury_std.h:
	Add MR_ prefixes to bool, TRUE, FALSE, max, min,
	streq, strdiff, strtest, strntest, strneq, strndiff,
	strntest, NO_RETURN.

	Delete a commented out definition of `reg'.

runtime/mercury_tags.h:
	Add an MR_ prefix to TAGBITS.

configure.in:
runtime/mercury_goto.h:
runtime/machdeps/i386_regs.h/mercury_goto.h:
	Add an MR_ prefix to PIC.

runtime/mercury_conf_param.h:
	Allow non-prefixed PIC and HIGHTAGS to be defined on
	the command line.

runtime/mercury_bootstrap.h:
	Add backwards compatibility definitions.

RESERVED_MACRO_NAMES:
	Remove the renamed macros.

compiler/export.m:
compiler/ml_code_gen.m:
	Use MR_bool rather than MR_Bool (MR_Bool is
	meant to be for references to the Mercury type
	bool__bool).

runtime/mercury_types.h:
	Add a comment the MR_Bool is for references to
	bool__bool.

*/*.c:
*/*.h:
*/*.m:
	Add MR_ prefixes.
2002-02-18 07:01:33 +00:00
Simon Taylor
9dd11b2fc6 Smart recompilation. Record version numbers for each item
Estimated hours taken: 400

Smart recompilation. Record version numbers for each item
in interface files. Record which items are used in each compilation.
Only recompile a module if the output file does not exist or
nothing has changed.

There is still some work to do on this:
- it doesn't work with inter-module optimization.
- it doesn't work when the module name doesn't match the file name.
  (this problem will go away when mmake functionality is moved into
  the compiler.

I'll hold off documenting this change in the NEWS file and
on the web page for a month or so, until I've had a bit more
experience using it.

compiler/options.m:
compiler/handle_options.m:
doc/user_guide.texi:
	Add an option `--smart-recompilation', currently off by default.

	Add an internal option `--generate-version-numbers' to control
	whether version numbers are written to the interface files. If
	`--smart-recompilation' is disabled because the module
	is being compiled with `--intermodule-optimization' (e.g. in the
	standard library), we still want to write the version numbers
	to the interface files.

	Add an option `--verbose-recompilation' (default off)
	to write messages describing why recompilation is needed.

	Add an option `--warn-smart-recompilation' (default on)
	to control warnings relating to the smart recompilation
	system. Warn if smart recompilation will not work with
	the output and inter-module optimization options given.

compiler/recompilation.m:
	Type declarations for smart recompilation.
	Predicates to record program items used by compilation.

compiler/recompilation_version.m:
	Compute version numbers for program items in interface files.

compiler/recompilation_usage.m:
	Find all items used by a compilation.

compiler/recompilation_check.m:
	Check whether recompilation is necessary.

compiler/timestamp.m:
	Timestamp ADT for smart recompilation.

compiler/mercury_compile.m:
	Invoke the smart recompilation passes.

compiler/modules.m:
compiler/prog_io.m:
	Return timestamps for modules read.

	When reading a module make sure the current input stream
	is reset to its old value, not stdin.

	Handle version number items in interface files.

compiler/module_qual.m:
compiler/unify_proc.m:
compiler/make_hlds.m:
	Record all items used by local items.

compiler/make_hlds.m:
	Process `:- pragma type_spec' declarations in
	add_item_list_clauses. The qual_info is needed
	when processing `:- pragma type_spec' declarations
	so that any equivalence types used by the declaration
	can be recorded as used by the predicate or function to
	which the `:- pragma type_spec' applies.

compiler/equiv_type.m:
	For each imported item, record which equivalence types
	are used by that item.

compiler/hlds_module.m:
	Add a field to the module_info to store information about
	items used during compilation of a module.

compiler/check_typeclass.m:
	Make sure any items used in clauses for typeclass method
	implementations are recorded in the `.used' file.

compiler/prog_data.m:
compiler/*.m:
	Factor out some duplicated code by combining the
	pred and func, and pred_mode and func_mode items.

	Make it easier to extract the name of a type, inst or mode
	from its declaration.

	Add an item type to hold the version numbers for an interface file.

	Allow warnings to be reported for `nothing' items (used for
	reporting when version numbers are written using an
	obsolete format).

compiler/prog_io.m:
compiler/prog_io_util.m:
compiler/typecheck.m:
compiler/type_util.m:
compiler/*.m:
	Strip contexts from all types, not just those in class constraints.
	This makes it possible to use ordinary unification to check
	whether items have changed (with the exception of clauses).

	Remove code to create types with contexts in typechecking.

	Remove code scattered through the compiler to remove contexts
	from types in class constraints.

compiler/hlds_pred.m:
compiler/prog_util.m:
	Move hlds_pred__adjust_func_arity to prog_util, so that it
	can be used by the pre-hlds passes.

compiler/typecheck.m:
compiler/hlds_module.m:
	Move typecheck__visible_modules to hlds_module.m, so it can
	be used by recompilation_usage.m.

compiler/typecheck.m:
	Add a comment telling where updates may be required if the
	code to typecheck a var-functor unification changes.

compiler/error_util.m:
	Allow writing messages without contexts (used for the verbose
	recompilation messages).

	Add functions to format sym_name and sym_name_and_arity,
	and to add punctuation to the end of an error message
	without unwanted line breaks before the punctuation.

scripts/Mmake.rules:
compiler/modules.m:
	Don't remove the output file before running the compiler. We need
	to leave the old output file intact if smart recompilation detects
	that recompilation is not needed.

compiler/notes/compiler_design.html:
	Document the new modules.

library/io.m:
NEWS:
	Add predicates to find the modification time of files
	and input_streams.

library/set.m:
NEWS:
	Add a predicate version of set__fold

	Don't sort the output of set__filter, it's already sorted.

library/std_util.m:
NEWS:
	Add a predicate `std_util__map_maybe/3' and a function
  	`std_util__map_maybe/2' to apply a predicate or a function to
    	a value stored in a term of type `std_util__maybe'.

configure.in:
runtime/mercury_conf.h.in:
runtime/RESERVED_MACRO_NAMES:
	When checking whether the compiler is recent enough, check for
	the --warn-smart-recompilation option.

	Check for stat().

library/Mmakefile:
	Disable warnings about smart recompilation not working with
	`--intermodule-optimization'.

browser/Mmakefile:
	Disable warnings about smart recompilation not working when
	the module name doesn't match the file name.

runtime/mercury_string.h:
	Add a macro MR_make_string_const() which automates computation
	of the length of string argument to MR_string_const().

tests/recompilation/Mmakefile:
tests/recompilation/runtests:
tests/recompilation/test_functions:
tests/recompilation/TESTS:
tests/recompilation/README:
	A framework for testing smart recompilation.
	The option currently only works for the recompilation directory.

tests/recompilation/TEST.m.{1,2}:
tests/recompilation/TEST_2.m.{1,2}:
tests/recompilation/TEST.exp.{1,2}:
tests/recompilation/TEST.err_exp.2:
	Test cases, where TEST is one of add_constructor_r, add_instance_r,
	add_instance_2_r, add_type_nr, change_class_r, change_instance_r,
	change_mode_r, field_r, func_overloading_nr, func_overloading_r,
	lambda_mode_r, nested_module_r, no_version_numbers_r,
	pragma_type_spec_r, pred_ctor_ambiguity_r, pred_overloading_r,
	add_type_re, remove_type_re, type_qual_re.

tests/handle_options:
	Add an option `-e' to generate any missing expected output files.
2001-06-27 05:05:21 +00:00
Zoltan Somogyi
ebe9f9a3ec Add tabling of I/O actions for the debugger.
Estimated hours taken: 80

Add tabling of I/O actions for the debugger.

compiler/options.m:
	Add a new option, --trace-table-io, that enables the tabling of I/O
	actions, and another, --trace-table-io-states, that governs whether the
	tabling includes the I/O state variables themselves. (You want to table
	these variables iff they contain meaningful information that is not
	stored in global variables.) These options are for developers only
	for now.

compiler/modules.m:
	Implicitly import table_builtin if --trace-table-io is specified.

compiler/prog_data.m:
	Add eval_table_io as a new eval method.

compiler/hlds_pred.m:
	Add a mechanism for checking whether a predicate has an input/output
	pair of io__state args.

	Extend the tables indexed by eval_method to handle eval_table_io.

compiler/hlds_out.m:
	Print the eval method in HLDS dumps.

compiler/table_gen.m:
	If a procedure has a pair of I/O state args and is defined using pragma
	C code that has the tabled_for_io marker, and --trace-table-io is
	specified, then perform I/O tabling on it and mark it as tabled.

compiler/notes/compiler_design.m:
	Document that table_gen.m can now change the evaluation methods of
	procedures (to eval_table_io).

compiler/stack_layout.m:
runtime/mercury_stack_layout.h:
	Add an extra field to proc layouts. If debugging is enabled and a
	procedure has I/O state arguments, this field gives the number of the
	stack slot which will be filled with the I/O action counter at the
	time of the call, so that on retry the debugger can reset the I/O
	action counter to this value.

compiler/trace.m:
	Add code to reserve and fill this stack slot.

	Make the order of fields in the trace_slots structure match the order
	in proc layouts.

compiler/code_info.m:
compiler/live_vars.m:
	Pass a module_info to trace__setup and trace__reserved_slots.

library/io.m:
	Mark the I/O primitives (i.e. procedures that are defined by pragma C
	code and do I/O) with the tabled_for_io feature. (See the discussion
	of I/O primitives in compiler/table_gen.m.)

	Standardize the formatting of predicates defined by pragma C codes.

library/table_builtin.m:
	Define the predicates that perform I/O tabling, to which calls are
	inserted in I/O tabled predicates. These depend on knowing what the
	maximum MR_Unsigned value is.

library/table_builtin.m:
runtime/mercury_tabling_macros.h:
	Table nodes implementing a simple kind of trie, which can also be
	viewed as a hash table with the hash function hash(n) = hash - start
	were already supported by mercury_tabling.c. They are used to
	implement I/O tabling, since I/O the tabled action numbers form a
	contiguous sequence. Now allow that functionality to be accessed
	from the library through macros.

runtime/mercury_trace_base.[ch]:
	Add the global variables required by I/O tabling.

trace/mercury_trace.c:
	Implement retry across I/O by resetting the I/O counter to the value
	it had on entry to the retried call. However, since this is not safe
	in general, ask the user for permission first.

trace/mercury_trace.h:
	Add two extra arguments to MR_trace_retry to specify the input and
	output streams on which to ask permission.

trace/mercury_trace_internal.c:
	Add commands to start and stop I/O tabling. For now, they are for use
	by developers only and are undocumented; I expect they will change
	significantly before being let loose on users.

trace/mercury_trace_external.c:
trace/mercury_trace_declarative.c:
	Pass extra arguments to MR_trace_retry to indicate that these modules
	are not interested (at least for now) in retry across I/O, since they
	do not (yet) have mechanisms for asking the user for permission.

tests/debugger/tabled_read.{m,inp,exp,data}:
	A new test case to check retry across tabled and non-tabled I/O.

tests/debugger/Mmakefile:
	Enable the new test case.
2000-12-06 06:06:06 +00:00
Fergus Henderson
1e9195b4f3 Add some Mmake rules to the runtime and some code to tools/bootcheck
Estimated hours taken: 16

Add some Mmake rules to the runtime and some code to tools/bootcheck
so that we automatically check that the namespace remains clean.
Also add some `MR_' prefixes that Zoltan missed in his earlier change.

tools/bootcheck:
	Add an option, which is enabled by default, to
	build the check_namespace target in the runtime.

runtime/RESERVED_MACRO_NAMES:
	New file.  Contains a list of the macros names that
	don't start with `MR_' or the like.

runtime/Mmakefile:
	Change the rule for `check_headers' so that it checks for macros
	that don't occur in the RESERVED_MACRO_NAMES files, as well as not
	starting with `MR_' prefixes, and reports errors for such macros.
	Also add a rule for check_objs that checks whether the object
	files define any global symbols that don't have the right prefixes,
	and a rule `check_namespace' that does both of the above.

runtime/mercury_bootstrap.h:
	#include "mercury_types.h" and "mercury_float.h",
	to ensure that this header file is self-contained.
	Also make sure that all the old names are disabled if you
	compile with `-DMR_NO_BACKWARDS_COMPAT'.

runtime/mercury_context.c:
runtime/mercury_thread.h:
runtime/mercury_thread.c:
	Use `bool' rather than `MR_Bool' for the argument to
	MR_check_pending_contexts() and the return type of
	MR_init_thread(), since these are C bools, not Mercury
	bools, and there's no requirement that they have the
	same size as MR_Integer.

runtime/mercury_type_info.h:
runtime/mercury_deep_copy_body.h:
runtime/mercury_tabling.c:
library/std_util.m:
trace/mercury_trace_declarative.c:
trace/mercury_trace_external.c:
trace/mercury_trace_internal.c:
tests/hard_coded/existential_types_test.m:
	Add MR_ prefixes to UNIV_OFFSET_FOR_TYPEINFO and UNIV_OFFSET_FOR_VALUE.

trace/mercury_trace_external.c:
trace/mercury_trace_internal.c:
trace/mercury_trace_tables.c:
	Add MR_ prefixes to do_init_modules().

runtime/mercury_tabling.h:
runtime/mercury_tabling.c:
runtime/mercury_overflow.h:
runtime/mercury_debug.h:
	Add MR_ prefixes to table_*.

runtime/mercury_overflow.h:
runtime/mercury_debug.h:
	Add MR_ prefixes to IF().

runtime/mercury_context.h:
	Add MR_ prefixes to IF_MR_THREAD_SAFE().

runtime/mercury_engine.h:
	Add MR_ prefixes to IF_NOT_CONSERVATIVE_GC().

runtime/mercury_engine.h:
runtime/mercury_engine.c:
library/exception.m:
extras/aditi/aditi.m:
	Add MR_ prefixs to do_fail, do_redo, do_not_reached, etc.

compiler/trace.m:
compiler/fact_table.m:
compiler/llds_out.m:
	Add MR_ prefixes to the generated code.
2000-12-04 18:35:22 +00:00
Zoltan Somogyi
b7290b26f8 Add missing MR_ prefix.
Estimated hours taken: 0.1

runtime/mercury_string.h:
	Add missing MR_ prefix.

runtime/mercury_types.h:
	Add missing blank line.
2000-12-04 04:48:05 +00:00
Zoltan Somogyi
3f18f1c4f4 Add some missing MR_ prefixes.
Estimated hours taken: 0.2

runtime/*.[ch]:
	Add some missing MR_ prefixes.
2000-11-28 04:31:50 +00:00
Zoltan Somogyi
090552c993 Make everything in the runtime use MR_ prefixes, and make the compiler
Estimated hours taken: 10

Make everything in the runtime use MR_ prefixes, and make the compiler
bootstrap with -DMR_NO_BACKWARDS_COMPAT.

runtime/mercury_*.[ch]
	Add MR_ prefixes to all functions, global variables and almost all
	macros that could pollute the namespace. The (intentional) exceptions
	are

	1. some function, variable, type and label names that already start
	   with MR_, mercury_, Mercury or _entry;
	2. some standard C macros in mercury_std.h;
	3. the macros used in autoconfiguration (since they are used in scripts
	   as well as the runtime, the MR_ prefix may not be appropriate for
	   those).

	In some cases, I deleted things instead of adding prefixes
	if the "things" were obsolete and not user visible.

runtime/mercury_bootstrap.h:
	Provide MR_-less forms of the macros for bootstrapping and for
	backward compatibility for user code.

runtime/mercury_debug.[ch]:
	Add a FILE * parameter to a function that needs it.

compiler/code_info.m:
compiler/export.m:
compiler/fact_table.m:
compiler/llds.m:
compiler/llds_out.m:
compiler/pragma_c_gen.m:
compiler/trace.m:
	Add MR_ prefixes to the C code generated by the compiler.

library/*.m:
	Add MR_ prefixes to handwritten code.

trace/mercury_trace_*.c:
util/mkinit.c:
	Add MR_ prefixes as necessary.

extras/concurrency/semaphore.m:
	Add MR_ prefixes as necessary.
2000-11-23 02:01:11 +00:00
Fergus Henderson
31eadee813 Fix a bug in petdr's recent changes to string__format that
Estimated hours taken: 1.5

Fix a bug in petdr's recent changes to string__format that
broke things in non-gc grades on sparcs.

runtime/mercury_string.c:
	In MR_make_string(), wrap calls to restore/save_transient_hp()
	around the call to MR_make_aligned_string_msg(), as mentioned
	in the documentation for MR_make_aligned_string_msg().

runtime/mercury_string.h:
	Document that calls to MR_make_string need to be wrapped inside
	calls to save/restore_transient_hp().

library/string.m:
	Wrap calls to MR_make_string inside calls to
	save/restore_transient_hp().
2000-09-14 15:24:51 +00:00
Peter Ross
c848ef1e93 Reimplement string__format so that it uses less memory and it's
Estimated hours taken: 24

Reimplement string__format so that it uses less memory and it's
behaviour is closer to the C standard.

library/string.m:
    Reimplement string__format.  The new implementation parses the
    format string building a control structure.  The control structure
    is then traversed to generate the output.
    Implement string__to_char_list in C.
    Implement string__append_list in C as this results in no garbage
    being generated due to the creation of the intermediate strings,
    which can be quite a significant saving.
    Remove the string__append_list(out, in) mode as it generates an
    infinite number of solutions.
    Use MR_allocate_aligned_string_msg for all memory allocations.

runtime/mercury_string.h:
    Add a new macro MR_allocate_aligned_string_msg which allocates
    word aligned space for storage of a string in.

runtime/mercury_string.c:
    Define a new function MR_make_string which provides sprintf like
    functionality for creating MR_Strings.  This function is safe from
    buffer overflows providing the vsnprintf function is available.

runtime/Mmakefile:
    Add mercury_string.c

configure.in:
    Check for the vsnprintf function.

runtime/mercury_conf.h.in:
    Define HAVE_VSNPRINTF.
2000-08-10 09:01:17 +00:00
Tyson Dowd
db64a3588d Add MR_ prefixes to the types used when generating C code.
Estimated hours taken: 4

Add MR_ prefixes to the types used when generating C code.
This means types such as Word, String, Bool, Float become MR_Word,
MR_String, MR_Bool, MR_Float.  Also define MR_Box for both the LLDS and
MLDS backends so we can use it uniformly.

This is very important in environments where String or Bool have already
been used as system types (for example, managed C++).  And besides, we
should do it anyway as part of the grand namespace cleanup.

I have fixed all of the uses of the non-prefixed types in the runtime
and trace directories.  I haven't done it for the library and compiler
directories yet (no promises that I will do it in future either).  But
if you see a non-prefixed type in code from now on, please consider it a
bug and fix it.

mercury_bootstrap.h contains #defines to map the non-prefixed types into
the prefixed ones.  Like many of the other namespace cleaning backwards
compatibility macros, this can be turned off with
MR_NO_BACKWARDS_COMPAT.

This shouldn't break any code, but this kind of change affects so many
things that of course there could be problems lurking in there somewhere.

If you start getting errors from the C compiler after this change is
installed, you will want to make sure you at least have the runtime
system updated so that you are getting the backwards compatibility
definitions in mercury_bootstrap.h.  Then if you continue to have
problems you can bug me about it.

compiler/export.m:
compiler/llds_out.m:
compiler/mlds_to_c.m:
	Use MR_Word, MR_Float, MR_Bool, etc when generating C.

doc/reference_manual.texi:
	Update the reference manual to talk about MR_Word, MR_String,
	MR_Char, etc.

runtime/mercury_bootstrap.h:
	Add bootstrapping typedefs.

runtime/*:
trace/*:
	Change Word, Float, Bool, Code, String, etc to
	MR_Word, MR_Float, MR_Bool, MR_Code, MR_String.
2000-08-03 06:19:31 +00:00
Fergus Henderson
6ba459bdaa Add some `MR_' prefixes.
Estimated hours taken: 0.75

Add some `MR_' prefixes.

runtime/mercury_string.h:
runtime/mercury_misc.h:
runtime/mercury_misc.c:
runtime/mercury_make_type_info_body.h:
runtime/mercury_type_info.c:
runtime/mercury.c:
extras/*/*.m:
extras/*/*/*.m:
library/*.m:
trace/*.c:
compiler/fact_table.m:
compiler/ml_code_gen.m:
compiler/stack_layout.m:
	Add `MR_' prefixes to fatal_error(), hash_string(),
	do_hash_string, and HASH_STRING_FUNC_BODY.

runtime/mercury_bootstrap.h:
	Add backwards compatibility macros for
	fatal_error() and hash_string().
2000-05-08 13:48:50 +00:00
Fergus Henderson
f18a923aa3 New module, originally written by Tomas By <T.By@dcs.shef.ac.uk>.
Estimated hours taken: 4
	(plus unknown time by Tomas By)

library/time.m:
	New module, originally written by Tomas By <T.By@dcs.shef.ac.uk>.
	This provides an interface to the ANSI/ISO C time functions.
	I have made the following changes to Tomas By's version:
	    * bug fixes:
		- fixed a bug where it had `will_not_call_mercury'
		  instead of `may_call_mercury' in a couple of places;
		  the fix was to change the code to use
		  MR_make_aligned_string_copy() rather than
		  calling builtin__copy/2 via `pragma export'.
		- fixed a bug where it was casting an `Integer *'
		  to a `time_t *'; that won't have the desired effect
		  if `Integer' and `time_t' are not the same size.
	    * interface changes:
	        - change the return type for difftime/1 from `int' to `float',
		  to match the C `double' return type of the C difftime()
		  function.
	        - add the gmtime/1 function
	        - make the `time_t' type abstract rather than concrete
	        - changed the clocks_per_second procedure so that
		  it did not take any io__state arguments
		  (that's safe, since ANSI/ISO C requires the
		  CLOCKS_PER_SECOND macro to be a constant expression)
		- changed the DST field from `maybe(bool)' to `maybe(dst)',
		  where `dst' is an enumeration type defined as
		  `:- type dst ---> standard_time ; daylight_time'.
		- changed all the predicates with only one output
		  into functions
		- changed the code to throw exceptions on failure rather
		  than returning results of type `io__res(T)'
	    * stylistic changes:
		- declare the fields of the tms struct as type clock_t
		  rather than int
		- modified the layout to match our standard
		  Mercury coding conventions
		- added a more detailed copyright notice
		- added a few XXX comments about potential problems

library/library.m:
	Add `time' to the list of modules in the standard library.

configure.in:
runtime/mercury_conf.h.in:
	Add autoconf checks for <sys/times.h> and for the
	struct tms type and the times() function.

runtime/mercury_string.h:
runtime/mercury_bootstrap.h:
	Add `MR_' prefixes to a few macros, so that the new code
	in library/time.m doesn't need to use the old versions.
	Define the old names as aliases for the new ones
	in mercury_bootstrap.h.

NEWS:
w3/news/newsdb.inc:
	Mention the new module.
1999-10-28 06:27:20 +00:00
Zoltan Somogyi
bca4cb8162 Make the entire Mercury system bootstrap without backwards compatbility.
Estimated hours taken: 12

Make the entire Mercury system bootstrap without backwards compatbility.

compiler/llds_out.m:
compiler/pragma_c_gen.m:
	Add MR_ prefixes on generated code.

library/builtin.m:
library/exception.m:
library/private_builtin.m:
library/std_util.m:
library/store.m:
	Add MR_ prefixes.

runtime/mercury_deep_copy_body.h:
runtime/mercury_heap.h:
runtime/mercury_stacks.h:
runtime/mercury_string.h:
runtime/mercury_tags.h:
runtime/mercury_type_info.h:
	Add MR_ prefixes.

runtime/mercury_bootstrap.c;
runtime/mercury_engine.c;
runtime/mercury_ho_call.c;
runtime/mercury_tabling.c;
runtime/mercury_trace_base.c;
runtime/mercury_type_info.c;
runtime/mercury_wrapper.c;
	Add MR_ prefixes. In some cases, fix indentation; in functions using
	four-space indentation, convert all tabs to spaces.

trace/mercury_trace_browse.m:
trace/mercury_trace_declarative.m:
trace/mercury_trace_external.m:
trace/mercury_trace_help.m:
	Add MR_ prefixes.
1999-09-27 05:20:58 +00:00
Warwick Harvey
3b26c80e4e Changed all type_ctor_info structures to use the MR_TypeCtorInfo type.
Estimated hours taken: 16

Changed all type_ctor_info structures to use the MR_TypeCtorInfo type.  This
is primarily to reduce the number of conflicts when merging independent
changes to the type_ctor_info structures.  As part of this, changed the type
of `string_const' to be `String' rather than `Word *', to avoid type errors
in the initialisers for compiler-generated type_ctor_infos.

compiler/llds_out.m:
	Don't emit definitions for type_ctor_info structs; instead use
	`MR_TypeCtorInfo_struct'.
	Removed a couple of casts of `string_const's to type `String', since
	they are no longer necessary.

compiler/llds.m:
	Changed the entries for `string_const' and `multi_string_const' in
	llds__const_type/2 to be `string' rather than `data_ptr'.

library/array.m:
library/builtin.m:
library/private_builtin.m:
library/std_util.m:
runtime/mercury_bootstrap.c:
runtime/mercury_type_info.c:
	Changed all the hand-defined type_ctor_info structures to just use
	`MR_TypeCtorInfo_struct', and added appropriate casts to the
	initialisers.  This included removing what appears to have been the
	last vestiges of `USE_TYPE_LAYOUT' conditionals since their use was
	so broken that it would probably be easier to re-implement the same
	functionality from scratch than to debug and rebuild on what was left.

runtime/mercury_type_info.h:
	Introduced `struct MR_TypeCtorInfo_struct' as the name of the
	(previously anonymous) struct which `MR_TypeCtorInfo' was a pointer
	to.
	Introduced `MR_DECLARE_TYPE_CTOR_INFO_STRUCT' for declaring
	type_ctor_info structures, since `MR_DECLARE_STRUCT' generates
	old-style names for type_ctor_infos.

runtime/mercury_deep_copy.c:
runtime/mercury_tabling.c:
library/std_util.m:
extras/exceptions/exception.m:
	Changed some uses of `MR_DECLARE_STRUCT' to use
	`MR_DECLARE_TYPE_CTOR_INFO_STRUCT' instead.

runtime/mercury_bootstrap.h:
	Added some `#define's of some old type_ctor_info type names to be
	`MR_TypeCtorInfo_struct', so that during bootstrapping the type
	names generated by the old version of the compiler work with the new
	scheme used in the manual definitions.

runtime/mercury_string.h:
	Changed the type of the macro `string_const/2'.  It used to cast to
	`Word *', now it casts to `String'.
1999-08-12 09:58:49 +00:00
Fergus Henderson
875c9de295 Improve efficiency on SPARCs, by replacing calls to save_transient_registers()
Estimated hours taken: 0.5

Improve efficiency on SPARCs, by replacing calls to save_transient_registers()
with calls to save_transient_hp() where appropriate -- the latter is defined
to do nothing for conservative GC.

runtime/mercury_regs.h:
	Add save_transient_hp() and restore_transient_hp() macros
	which only guarantee to save/restore the heap pointer.

runtime/mercury_deep_copy.c:
runtime/mercury_deep_copy.h:
runtime/mercury_heap.h:
runtime/mercury_layout_util.c:
runtime/mercury_string.h:
runtime/mercury_tabling.h:
runtime/mercury_type_info.c:
	Use save_transient_hp() instead of save_transient_regs()
	and restore_transient_hp() instead of restore_transient_regs().

runtime/mercury_trail.c:
runtime/mercury_trail.h:
	Comment out a pair of calls to save/restore_transient_regs(),
	add a similar pair, also commented out, and document why
	neither pair is needed.

runtime/mercury_type_info.c:
	Add some comments.
1999-03-10 22:05:27 +00:00