mercury

mirror of https://github.com/Mercury-Language/mercury.git synced 2025-12-12 04:14:06 +00:00

Author	SHA1	Message	Date
Peter Wang	dcd969f61e	Optimise some UTF-8 routines in C grades and fix a few bugs. Branches: main, 11.07 Optimise some UTF-8 routines in C grades and fix a few bugs. library/string.m: Avoid function calls in unsafe_index, unsafe_index_next, and unsafe_prev_index in the ASCII case. Handle illegal code unit at start of string in first_char(in, uo, in) and first_char(in, uo, uo) modes. runtime/mercury_string.c: runtime/mercury_string.h: Fix a bug where MR_utf8_next would not advance from pos 0. Fortunately MR_utf8_next is only rarely called, to skip past illegal code units. Delete redundant initial test in MR_utf8_prev. Add MR_utf8_get_mb to extract multibyte code points only. Unroll a loop. Add MR_utf8_get_next_mb to extract multibyte code points only. Make MR_utf8_prev_get avoid an extra function call in the ASCII case. Use MR_Integer consistently for string offsets instead of int.	2012-03-26 06:57:34 +00:00
Peter Wang	7e26b55e74	Implement a new form of memory profiling, which tells the user what memory Branches: main Implement a new form of memory profiling, which tells the user what memory is being retained during a program run. This is done by allocating an extra word before each cell, which is used to "attribute" the cell to an allocation site. The attribution, or "allocation id", is an address to an MR_AllocSiteInfo structure generated by the Mercury compiler, giving the procedure, filename and line number of the allocation, and the type constructor and arity of the cell that it allocates. The user must manually instrument the program with calls to `benchmarking.report_memory_attribution', which forces a GC and summarises the live objects on the heap using the attributions. The mprof tool is extended with a new mode to parse and present that data. Objects which are unattributed (e.g. by hand-written C code which hasn't been updated) are still accounted for, but show up in profiles as "unknown". Currently this profiling mode only works in conjunction with the Boehm garbage collector, though in principle it can work with any memory allocator for which we can access a list of the live objects. Since term size profiling relies on the same technique of using an extra word per memory cell, the two profiling modes are incompatible. The output from `mprof -s' looks like this:
------ [1] some label ------
    cells           words          cumul  procedure / type (location)
    14150           38872                 total
*    1949/ 13.8%     4872/ 12.5%   12.5%  <predicate `parser.parse_rest/7' mode 0>
      975/  6.9%     1950/  5.0%          list.list/1 (parser.m:502)
      487/  3.4%     1948/  5.0%          term.term/1 (parser.m:501)
      487/  3.4%      974/  2.5%          term.const/0 (parser.m:501)
*    1424/ 10.1%     4272/ 11.0%   23.5%  <predicate `parser.parse_simple_term_2/6' mode 0>
      708/  5.0%     2832/  7.3%          term.term/1 (parser.m:643)
      708/  5.0%     1416/  3.6%          term.const/0 (parser.m:643)
...
boehm_gc/alloc.c: boehm_gc/include/gc.h: boehm_gc/misc.c: boehm_gc/reclaim.c: Add a callback function to be called for every live object after a GC. Add a function to write out the GC_size_map array. compiler/layout.m: Define the alloc_site_info type which is equivalent to the MR_AllocSiteInfo C structure. Add alloc_site_array as a kind of "layout" array. compiler/llds.m: Add allocation sites to `cfile' structure. Replace TypeMsg argument (which was also for profiling) on `incr_hp' instructions by an allocation site identifier. Add a new foreign_proc_component for allocation site ids. compiler/code_info.m: compiler/global_data.m: compiler/proc_gen.m: Keep the set of allocation sites in the code_info and global_data structures. compiler/unify_gen.m: Add allocation sites to LLDS allocation instructions. compiler/layout_out.m: compiler/llds_out_file.m: compiler/llds_out_instr.m: Output MR_AllocSiteInfo arrays in generated C files. Output code to register the MR_AllocSiteInfo array with the Mercury runtime. Output allocation site ids for memory allocation instructions. compiler/llds_out_util.m: Add allocation sites to llds_out_info. compiler/pragma_c_gen.m: compiler/ml_foreign_proc_gen.m: Generate a macro MR_ALLOC_ID which resolves to an allocation site structure, for every foreign_proc whose C code contains the string "MR_ALLOC_ID". This is to be used by hand-written C code which allocates memory. MR_PROC_LABELs are retained for backwards compatibility. Though they were introduced for profiling, they seem to have been co-opted for printf-debugging since then. compiler/ml_global_data.m: Add allocation site structures to the MLDS global data. compiler/mlds.m: compiler/ml_unify_gen.m: Add allocation site id to `new_object' instruction. compiler/mlds_to_c.m: Output allocation site arrays and allocation ids in high-level C code. Output a call to register the allocation site array with the Mercury runtime. Delete an unused predicate. 
compiler/exprn_aux.m: compiler/jumpopt.m: compiler/livemap.m: compiler/mercury_compile_llds_back_end.m: compiler/middle_rec.m: compiler/ml_accurate_gc.m: compiler/ml_elim_nested.m: compiler/ml_optimize.m: compiler/ml_util.m: compiler/mlds_to_cs.m: compiler/mlds_to_gcc.m: compiler/mlds_to_il.m: compiler/mlds_to_java.m: compiler/mlds_to_managed.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/use_local_vars.m: compiler/var_locn.m: Conform to changes. compiler/pickle.m: compiler/prog_event.m: compiler/timestamp.m: Conform to changes in memory allocation macros. library/benchmarking.m: Add the `report_memory_attribution' instrumentation predicates. Conform to changes to MR_memprof_record. library/array.m: library/bit_buffer.m: library/bitmap.m: library/construct.m: library/deconstruct.m: library/dir.m: library/io.m: library/mutvar.m: library/store.m: library/string.m: library/thread.semaphore.m: library/version_array.m: Use attributed memory allocation throughout the standard library so that objects don't show up in the memory profile as "unknown". Replace MR_PROC_LABEL by MR_ALLOC_ID. mdbcomp/program_representation.m: mdbcomp/rtti_access.m: Replace MR_PROC_LABEL by MR_ALLOC_ID. profiler/Mercury.options: profiler/globals.m: profiler/mercury_profile.m: profiler/options.m: profiler/output.m: profiler/snapshots.m: Add a new mode to `mprof' to parse and present the data from `Prof.Snapshots' files. Add options for the new profiling mode. profiler/process_file.m: Fix a typo. runtime/mercury_conf_param.h: #define MR_MPROF_PROFILE_MEMORY_ATTRIBUTION if memory profiling is enabled and we are using Boehm GC. runtime/mercury.h: Make MR_new_object take an allocation id argument. Conform to changes in memory allocation macros. runtime/mercury_memory.c: runtime/mercury_memory.h: runtime/mercury_types.h: Define MR_AllocSiteInfo. Add memory allocation functions and macros which take into account the additional word necessary for the new profiling mode. 
These should be used in preference to the raw memory allocation functions wherever possible so that objects do not show up in the profile as "unknown". Add analogues of realloc/free which take into account the offset introduced by the attribution word. Add function versions of the MR_new_object macros, which can't be written in standard C. They are only used when necessary. Add built-in allocation site ids, to be used in the runtime and other hand-written code when context-specific ids are unavailable. runtime/mercury_heap.h: Make MR_tag_offset_incr_hp_msg and MR_tag_offset_incr_hp_atomic_msg allocate an extra word when memory attribution is desired, and store the allocation id there. Similarly for MR_create{1,2,3}_msg. Replace proclabel arguments in allocation macros by alloc_id arguments. Replace MR_hp_alloc_atomic by MR_hp_alloc_atomic_msg. It was only used for boxing floats. Conform to change to MR_new_object macro. runtime/mercury_bootstrap.h: Delete obsolete macro hp_alloc_atomic. runtime/mercury_heap_profile.c: runtime/mercury_heap_profile.h: Add the code to summarise the live objects on the Boehm GC heap and write out the data to `Prof.Snapshots', for display by mprof. Don't store the procedure name in MR_memprof_record: the procedure address is enough and faster to compare. runtime/mercury_prof.c: Finish and close the `Prof.Snapshots' file when the program terminates. Conform to changes in MR_memprof_record. runtime/mercury_misc.h: Add a macro to expand to the name of the allocation sites array in LLDS grades. runtime/mercury_bitmap.c: runtime/mercury_bitmap.h: Pass allocation id through bitmap allocation functions. Delete unused function MR_string_to_bitmap. runtime/mercury_string.h: Add MR_make_aligned_string_copy_msg. Make string allocation macros take allocation id arguments. 
runtime/mercury.c: runtime/mercury_array_macros.h: runtime/mercury_context.c: runtime/mercury_deconstruct.c: runtime/mercury_deconstruct_macros.h: runtime/mercury_dlist.c: runtime/mercury_engine.c: runtime/mercury_float.h: runtime/mercury_hash_table.c: runtime/mercury_ho_call.c: runtime/mercury_label.c: runtime/mercury_prof_mem.c: runtime/mercury_stacks.c: runtime/mercury_stm.c: runtime/mercury_string.c: runtime/mercury_thread.c: runtime/mercury_trace_base.c: runtime/mercury_trail.c: runtime/mercury_type_desc.c: runtime/mercury_type_info.c: runtime/mercury_wsdeque.c: Use attributed memory allocation throughout the runtime so that objects don't show up in the profile as "unknown". runtime/mercury_memory_zones.c: Attribute memory zones to the Mercury runtime. runtime/mercury_tabling.c: runtime/mercury_tabling.h: Use attributed memory allocation macros for tabling structures. Delete unused MR_table_realloc_* and MR_table_copy_bytes macros. runtime/mercury_deep_copy_body.h: Try to retain the original attribution word when copying values. runtime/mercury_ml_expand_body.h: Conform to changes in memory allocation macros. runtime/mercury_tags.h: Replace proclabel arguments by alloc_id arguments in allocation macros. runtime/mercury_wrapper.c: If memory attribution is enabled, tell Boehm GC that pointers may be displaced by an extra word. trace/mercury_trace.c: trace/mercury_trace_tables.c: Conform to changes in memory allocation macros. extras/net/tcp.m: extras/solver_types/library/any_array.m: extras/trailed_update/tr_array.m: Conform to changes in memory allocation macros. doc/user_guide.texi: Document the new profiling mode. doc/reference_manual.texi: Update a commented out example.	2011-05-20 04:16:58 +00:00
Peter Wang	3788a9d6fb	Improve Unicode support. Branches: main Improve Unicode support. Declare that we use the Unicode character set, and UTF-8 or UTF-16 for the internal string representation (depending on the backend). User code may be written to those assumptions. Other external encodings can be supported in the future by translating to/from Unicode internally. The `char' type now represents a Unicode code point. NOTE: questions about how to handle unpaired surrogate code points, etc. have been left for later. library/char.m: Define a `char' to be a Unicode code point and extend ranges appropriately. Add predicates: to_utf8, to_utf16, is_surrogate, is_noncharacter. Update some documentation. library/io.m: Declare I/O predicates on text streams to read/write code points, not ambiguous "characters". Text files are expected to use UTF-8 encoding. Supporting other encodings is for future work. Update the C and Erlang implementations to understand UTF-8 encoding. Update Java and C# implementations to read/write code points (Mercury char) instead of UTF-16 code units. Add `may_not_duplicate' attributes to some foreign_procs. Improve Erlang implementations of seeking and getting the stream size. library/string.m: Declare the string representations, as described earlier. Distinguish between code units and code points everywhere. Existing functions and predicates which take offset and length arguments continue to take them in terms of code units. Add procedures: count_code_units, count_codepoints, codepoint_offset, to_code_unit_list, from_code_unit_list, index_next, unsafe_index_next, unsafe_prev_index, unsafe_index_code_unit, split_by_codepoint, left_by_codepoint, right_by_codepoint, substring_by_codepoint. Make index, index_det call error/1 if an illegal sequence is detected, as they already do for invalid offsets. Clarify that is_all_alpha, is_all_alnum_or_underscore, is_alnum_or_underscore only succeed for the ASCII characters under each of those categories. 
Clarify that whitespace stripping functions only strip whitespace characters in the ASCII range. Add comments about the future treatment of surrogate code points (not yet implemented). Use Mercury format implementation when necessary instead of `sprintf'. The %c specifier does not work for code points which require multi-byte representation. The field width modifier for %s only works if the string contains only single-byte code points. library/lexer.m: Conform to string encoding changes. Simplify code dealing with \uNNNN escapes now that encoding/decoding is handled by the string module. library/term_io.m: Allow code points above 126 directly in Mercury source. NOTE: \x and \o codes are treated as code points by this change. runtime/mercury_types.h: Redefine `MR_Char' to be `int' to hold a Unicode code point. `MR_String' has to be defined as a pointer to `char' instead of a pointer to `MR_Char'. Some C foreign code will be affected by this change. runtime/mercury_string.c: runtime/mercury_string.h: Add UTF-8 helper routines and macros. Make hash routines conform to type changes. compiler/c_util.m: Fix output_quoted_string_lang so that it correctly outputs non-ASCII characters for each of the target languages. Fix quote_char for non-ASCII characters. compiler/elds_to_erlang.m: Write out code points above 126 normally instead of using escape syntax. Conform to string encoding changes. compiler/mlds_to_cs.m: Change Mercury `char' to be represented by C# `int'. compiler/mlds_to_java.m: Change Mercury `char' to be represented by Java `int'. doc/reference_manual.texi: Uncomment description of \u and \U escapes in string literals. Update description of C# and Java representations for Mercury `char' which are now `int'. tests/debugger/tailrec1.m: Conform to renaming. tests/general/string_replace.exp: tests/general/string_replace.m: Test non-ASCII characters to string.replace. 
tests/general/string_test.exp: tests/general/string_test.m: Test non-ASCII characters to string.duplicate_char, string.pad_right, string.pad_left, string.format_table. tests/hard_coded/char_unicode.exp: tests/hard_coded/char_unicode.m: Add test for new procedures in `char' module. tests/hard_coded/contains_char_2.m: Test non-ASCII characters to string.contains_char. tests/hard_coded/nonascii.exp: tests/hard_coded/nonascii.m: tests/hard_coded/nonascii_gen.c: Add code points above 255 to this test case. Change test data encoding to UTF-8. tests/hard_coded/string_class.exp: tests/hard_coded/string_class.m: Add test case for string.is_alpha, etc. tests/hard_coded/string_codepoint.exp: tests/hard_coded/string_codepoint.exp2: tests/hard_coded/string_codepoint.m: Add test case for new string procedures dealing with code points. tests/hard_coded/string_first_char.exp: tests/hard_coded/string_first_char.m: Add test case for all modes of string.first_char. tests/hard_coded/string_hash.m: Don't use buggy random.random/5 predicate which can overflow on a large range (such as the range of code points). tests/hard_coded/string_presuffix.exp: tests/hard_coded/string_presuffix.m: Add test case for string.prefix, string.suffix, etc. tests/hard_coded/string_set_char.m: Test non-ASCII characters to string.set_char. tests/hard_coded/string_strip.exp: tests/hard_coded/string_strip.m: Test non-ASCII characters to string stripping procedures. tests/hard_coded/string_sub_string_search.m: Test non-ASCII characters to string.sub_string_search. tests/hard_coded/unicode_test.exp: Update expected output due to change of behaviour of `string.to_char_list'. tests/hard_coded/unicode_test.m: Test non-ASCII character in separator string argument to string.join_list. tests/hard_coded/utf8_io.exp: tests/hard_coded/utf8_io.m: Add tests for UTF-8 I/O. tests/hard_coded/words_separator.exp: tests/hard_coded/words_separator.m: Add test case for `string.words_separator'. 
tests/hard_coded/Mmakefile: Add new test cases. Make special_char test case run on all backends. tests/hard_coded/special_char.exp: tests/valid/mercury_java_parser_follow_code_bug.m: Reencode these files in UTF-8. NEWS: Add a news entry.	2011-04-04 07:10:42 +00:00
Peter Wang	382013edd8	Add missing non-macro definitions of MR_hash_string2, MR_hash_string3. Branches: main, 11.01 runtime/mercury_string.c: runtime/mercury_string.h: Add missing non-macro definitions of MR_hash_string2, MR_hash_string3. These are required for non-gcc compilers.	2011-02-09 00:36:01 +00:00
Zoltan Somogyi	c959a1657f	Convert all remaining C source files to four-space indentation. Estimated hours taken: 0.5 Branches: main runtime/*.c: Convert all remaining C source files to four-space indentation. Fix some deviations from our style guide.	2006-11-14 00:15:41 +00:00
Simon Taylor	40e515eb31	Fix bug which caused the results of MR_hash_string() and Estimated hours taken: 0.5 Branches: main, release runtime/mercury_string.h: runtime/mercury_string.c: Fix bug which caused the results of MR_hash_string() and string__hash to differ when `sizeof(int) != sizeof(MR_Integer)'. runtime/mercury_misc.c: MR_hash_string() was defined here for non-GCC compilers for historical reasons. Move it to mercury_string.c.	2002-11-22 15:01:10 +00:00
Simon Taylor	b7c4a317e9	Add MR_ prefixes to the remaining non-prefixed symbols. Estimated hours taken: 4 Branches: main Add MR_ prefixes to the remaining non-prefixed symbols. This change will require all workspaces to be updated. The compiler will start generating references to MR_TRUE, MR_bool, etc., which are not defined in the old runtime header files. runtime/mercury_std.h: Add MR_ prefixes to bool, TRUE, FALSE, max, min, streq, strdiff, strtest, strntest, strneq, strndiff, strntest, NO_RETURN. Delete a commented out definition of `reg'. runtime/mercury_tags.h: Add an MR_ prefix to TAGBITS. configure.in: runtime/mercury_goto.h: runtime/machdeps/i386_regs.h/mercury_goto.h: Add an MR_ prefix to PIC. runtime/mercury_conf_param.h: Allow non-prefixed PIC and HIGHTAGS to be defined on the command line. runtime/mercury_bootstrap.h: Add backwards compatibility definitions. RESERVED_MACRO_NAMES: Remove the renamed macros. compiler/export.m: compiler/ml_code_gen.m: Use MR_bool rather than MR_Bool (MR_Bool is meant to be for references to the Mercury type bool__bool). runtime/mercury_types.h: Add a comment that MR_Bool is for references to bool__bool. /.c: /.h: /.m: Add MR_ prefixes.	2002-02-18 07:01:33 +00:00
Simon Taylor	c66cea0665	Add MR_ prefixes to uses of configuration macros. Estimated hours taken: 2.5 Branches: main Add MR_ prefixes to uses of configuration macros. Bootcheck now succeeds with MR_NO_CONF_BACKWARDS_COMPAT. Mmake.common.in: Define MR_NO_CONF_BACKWARDS_COMPAT when checking for namespace cleanliness. RESERVED_MACRO_NAMES: Remove the configuration macros. runtime/mercury_conf_bootstrap.h: Remove a duplicate definition of BOXED_FLOAT. configure.in: /.c: /.h: /.m: Add MR_ prefixes.	2002-02-13 09:56:49 +00:00
Peter Ross	eeee9be32e	Fix a bug where MR_make_string introduced a space leak under Windows Estimated hours taken: 1 runtime/mercury_string.c: Fix a bug where MR_make_string introduced a space leak under Windows when the created string was greater than the fixed buffer size, as the conditional compilation wasn't including the call to MR_free.	2001-01-08 14:46:12 +00:00
Zoltan Somogyi	3f18f1c4f4	Add some missing MR_ prefixes. Estimated hours taken: 0.2 runtime/*.[ch]: Add some missing MR_ prefixes.	2000-11-28 04:31:50 +00:00
Zoltan Somogyi	090552c993	Make everything in the runtime use MR_ prefixes, and make the compiler Estimated hours taken: 10 Make everything in the runtime use MR_ prefixes, and make the compiler bootstrap with -DMR_NO_BACKWARDS_COMPAT. runtime/mercury_.[ch] Add MR_ prefixes to all functions, global variables and almost all macros that could pollute the namespace. The (intentional) exceptions are 1. some function, variable, type and label names that already start with MR_, mercury_, Mercury or _entry; 2. some standard C macros in mercury_std.h; 3. the macros used in autoconfiguration (since they are used in scripts as well as the runtime, the MR_ prefix may not be appropriate for those). In some cases, I deleted things instead of adding prefixes if the "things" were obsolete and not user visible. runtime/mercury_bootstrap.h: Provide MR_-less forms of the macros for bootstrapping and for backward compatibility for user code. runtime/mercury_debug.[ch]: Add a FILE parameter to a function that needs it. compiler/code_info.m: compiler/export.m: compiler/fact_table.m: compiler/llds.m: compiler/llds_out.m: compiler/pragma_c_gen.m: compiler/trace.m: Add MR_ prefixes to the C code generated by the compiler. library/.m: Add MR_ prefixes to handwritten code. trace/mercury_trace_.c: util/mkinit.c: Add MR_ prefixes as necessary. extras/concurrency/semaphore.m: Add MR_ prefixes as necessary.	2000-11-23 02:01:11 +00:00
Fergus Henderson	31eadee813	Fix a bug in petdr's recent changes to string__format that Estimated hours taken: 1.5 Fix a bug in petdr's recent changes to string__format that broke things in non-gc grades on sparcs. runtime/mercury_string.c: In MR_make_string(), wrap calls to restore/save_transient_hp() around the call to MR_make_aligned_string_msg(), as mentioned in the documentation for MR_make_aligned_string_msg(). runtime/mercury_string.h: Document that calls to MR_make_string need to be wrapped inside calls to save/restore_transient_hp(). library/string.m: Wrap calls to MR_make_string inside calls to save/restore_transient_hp().	2000-09-14 15:24:51 +00:00
Peter Ross	8ebcd2f2e5	Fix two bugs: Estimated hours taken: 8 runtime/mercury_string.c: Fix two bugs: - the variable size should hold the size of the fixed array initially - when resizing the array record the pointer to the new resized array.	2000-08-21 17:26:08 +00:00
Peter Ross	ccd39478c2	Simplify the logic of MR_make_string as suggested by Fergus Estimated hours taken: 0.1 runtime/mercury_string.c: Simplify the logic of MR_make_string as suggested by Fergus Henderson <fjh@cs.mu.oz.au>.	2000-08-16 16:48:42 +00:00
Peter Ross	b5c0e0d4cf	Modify MR_make_string so that it first tries printing into a Estimated hours taken: 0.5 runtime/mercury_string.c: Modify MR_make_string so that it first tries printing into a fixed size buffer. Only if that buffer is not big enough do we allocate a buffer on the heap.	2000-08-16 16:02:21 +00:00
Peter Ross	5983494ac0	Check for the availability of _vsnprintf. Estimated hours taken: 0.5 Check for the availability of _vsnprintf. configure.in: Check for the availability of _vsnprintf. runtime/mercury_conf.h.in: Define HAVE__VSNPRINTF. runtime/mercury_string.c: If vsnprintf isn't available and _vsnprintf is, use _vsnprintf in place of vsnprintf.	2000-08-11 13:49:58 +00:00
Peter Ross	522a35ad96	Remove a cast to (char *). Estimated hours taken: 0.1 runtime/mercury_string.c: Remove a cast to (char *).	2000-08-10 09:37:32 +00:00
Peter Ross	6f88f83f55	Obey the mercury C coding convention. Estimated hours taken: 0.1 runtime/mercury_string.c: Obey the mercury C coding convention.	2000-08-10 09:31:22 +00:00
Peter Ross	af42c5093f	s/HAVE_VFPRINTF/HAVE_VSNPRINTF/ Estimated hours taken: 0.1 mercury_string.c: s/HAVE_VFPRINTF/HAVE_VSNPRINTF/	2000-08-10 09:23:36 +00:00
Peter Ross	c848ef1e93	Reimplement string__format so that it uses less memory and its Estimated hours taken: 24 Reimplement string__format so that it uses less memory and its behaviour is closer to the C standard. library/string.m: Reimplement string__format. The new implementation parses the format string building a control structure. The control structure is then traversed to generate the output. Implement string__to_char_list in C. Implement string__append_list in C as this results in no garbage being generated due to the creation of the intermediate strings, which can be quite a significant saving. Remove the string__append_list(out, in) mode as it generates an infinite number of solutions. Use MR_allocate_aligned_string_msg for all memory allocations. runtime/mercury_string.h: Add a new macro MR_allocate_aligned_string_msg which allocates word-aligned space in which to store a string. runtime/mercury_string.c: Define a new function MR_make_string which provides sprintf-like functionality for creating MR_Strings. This function is safe from buffer overflows provided the vsnprintf function is available. runtime/Mmakefile: Add mercury_string.c. configure.in: Check for the vsnprintf function. runtime/mercury_conf.h.in: Define HAVE_VSNPRINTF.	2000-08-10 09:01:17 +00:00
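
Several of the commits above describe MR_make_string as a sprintf-like function that first formats into a fixed-size buffer and only allocates a larger buffer when the result does not fit. The following is a minimal sketch of that general strategy, not the runtime's actual code. It assumes C99 vsnprintf semantics (the return value is the length that would have been written), uses plain malloc in place of the Mercury allocation macros, and the names make_string_sketch and SKETCH_FIXED_SIZE are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size; not the value used by the Mercury runtime. */
#define SKETCH_FIXED_SIZE 64

/*
** Format into a fixed-size stack buffer first; fall back to formatting
** directly into a heap buffer only when the result is too long.
** Returns a malloc'd string that the caller must free, or NULL on failure.
*/
char *
make_string_sketch(const char *fmt, ...)
{
    char    fixed[SKETCH_FIXED_SIZE];
    char    *result;
    int     needed;
    va_list ap;

    /* First attempt: the fixed buffer. vsnprintf never overruns it. */
    va_start(ap, fmt);
    needed = vsnprintf(fixed, sizeof(fixed), fmt, ap);
    va_end(ap);
    if (needed < 0) {
        return NULL;
    }

    result = malloc((size_t) needed + 1);
    if (result == NULL) {
        return NULL;
    }

    if ((size_t) needed < sizeof(fixed)) {
        /* The output fit: copy it, including the terminating NUL. */
        memcpy(result, fixed, (size_t) needed + 1);
    } else {
        /* The output was truncated: format again into the heap buffer. */
        va_start(ap, fmt);
        vsnprintf(result, (size_t) needed + 1, fmt, ap);
        va_end(ap);
    }
    return result;
}
```

A call such as make_string_sketch("%d-%s", 42, "abc") stays on the fixed-buffer path, while any output of SKETCH_FIXED_SIZE characters or more takes the second path. Older vsnprintf implementations that return -1 on truncation would need a grow-and-retry loop instead.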

20 Commits
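
The most recent commit in the list describes handling the ASCII case inline and leaving multibyte code points to a separate routine (MR_utf8_get_mb, MR_utf8_get_next_mb). The sketch below illustrates that split in general terms and is not the code from runtime/mercury_string.c. The function names are hypothetical, ptrdiff_t stands in for MR_Integer as the offset type, and the choice to return -1 without advancing on an illegal sequence is an assumption made for this example.

```c
#include <stddef.h>

/*
** Decode a multibyte UTF-8 sequence starting at s[*pos] in a
** NUL-terminated string. On success, advance *pos past the sequence and
** return the code point. On an illegal, truncated or overlong sequence,
** return -1 and leave *pos unchanged. Surrogate code points are not
** treated specially here.
*/
static int
utf8_get_next_mb_sketch(const char *s, ptrdiff_t *pos)
{
    const unsigned char *p = (const unsigned char *) s + *pos;
    int c = p[0];
    int len;
    int min;
    int i;

    if (c >= 0xC2 && c <= 0xDF) {
        len = 2; c &= 0x1F; min = 0x80;
    } else if (c >= 0xE0 && c <= 0xEF) {
        len = 3; c &= 0x0F; min = 0x800;
    } else if (c >= 0xF0 && c <= 0xF4) {
        len = 4; c &= 0x07; min = 0x10000;
    } else {
        /* Stray continuation byte or invalid lead byte. */
        return -1;
    }

    for (i = 1; i < len; i++) {
        /* A NUL terminator fails this test, so we never read past it. */
        if ((p[i] & 0xC0) != 0x80) {
            return -1;
        }
        c = (c << 6) | (p[i] & 0x3F);
    }
    if (c < min || c > 0x10FFFF) {
        /* Overlong encoding or beyond the Unicode range. */
        return -1;
    }
    *pos += len;
    return c;
}

/*
** Fast path: a byte below 0x80 is a complete ASCII code point, so it is
** returned without calling the multibyte decoder.
*/
static int
utf8_get_next_sketch(const char *s, ptrdiff_t *pos)
{
    unsigned char c = (unsigned char) s[*pos];

    if (c < 0x80) {
        (*pos)++;
        return c;
    }
    return utf8_get_next_mb_sketch(s, pos);
}
```

Callers are expected to stop at the terminating NUL themselves, since the fast path treats it like any other ASCII byte. Keeping the common case to a comparison and an increment is the point of the split: only non-ASCII input pays for the extra function call.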