mercury

mirror of https://github.com/Mercury-Language/mercury.git synced 2026-04-15 09:23:44 +00:00

Author	SHA1	Message	Date
Zoltan Somogyi	386160f937	s/dont/do_not/ in the compiler directory. compiler/*.m: Standardize on the do_not spelling over the dont contraction in the compiler directory. (We used to have a lot of both spellings.)	2024-08-12 12:49:23 +02:00
Zoltan Somogyi	28ab8c2ade	Group together related builtin operations. compiler/builtin_ops.m: Replace six individual builtin comparison ops for str_{eq,ne,lt,le,gt,ge} with a single str_cmp/1 function symbol, whose argument is one of {eq,ne,lt,le,gt,ge}. Do the same with comparison operations on integers (including the operations that compare signed integers as if they were unsigned) and floats. The eq and ne operations on integers had names that did not fit into the scheme used by the other binops; this diff fixes that. Replace five individual builtin arithmetic ops for int_{add,sub,mul,div,mod} with a single int_arith/2 function symbol, one of whose arguments is one of {add,sub,mul,div,rem}. (This diff renames the "mod" (modulus) op to "rem" (remainder), as an XXX has been asking for a long time.) The other argument specifies which integer type the operation is on. Do a similar change for float arithmetic ops, with the exception that floats don't support the remainder op. The points of the above changes are - to allow us to factor out commonalities between operations, both between e.g. all comparison operations on integers, and between e.g. lt comparisons on values of different types. - to stop forcing switches on binops to make distinctions that they do not actually care about. Rename the old str_cmp op, which returns a negative, zero or positive result (as does strcmp in C) to str_nzp, since the str_cmp name is now used for something else. Add some utility functions here, to allow the deletion of the many existing copies of the bodies of those functions elsewhere in the compiler.
compiler/closure_gen.m: compiler/code_util.m: compiler/dense_switch.m: compiler/disj_gen.m: compiler/ite_gen.m: compiler/jumpopt.m: compiler/llds.m: compiler/llds_out_data.m: compiler/lookup_switch.m: compiler/middle_rec.m: compiler/ml_disj_gen.m: compiler/ml_foreign_proc_gen.m: compiler/ml_global_data.m: compiler/ml_lookup_switch.m: compiler/ml_optimize.m: compiler/ml_simplify_switch.m: compiler/ml_string_switch.m: compiler/ml_unify_gen.m: compiler/ml_unify_gen_test.m: compiler/mlds_dump.m: compiler/mlds_to_c_data.m: compiler/mlds_to_cs_data.m: compiler/mlds_to_java_data.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/peephole.m: compiler/pragma_c_gen.m: compiler/string_switch.m: compiler/tag_switch.m: compiler/trace_gen.m: compiler/transform_llds.m: compiler/unify_gen.m: compiler/unify_gen_test.m: Conform to the changes above, by either generating or consuming binops in their new form.	2024-07-13 15:02:08 +02:00
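The grouping described in 28ab8c2ade can be illustrated outside the compiler. The following is only a rough C analogue, not the Mercury definitions in builtin_ops.m: the enum `cmp_op` and the helper `apply_cmp` are hypothetical names, and `str_cmp`, `str_nzp` and `int_cmp` merely borrow the names used in the commit message. It shows how one comparison-kind argument lets string and integer comparisons share a single piece of decision code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the {eq,ne,lt,le,gt,ge} argument. */
typedef enum { CMP_EQ, CMP_NE, CMP_LT, CMP_LE, CMP_GT, CMP_GE } cmp_op;

/* Shared by every type: map a negative/zero/positive result to a boolean. */
bool apply_cmp(cmp_op op, int nzp)
{
    switch (op) {
        case CMP_EQ: return nzp == 0;
        case CMP_NE: return nzp != 0;
        case CMP_LT: return nzp < 0;
        case CMP_LE: return nzp <= 0;
        case CMP_GT: return nzp > 0;
        case CMP_GE: return nzp >= 0;
    }
    assert(false && "unknown cmp_op");
    return false;
}

/* Analogue of str_nzp: negative, zero or positive, as with strcmp. */
int str_nzp(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* Analogue of the grouped str_cmp(Op) operation. */
bool str_cmp(cmp_op op, const char *a, const char *b)
{
    return apply_cmp(op, str_nzp(a, b));
}

/* Integer comparisons reuse the same decision code. */
bool int_cmp(cmp_op op, long a, long b)
{
    return apply_cmp(op, (a > b) - (a < b));
}
```

A caller that only cares that an operation is "some string comparison" can now test for one case instead of six, which is the point the commit message makes about switches on binops.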
Zoltan Somogyi	2a63738b8e	Implement det/semidet string trie lookup switches. compiler/string_switch.m: Implement single-solution string trie lookup switches. The code managing the lookup table is new, while the code managing the trie search generalizes existing code. The latter required some redrawing of the predicate boundaries within that existing code, as well as adjusting some types and variable names. Include "jump" in the name of the non-lookup versions of string switches. Put state var arguments last in some predicate signatures. compiler/switch_gen.m: Enable single-solution string trie lookup switches. compiler/string_switch_util.m: Delete the call to build_str_case_id_list from the create_trie predicate, since it is needed only by its old caller, the implementation of string trie JUMP switches (which now does it itself), and not by its new caller, the implementation of string trie LOOKUP switches. compiler/lookup_util.m: compiler/code_util.m: Give some predicates more expressive names. compiler/code_loc_dep.m: compiler/disj_gen.m: compiler/jumpopt.m: compiler/lookup_switch.m: compiler/middle_rec.m: compiler/ml_string_switch.m: compiler/tag_switch.m: compiler/unify_gen_test.m: Conform to the changes above. compiler/hlds_goal.m: Fix a comment. tests/hard_coded/space.m: This test case caught a bug in an early version of this diff. Document this fact. Make the code more readable by - aligning the columns in some tables, - renaming some function symbols to avoid ambiguity, - replacing the remnants of calls to Prolog's "is" predicate with idiomatic Mercury code, and - deleting commented-out dead code that duplicated the body of a predicate. tests/hard_coded/Mercury.options: Make space.m's role as a test case for string trie switches official by compiling it with options that force trie switches.	2024-04-03 09:19:37 +11:00
Zoltan Somogyi	9dbee8bdb4	Implement trie string switches for the LLDS backend. For now, the implementation covers only non-lookup switches. compiler/builtin_ops.m: Generalize the existing offset_str_eq binary op by adding an optional size parameter, which, if present, restricts the equality test to look at the given number of code units at most. compiler/llds_out_data.m: compiler/mlds_to_c_data.m: Generalize the output of binop rvals whose operation is offset_str_eq. In llds_out_data.m, fix a bug in the original code. (This bug did not lead to problems because before this diff, we never generated this op.) compiler/string_switch_util.m: Add a predicate that recognizes when a trie node that is NOT a leaf nevertheless represents the top of a stick, which means that it has only one possible next code unit, which itself may have only one possible next code unit, and so on, until we reach a node that does have two or more next code units. (One of those may be the code unit of the string-ending NULL character.) compiler/ml_string_switch.m: Use the new predicate in string_switch_util.m to generate better code for sticks. Instead of comparing each character in the stick individually against the relevant code unit of the string being switched on, compare them all at once using the new binary op. compiler/ml_switch_gen.m: Insist on both the host machine and the target machine using the C backend. compiler/string_switch.m: Implement non-lookup trie switches. The code follows the approach used in ml_string_switch.m as much as possible, but there are plenty of differences caused by targeting the LLDS. Rename some predicates to specify which switch implementation method they belong to. Write a comment just once, and refer to it from elsewhere instead of duplicating it at each reference site. compiler/switch_gen.m: Enable the use of trie switches when the option values call for it, and when the switch is not a lookup switch. 
compiler/cse_detection.m: Do not flood the output of mmc -V with messages that have nothing to do with the module being compiled. compiler/options.m: Add a way to specify --no-allow-inlining on the command line. This can help debug code generator changes like this, by disallowing a transform that can modify the Mercury code whose compilation process you are trying to debug. (The documentation of the --inlining option implies that --no-inlining should do the same job, but it does not.) The option is not documented for users. compiler/string_encoding.m: Provide a version of from_code_unit_list_in_encoding that allows non-well-formed code unit sequences as input, and provide det versions of both versions. This is for use by both string_switch.m and ml_string_switch.m. compiler/hlds_goal.m: Document the properties of case_ids. compiler/llds.m: Document the possibility that string constants are not well formed. compiler/bytecode.m: compiler/code_util.m: compiler/mlds_dump.m: compiler/ml_global_data.m: compiler/mlds_to_cs_data.m: compiler/mlds_to_java_data.m: compiler/opt_debug.m: Conform to the changes above. library/string.m: Replace the non-exported test predicate internal_encoding_is_utf8 with an exported function that returns an enum specifying the string encoding. NEWS.md: Announce the new function. runtime/mercury_string.h: Add the C macro that implements the new form of the offset_str_eq binary op. tests/hard_coded/string_switch4.{m,exp}: We have long had three copies of the exact same code, in string_switch.m, string_switch2.m and string_switch3.m, which were compiled with - no smart switch implementation - smart switch implementation forced to use the hash table method - smart switch implementation forced to use binary search method Add this new copy, which is compiled with - smart switch implementation forced to use the new trie method tests/hard_coded/Mmakefile: Add the new test case. 
tests/hard_coded/Mercury.options: Update the options of the test cases, and specify them for the new one. tests/hard_coded/string_switch.m: tests/hard_coded/string_switch2.m: tests/hard_coded/string_switch3.m: Update the top-of-module comment block to be identical in all four copies of this module.	2024-03-26 21:17:31 +11:00
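The "stick" optimization in 9dbee8bdb4 replaces a chain of single code unit tests with one bounded comparison starting at an offset. Below is a minimal C sketch of that idea. The function name `offset_str_eq_n` is hypothetical and is not the macro added to runtime/mercury_string.h; the sketch also assumes byte-sized code units and that the caller guarantees `offset <= strlen(s)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: test whether the n code units of s starting at
 * offset equal the first n code units of stick. Like strncmp, the
 * comparison stops early if both strings end at the same position.
 * Precondition (assumed, not checked): offset <= strlen(s).
 */
bool offset_str_eq_n(const char *s, size_t offset,
                     const char *stick, size_t n)
{
    assert(s != NULL && stick != NULL);
    return strncmp(s + offset, stick, n) == 0;
}

/* The same test done one code unit at a time, as a trie walk would. */
bool stick_matches_one_by_one(const char *s, size_t offset,
                              const char *stick, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (s[offset + i] != stick[i]) {
            return false;
        }
        if (s[offset + i] == '\0') {
            return true;    /* both strings ended together */
        }
    }
    return true;
}
```

For a switch whose arms are "interface" and "implementation", the trie branches on the second code unit; once 'n' has been seen, the rest of "interface" is a stick. Passing n = 8 for the stick "terface" includes the terminating NUL in the comparison, so a longer input such as "interfaces" is rejected in the same single test.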
Zoltan Somogyi	d5190e93c5	Fix LLDS/MLDS diffs in control of string switches. Some of these diffs involve string lookup switches. compiler/ml_lookup_switch.m: When testing whether a switch is a lookup switch for the MLDS, we sometimes need to update the code generator's state. We used to return the updated state whether or not the switch is a lookup switch, which is incorrect if the switch is NOT a lookup switch. (The incorrectness used to show up as allocated but unused entities, such as slots in the global const data table, which are harmless enough not to lead to crashes.) Fix this by putting the updated code generator state as a new argument into the function symbol that we return only when the switch is a lookup switch. compiler/lookup_switch.m: The predicate we used to test whether a switch is a lookup switch for the LLDS used to be semidet, so it did not have the above problem. It did have the problem that calls to it returned only the info appropriate for success; they did not return an indication about whether they succeeded in a way that could be stored. This meant every method of implementing string switches had to repeat the call. Fix this by changing the predicate to return a success/failure indication, making it det. Make it use the new technique in ml_lookup_switch.m to avoid using inappropriate code generator states. In this case, that also means moving the code that remembers the branch start position of the code generator state here from our caller. This move allows us to delete a reset to the just-remembered position, which was never needed. compiler/ml_string_switch.m: compiler/string_switch.m: Conform to the changes in ml_lookup_switch.m/lookup_switch.m. compiler/ml_switch_gen.m: Move producers of variables to just the code branches that need the value of that variable. Conform to the changes in ml_lookup_switch.m.
compiler/switch_gen.m: Move the code that decides how to implement a smart switch on a string value to a predicate of its own, to match ml_switch_gen.m. Change the structure of the moved code to follow the structure in ml_switch_gen.m. Conform to the changes in ml_lookup_switch.m.	2024-03-22 20:18:46 +11:00
Zoltan Somogyi	d385cdca37	Replace reversed lists with cords. Add ml_ prefixes to some predicate names. Put a piece of code into a predicate of its own, to prevent it from distracting readers of the original predicate with its not-very-important detail.	2024-03-22 04:10:15 +11:00
Zoltan Somogyi	0ed6e2d0d4	Prepare for string trie switches in the LLDS. compiler/ml_string_switch.m: compiler/string_switch_util.m: Move the backend-agnostic part of the existing MLDS implementation of string trie switches from ml_string_switch.m to string_switch_util.m. Clean it up a bit for more general use. compiler/string_encoding.m: Document the exported predicates and functions.	2024-03-20 02:16:16 +11:00
Zoltan Somogyi	a6d81a3bb9	Carve three new modules out of switch_util.m. compiler/lookup_switch_util.m: compiler/string_switch_util.m: compiler/tag_switch_util.m: Carve these three new modules out of switch_util.m. As their names imply, they contain the parts of the old switch_util.m that are concerned with lookup switches, switches on strings, and switches on tags respectively. compiler/switch_util.m: Delete the code moved to the new modules. compiler/backend_libs.m: Include the new modules in the backend_libs package. compiler/notes/compiler_design.html: Document the new modules. compiler/dense_switch.m: compiler/lookup_switch.m: compiler/ml_lookup_switch.m: compiler/ml_simplify_switch.m: compiler/ml_string_switch.m: compiler/ml_switch_gen.m: compiler/ml_tag_switch.m: compiler/simplify_goal_switch.m: compiler/string_switch.m: compiler/switch_case.m: compiler/switch_gen.m: compiler/tag_switch.m: Conform to the changes above by importing one, or sometimes two, of the new modules, usually instead of switch_util.m, sometimes in addition to switch_util.m. In a few cases, delete explicit module qualifications that this diff has made incorrect.	2024-03-12 22:13:59 +11:00
Julien Fischer	f765494ec9	Fix spelling. compiler/ml_string_switch.m: As above.	2023-04-05 00:38:24 +10:00
Zoltan Somogyi	c138bbb632	Fix a bug in string trie jump switches. compiler/ml_string_switch.m: Fix a too-strong sanity check. It insisted on a semidet switch containing code to handle the failure of the switch, but a switch on strings can be both semidet and cannot_fail if - the switched-on variable's inst is known to contain only the strings handled by the arms of the switch, and - one or more of the switch arms contain semidet code. In that case, the switch does not need a default case, since it would be unreachable. compiler/options.m: Provide a way to test for the presence of this fix.	2022-12-08 19:15:56 +11:00
Zoltan Somogyi	72e0014003	Rename more predicates to avoid ambiguities.	2022-07-07 06:24:09 +10:00
Zoltan Somogyi	9012395ec2	Don't let ml_tag_switch.m generate duplicate fields. This fixes the second problem identified by Mantis bug #548. compiler/ml_tag_switch.m: Detect the circumstances in which this problem would arise. In such cases, simply fail, and let ml_switch_gen.m fall back to implementing the switch as an if-then-else chain. compiler/ml_switch_gen.m: Implement that fallback. compiler/switch_util.m: The new code in ml_tag_switch.m needs to thread a fourth piece of state through the predicate it passes to group_cases_by_ptag, so change its argument list to accommodate such predicates. And since some other modules pass the same predicates to group_cases_by_ptag and string_binary_cases, make the same change in the argument list of that predicate as well. Delete one stray comment, and note that another comment seems misplaced. compiler/ml_string_switch.m: compiler/string_switch.m: compiler/switch_case.m: compiler/tag_switch.m: Conform to the changes in switch_util.m. tests/hard_coded/bug548.exp: tests/hard_coded/Mmakefile: Enable the previously-added test case for Mantis #548, after adding an .exp file for it.	2022-02-11 21:32:53 +11:00
Zoltan Somogyi	9ddb180757	Handle const_var_maps left by add_trail_ops.m. This fixes Mantis bug #544. The code of add_trail_ops.m can transform <code that adds an entry to const_var_map> into ( ... <code that adds an entry to const_var_map> ... ; ..., fail ) where the const_var_map in the MLDS code generator records which variables' values are available as ground terms. The MLDS code generator used to reset the const_var_map in its main data structure, the ml_gen_info, at the end of every disjunction (actually, at the end of every branched control structure) to the value it had at the start. This was intended to prevent the code following the branched structure from relying on const_var_map entries that were added to the const_var_map on some branches, but not others. However, in this case, it has the effect of forgetting the entry added by the first disjunct, even though - the code after the disjunction can be reached only via the first disjunct, and - the code after the disjunction (legitimately, until add_trail_ops) depended on that entry being available. The fix is to allow the code after a branched control structure to depend on any const_var_map entry that is present in the final const_var_map in every branch of the branched control structure whose end is reachable. The LLDS code generator was not affected by the bug, because it uses totally separate systems both for implementing trailing, and for keeping track of what variables' values are available statically. In particular, it does not rely on operations inserted and the annotations left on unifications by the add_trail_ops and mark_static_term passes, having been written long before either module existed. compiler/hlds_goal.m: Document the update above to what may be marked static. compiler/ml_gen_info.m: Document the updated protocol for handling the const_var_map field. Use a named type instead of its expansion. 
compiler/ml_code_gen.m: Make the predicates that generate code for a branch in a branched control structure return the final const_var_maps from the branches whose endpoints are reachable. Add a predicate that computes the consensus of all the gathered const_var_maps. Compute consensus const_var_maps for if-then-elses and negations. Fix some inconsistencies in variable naming. Simplify some code. compiler/ml_disj_gen.m: Compute consensus const_var_maps for disjunctions. compiler/ml_string_switch.m: compiler/ml_switch_gen.m: compiler/ml_tag_switch.m: Compute consensus const_var_maps for various kinds of switches. In some predicates, put related arguments next to each other. compiler/ml_unify_gen_construct.m: Delete "dynamic" from the names of several predicates that also handled non-dynamic construction unifications. Fix an out-of-date comment. compiler/mark_static_terms.m: Fix grammar in a comment. library/map.m: Fix a careless bug: when doing a merge in map.common_subset_loop, we threw away an entry from the wrong list in one of three cases. Make such bugs harder to overlook by - deleting the common parts from variable names, leaving the differences easier to see, and - replacing numeric suffixes for completely separate data structures with A and B suffixes. tests/valid/bug544.m: A new test case for the bug. tests/valid/Mercury.options: tests/valid/Mmakefile: Enable the new test case, and run it with -O5.	2022-02-07 17:30:32 +11:00
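The consensus computation in 9ddb180757 keeps only the entries that every reachable branch agrees on, and the library fix concerns the three-way merge that such an intersection needs. The C sketch below is an illustration only, not the code of map.common_subset_loop: the `entry` type and the function `common_subset` are hypothetical, maps are modelled as arrays sorted by strictly ascending key, and an entry is assumed to count as common only when both key and value match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical map entry: a variable number and the id of its constant. */
typedef struct {
    int var;
    int value;
} entry;

/*
 * Intersect two maps given as arrays sorted by strictly ascending var.
 * Writes the common entries to out, which must have room for
 * min(na, nb) entries, and returns how many were written.
 */
size_t common_subset(const entry *a, size_t na,
                     const entry *b, size_t nb, entry *out)
{
    size_t i = 0, j = 0, n = 0;

    assert(out != NULL);
    while (i < na && j < nb) {
        if (a[i].var < b[j].var) {
            i++;                /* key only in A: discard from A */
        } else if (a[i].var > b[j].var) {
            j++;                /* key only in B: discard from B */
        } else {
            if (a[i].value == b[j].value) {
                out[n++] = a[i];
            }
            i++;                /* key in both: advance both lists */
            j++;
        }
    }
    return n;
}
```

Swapping the `i++` and `j++` in the first two cases is the kind of mistake the commit describes: the merge then discards an entry from the wrong list and can miss a later match. Folding this function over the final maps of all reachable branches yields a consensus map in the sense used above.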
Peter Wang	95f59cf7c9	Fix lookup switches on subtype enums. compiler/switch_util.m: Rename dont_need_bit_vec_check variant of need_bit_vec_check to dont_need_bit_vec_check_no_gaps. Add dont_need_bit_vec_check_with_gaps (see below). Make type_range return the correct min and max values used by a subtype enum type. For now, it fails unless the range of values is contiguous. Make find_int_lookup_switch_params use the min and max values for a type returned by type_range, not assuming 0 to the max value. Make find_int_lookup_switch_params return dont_need_bit_vec_check_with_gaps when a bit vector check is not required before a table lookup, yet the table is expected to contain dummy rows. This is the case for a cannot_fail switch on a subtype enum type, where the subtype does not use some values between the min and max values. compiler/dense_switch.m: Make tagged_case_list_is_dense_switch use the min and max values for a type returned by type_range, not assuming 0 to the max value. compiler/ml_lookup_switch.m: Expect the generated lookup table to contain dummy rows or not depending on dont_need_bit_vec_check_{with_gaps,no_gaps}. Conform to change to need_bit_vec_check. compiler/lookup_switch.m: compiler/ml_string_switch.m: Conform to change to need_bit_vec_check. tests/hard_coded/Mmakefile: tests/hard_coded/dense_lookup_switch4.exp: tests/hard_coded/dense_lookup_switch4.m: tests/hard_coded/dense_lookup_switch_non2.exp: tests/hard_coded/dense_lookup_switch_non2.m: Add test cases.	2021-04-09 17:41:23 +10:00
Zoltan Somogyi	b66f45e4db	Tighten the mlds_type type. compiler/mlds.m: Make two changes to mlds_type. The simpler change is the deletion of the maybe(foreign_type_assertions) field from the MLDS representations of Mercury types. It was never used, because Mercury types that are defined in a foreign language that is acceptable for the current MLDS target platform are represented as mlds_foreign_type, not as mercury_type. The more involved change is to change the representation of builtin types. Until now, we had separate function symbols in mlds_type to represent ints, uints, floats and chars, but not strings or values of the sized types {int,uint}{8,16,32,64}; those had to be represented as Mercury types. This is an unnecessary inconsistency. It also had two allowed representations for ints, uints, floats and chars, which meant that some of the code handling those conceptual types had to be duplicated to handle both representations. This diff provides mlds_builtin_type_{int(_),float,string,char} function symbols to represent every builtin type, and changes mercury_type to mercury_nb_type to make clear that it is NOT to be used for builtins (the nb is short for "not builtin"). compiler/ml_code_util.m: compiler/ml_util.m: Delete functions that used to construct MLDS representations of builtin types. The new representation of those types is so simple that using such functions is no less cumbersome than writing down the representations directly. 
compiler/ml_accurate_gc.m: compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_disj_gen.m: compiler/ml_foreign_proc_gen.m: compiler/ml_global_data.m: compiler/ml_lookup_switch.m: compiler/ml_proc_gen.m: compiler/ml_rename_classes.m: compiler/ml_simplify_switch.m: compiler/ml_string_switch.m: compiler/ml_switch_gen.m: compiler/ml_tag_switch.m: compiler/ml_type_gen.m: compiler/ml_unify_gen_construct.m: compiler/ml_unify_gen_deconstruct.m: compiler/ml_unify_gen_util.m: compiler/mlds_dump.m: compiler/mlds_to_c_data.m: compiler/mlds_to_c_export.m: compiler/mlds_to_c_func.m: compiler/mlds_to_c_global.m: compiler/mlds_to_c_stmt.m: compiler/mlds_to_c_type.m: compiler/mlds_to_cs_data.m: compiler/mlds_to_cs_stmt.m: compiler/mlds_to_cs_type.m: compiler/mlds_to_java_data.m: compiler/mlds_to_java_stmt.m: compiler/mlds_to_java_type.m: compiler/mlds_to_java_wrap.m: compiler/rtti_to_mlds.m: Conform to the changes above.	2018-09-28 23:07:23 +10:00
Zoltan Somogyi	6a915eef05	Optimize field updates inside packed arg words. Since June, we have been copying words containing packed-together sub-word-sized arguments all in one piece if possible, for hlc grades. This means that given a type such as :- type t ---> f1(int8, bool, int8, int, bool, int8, bool). whose first three and last three arguments are packed into one word each, and a predicate such as p(T0, T) :- T0 = f1(A, B, C, _, E, F, G), D = 42, T = f1(A, B, C, D, E, F, G). we generated code such as MR_Integer D_12 = (MR_Integer) 42; MR_Unsigned packed_args_0 = (MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 0))); MR_Unsigned packed_args_1 = (MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 2))); base = (MR_Word) MR_new_object(MR_Word, ((MR_Integer) 3 * sizeof(MR_Word)), NULL, NULL); T_4 = base; MR_hl_field(MR_mktag(0), base, 0) = (MR_Box) (packed_args_0); MR_hl_field(MR_mktag(0), base, 1) = ((MR_Box) (D_12)); MR_hl_field(MR_mktag(0), base, 2) = (MR_Box) (packed_args_1); which does NOT pick up the values A, B, C, E, F and G individually. However, until now, we could reuse packed-together words only in their unchanged form. This diff lifts that limitation, which means that now, we can also optimize code such as p(T0, T) :- T0 = f1(A, B, _, D, E, _, G), C = 42i8, F = 43i8, T = f1(A, B, C, D, E, F, G). by generating code like this: base = (MR_Word) MR_new_object(MR_Word, (3 * sizeof(MR_Word)), NULL, NULL); T_4 = base; MR_hl_field(MR_mktag(0), base, 0) = (MR_Box) ((((packed_word_0 & (~((MR_Unsigned) 255U)))) | (MR_Unsigned) ((uint8_t) (C_12)))); MR_hl_field(MR_mktag(0), base, 1) = ((MR_Box) (D_8)); MR_hl_field(MR_mktag(0), base, 2) = (MR_Box) ((((packed_word_1 & (~((MR_Unsigned) 510U)))) | (((MR_Unsigned) ((uint8_t) (F_13)) << 1)))); The general scheme when reusing part of a word is: first set the bits not being reused to zero, and then OR in new values of those bits.
Make this optimization as general as possible by making it work not just for - words in memory cells containing only arguments, but also for - words in memory cells containing a remote sectag as well as arguments, and - words in registers containing a ptag, a local sectag as well as arguments. compiler/ml_gen_info.m: Generalize the data structure we use to represent information about packed words to make possible approximate as well as exact lookups. The key in the old map was "these bitfields with the values of these variables in them", while the key in the new map is just "these bitfields", with the associated value being a list, each element of which says "the word with these values in those bitfields is available in this rval". This makes it possible to look for matching words that have some, but not all, of the right values in the bitfields. Since the packed words may now contain tags as well as arguments, rename "packed args" to "packed word". compiler/ml_unify_gen_deconstruct.m: When deconstructing a term containing packed words, add them to the packed word map even when one of the bitfields inside the packed word contains tag information. Move the code that adds a packed word to the map into a separate predicate, now that it is needed from more than one place. compiler/ml_unify_gen_construct.m: Change the code that handles packed words to work in terms of filled bitfields. Use this not only to implement the optimization described at the top, but also to make the handling of bitfields more systematic. At least one previous bug was caused by doing sign extension differently for the bitfield containing the first packed argument in a word than for the later packed arguments in that word; with the new design, such inconsistencies should not happen. compiler/ml_unify_gen_util.m: Add utility predicates now needed for both construct and deconstruct unifications. compiler/mlds.m: Document the new use of lvnc_packed_word (renamed from lvnc_packed_args).
compiler/ml_code_gen.m: compiler/ml_code_util.m: compiler/ml_commit_gen.m: compiler/ml_disj_gen.m: compiler/ml_string_switch.m: compiler/ml_switch_gen.m: compiler/ml_tag_switch.m: Conform to the changes above (mostly the packed_word rename). compiler/mlds_to_c_data.m: compiler/mlds_to_c_stmt.m: Omit unneeded casts from the output. Specifically, don't put (MR_Integer) casts in front of integer constants being used either as shift amounts, or as the number of words that a new_object MLDS operation should allocate. The casts only cluttered the output, making it harder to read, and therefore to judge its correctness.	2018-09-10 16:17:17 +10:00
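The mask-then-OR scheme from 6a915eef05 can be written as a small standalone C function. This is a sketch only: `replace_bitfield` is a hypothetical name, and it works on `uintptr_t` rather than on the runtime's MR_Unsigned. It clears the bits of one field inside a word and ORs in the new value, leaving every other bit of the word as it was.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return word with the bitfield of the given
 * width at the given shift replaced by value. Bits outside the field
 * are copied from word unchanged. Bits of value that do not fit in
 * the field are ignored.
 */
uintptr_t replace_bitfield(uintptr_t word, unsigned shift,
                           unsigned width, uintptr_t value)
{
    assert(width > 0 && width < 8 * sizeof(uintptr_t));
    assert(shift + width <= 8 * sizeof(uintptr_t));

    uintptr_t mask = ((((uintptr_t) 1) << width) - 1) << shift;
    return (word & ~mask) | ((value << shift) & mask);
}
```

For an 8-bit field at shift 1 the mask is 255 << 1 = 510, which matches the constant 510U in the generated code quoted in the commit message; for an 8-bit field at shift 0 the mask is 255, matching 255U.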
Zoltan Somogyi	b06b2621b3	Move towards packing args with secondary tags. compiler/hlds_data.m: Add bespoke types to record information about local and remote secondary tags. The one for local secondary tags includes the value of the primary and secondary tag together, since construct unifications need to assign this value, and it is better to compute this once, instead of leaving the target language compiler to do it, potentially many times. Use a wrapped uint8 to record primary tag values, and wrapped uints to record secondary tag values. The wrap is to prevent any accidental confusion with other values. The use of uint8 and uint has two purposes. First, using the tightest possible representation. Tags are never negative, and primary tags cannot exceed 7. Second, using these types in the compiler helps us eat our own dogfood; if a change causes a problem affecting these types, its bootcheck should fail, alerting us to the problem. Add commented-out types and fields that will be needed for packing sub-word-sized arguments together with both local and remote secondary tags. compiler/du_type_layout.m: Generate references to tags in the new format. compiler/ml_unify_gen.m: compiler/unify_gen.m: compiler/modecheck_goal.m: Conform to the changes above. Fix an old bug: the inst corresponding to a constant with a primary and a local secondary tag is not the secondary tag alone, but both tags together.
compiler/bytecode.m: compiler/bytecode_gen.m: compiler/closure_gen.m: compiler/disj_gen.m: compiler/export.m: compiler/hlds_code_util.m: compiler/jumpopt.m: compiler/lco.m: compiler/llds_out_data.m: compiler/llds_out_instr.m: compiler/lookup_switch.m: compiler/lookup_util.m: compiler/ml_accurate_gc.m: compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_code_util.m: compiler/ml_elim_nested.m: compiler/ml_string_switch.m: compiler/ml_switch_gen.m: compiler/ml_tag_switch.m: compiler/ml_type_gen.m: compiler/mlds_dump.m: compiler/mlds_to_c_data.m: compiler/mlds_to_c_stmt.m: compiler/opt_debug.m: compiler/peephole.m: compiler/rtti.m: compiler/rtti_out.m: compiler/rtti_to_mlds.m: compiler/string_switch.m: compiler/switch_util.m: compiler/tag_switch.m: compiler/type_ctor_info.m: Conform to the change to hlds_data.m. In two places, in rtti_out.m and rtti_to_mlds.m, delete old code that was needed only to implement reserved tags, which we stopped supporting a few months ago. library/uint8.m: library/uint16.m: library/uint32.m: library/uint64.m: Add predicates to cast from each of these types to uint.	2018-06-06 03:35:20 +02:00
Zoltan Somogyi	ec6a40ed85	Put related args of ml_field next to each other. compiler/mlds.m: Put the type of the pointer next to the value of the pointer. compiler/ml_accurate_gc.m: compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_code_util.m: compiler/ml_elim_nested.m: compiler/ml_optimize.m: compiler/ml_rename_classes.m: compiler/ml_string_switch.m: compiler/ml_type_gen.m: compiler/ml_unify_gen.m: compiler/ml_unused_assign.m: compiler/ml_util.m: compiler/mlds_dump.m: compiler/mlds_to_c_data.m: compiler/mlds_to_cs.m: compiler/mlds_to_java.m: compiler/mlds_to_target_util.m: compiler/rtti_to_mlds.m: Conform to the change above.	2018-06-04 23:28:19 +02:00
Zoltan Somogyi	bbe0f28f3b	Copy packed arguments all at once. Copy words containing packed-together sub-word-sized arguments all in one piece if possible, for hlc grades. Given a type such as :- type t ---> f1(int8, bool, int8, int, bool, int8, bool). whose first three and last three arguments are packed into one word each, and a predicate such as p(T0, T) :- T0 = f1(A, B, C, _, E, F, G), D = 42, T = f1(A, B, C, D, E, F, G). we used to generate code that picked up each of the six named arguments from T0, and used them to construct T. With this diff, we now translate the above to MR_Integer D_12 = (MR_Integer) 42; MR_Unsigned packed_args_0 = (MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 0))); MR_Unsigned packed_args_1 = (MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 2))); base = (MR_Word) MR_new_object(MR_Word, ((MR_Integer) 3 * sizeof(MR_Word)), NULL, NULL); T_4 = base; MR_hl_field(MR_mktag(0), base, 0) = (MR_Box) (packed_args_0); MR_hl_field(MR_mktag(0), base, 1) = ((MR_Box) (D_12)); MR_hl_field(MR_mktag(0), base, 2) = (MR_Box) (packed_args_1); compiler/ml_unify_gen.m: Implement the two main parts of this optimization. Part one is the change to deconstruction unifications. When we generate assignments from all the fields packed together into a word to their corresponding argument variables (such as A/B/C or E/F/G above), create a fresh variable (such as packed_args_0 above), assign to it the value of the whole word, and record in a new data structure (the packed_args_map) that these argument variables, in these positions within the word, are now available in the newly created variable. (We still define the argument variables as well, since they may be needed; deleting them if they are not needed is the job of ml_unused_assign.m.) Part two is the change to construction unifications.
When we generate code to OR together the shifted and/or masked values of two or more variables to fill in one word in a new heap cell, we search the packed_args_map to see whether those variables, in the positions we need, are available in one of the variables created in part one. If yes, we discard the whole OR-ing together operation and we use that variable instead. Since part one can now create local variable definitions, return these upwards as needed. compiler/ml_gen_info.m: Add two fields to the ml_gen_info structure (actually, to one of its substructures). One is the packed_args_map described above, the other is a counter we use to give a unique name to all the fresh variables. When creating ml_gen_infos, put the code defining each field of a substructure next to the creation of that substructure. compiler/mlds.m: Add a kind of compiler-generated variable holding packed argument words. It is used in part one above. compiler/ml_code_gen.m: compiler/ml_code_util.m: compiler/ml_commit_gen.m: compiler/ml_disj_gen.m: compiler/ml_string_switch.m: compiler/ml_switch_gen.m: compiler/ml_tag_switch.m: Save, reset and restore the packed_args_map as necessary to ensure that a construction unification sees an entry in that map only if the deconstruction unification that created that entry had to be executed before execution reaches the construction unification. This means that when we process a branched control structure, we have to make sure that (a) entries created by one branch are not seen when we generate code for the other branches, and (b) that code after the branched control structure sees only the entries created before the branched control structure, since such code following cannot use an entry that was created by a branch that may or may NOT have been executed on the way there. We also reset the packed_args_map to empty when generating code that will end up inside a nested function, for two reasons. 
First, I am not sure whether the code in ml_elim_nested.m that flattens out nested functions is general enough to handle the new kind of compiler-generated variable correctly. And second, even if it is, the additional memory traffic for putting those variables into environments, and later pulling them out again, would definitely reduce and maybe completely eliminate the speedup from optimizing constructions. compiler/ml_closure_gen.m: Conform to the change in ml_unify_gen.m. compiler/ml_proc_gen.m: Invoke ml_unused_assign.m in both branches of an if-then-else. Previously, it was invoked in only the rarely executed branch, which is what hid its bugs. Fix one bug: for model_semi procedures, include the succeeded variable in the set of variables whose values are needed after the generated function body. Work around another bug: ml_unused_assign.m cannot yet handle nested functions properly, so throw away its output in their presence. compiler/ml_unused_assign.m: As part of the same workaround, if a block contains nested functions, tell ml_proc_gen.m to use the original code. Fix several other bugs. Don't delete variables from the seen_set when the backwards traversal finds an assignment to them, because the variable's absence from the seen_set would lead to the declaration of the variable being deleted. Delete a sanity check that made sense only in the presence of such deletions. Never delete assignments to compiler-generated variables; we generate such assignments only when their results will be needed. When exiting the traversal of a block, do delete the variables declared locally in that block from the seen_set; being undeclared there, they cannot possibly be seen before that block. Leaving them in does not compromise correctness, but does reduce performance by making operations on the seen_set slower than necessary. If deleting unused assignments makes the else part of an if-then-else empty, then delete the whole else part. 
compiler/mlds_to_c_stmt.m: Generate a valid C statement even for an MLDS comment. When a buggy version of ml_unused_assign.m (incorrectly) deleted assignments to succeeded, it sometimes left an else part containing only a comment, which led gcc to report syntax errors.	2018-06-02 18:56:40 +02:00
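The difference between the old field-by-field copy and the new whole-word copy described in the commit above can be illustrated with a small standalone C sketch. Everything in it is hypothetical: the bit layout (HI_SHIFT, FLAG_SHIFT, LO_SHIFT), the cell struct and the function names are invented for illustration, both packed words share one layout for brevity, and none of it is the Mercury runtime's MR_hl_field machinery.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout of one packed word: an 8-bit field, a 1-bit
   flag and another 8-bit field. The layout that the Mercury compiler
   actually chooses may differ. */
#define HI_SHIFT   9
#define FLAG_SHIFT 8
#define LO_SHIFT   0

/* A three-word heap cell: packed word, full-word integer, packed word. */
typedef struct {
    uintptr_t word[3];
} cell;

/* Pack three sub-word values into one word. */
uintptr_t pack3(uint8_t hi, unsigned flag, uint8_t lo)
{
    assert(flag <= 1u);
    return ((uintptr_t) hi << HI_SHIFT)
        | ((uintptr_t) flag << FLAG_SHIFT)
        | ((uintptr_t) lo << LO_SHIFT);
}

/* Old scheme: extract every field, then shift and OR them back together. */
uintptr_t repack_fieldwise(uintptr_t w)
{
    uint8_t hi = (uint8_t) ((w >> HI_SHIFT) & 0xffu);
    unsigned flag = (unsigned) ((w >> FLAG_SHIFT) & 0x1u);
    uint8_t lo = (uint8_t) ((w >> LO_SHIFT) & 0xffu);
    return pack3(hi, flag, lo);
}

/* Build the updated cell the old way. */
cell update_fieldwise(const cell *t0, uintptr_t d)
{
    cell t;
    t.word[0] = repack_fieldwise(t0->word[0]);
    t.word[1] = d;
    t.word[2] = repack_fieldwise(t0->word[2]);
    return t;
}

/* New scheme: copy each packed word in one piece, with no shifts or masks. */
cell update_wordwise(const cell *t0, uintptr_t d)
{
    cell t;
    t.word[0] = t0->word[0];
    t.word[1] = d;
    t.word[2] = t0->word[2];
    return t;
}
```

Both update functions produce the same cell whenever the unused bits of the packed words are zero; the word-wise version simply skips the extract-and-repack work.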
Zoltan Somogyi	b9afc8b78e	Delete the mlds_unary_op type. compiler/mlds.m: We used to have a function symbol ml_unop in the mlds_rval type that applied one of four kinds of operations to an argument mlds_rval: boxing, unboxing, casting or a standard unary operation, with a value of type mlds_unary_op selecting between the four. Replace this system with four separate function symbols in the mlds_rval type directly, and delete the mlds_unary_op type. The new arrangement requires fewer memory cells to be allocated, and less indirection; it also leads to shorter and somewhat more readable code. compiler/ml_optimize.m: Conform to the change above. Recognize that a cast has negligible cost. compiler/ml_code_util.m: Conform to the change above. Keep private a predicate that is not used by any other module, after merging it with another previously-exported predicate that only it uses. Delete some other predicates that are not used anywhere. compiler/ml_accurate_gc.m: compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_code_gen.m: compiler/ml_disj_gen.m: compiler/ml_elim_nested.m: compiler/ml_foreign_proc_gen.m: compiler/ml_global_data.m: compiler/ml_lookup_switch.m: compiler/ml_rename_classes.m: compiler/ml_string_switch.m: compiler/ml_tag_switch.m: compiler/ml_unify_gen.m: compiler/ml_unused_assign.m: compiler/ml_util.m: compiler/mlds_to_c.m: compiler/mlds_to_cs.m: compiler/mlds_to_java.m: compiler/mlds_to_target_util.m: compiler/rtti_to_mlds.m: Conform to the change above.	2018-05-13 12:23:38 +02:00
Zoltan Somogyi	fcefbb948d	Delete assignments to dead variables in the MLDS. At the moment, we tend not to generate such assignments, with the exception of assignments to the MLDS versions of HLDS variables of dummy types. The reason I am nevertheless adding this optimization is that I intend to soon add code to ml_unify_gen.m that will generate assignments to dead variables. The idea is to optimize field updates involving packed arguments. Given a type such as :- type t ---> f( f1 :: bool, f2 :: bool, f3 :: enum1, f4 :: int ). we currently implement a field update such as "T = T0 ^ f4 := 42", whose HLDS representation is the two unifications T0 = f(T0f1, T0f2, T0f3, _), T = f(T0f1, T0f2, T0f3, 42) using code that looks like this: T0f1 = (T0[0] >> ...) & ... T0f2 = (T0[0] >> ...) & ... T0f3 = (T0[0] >> ...) & ... T = allocate memory for new memory cell, put on primary tag T[0] = (T0f1 << ...) | (T0f2 << ...) | (T0f3 << ...) T[1] = 42 I want to implement it using code that looks like this: T0w0 = T0[0] T = allocate memory for new memory cell, put on primary tag T[0] = T0w0 T[1] = 42 where T0w0 contains the entire first word of the memory cell of T0. This code avoids a bunch of shifts, ORs and ANDs. I propose to translate the T0 = f(T0f1, T0f2, T0f3, _) unification into T0w0 = T0[0] T0f1 = (T0[0] >> ...) & ... T0f2 = (T0[0] >> ...) & ... T0f3 = (T0[0] >> ...) & ... while recording in the ml_gen_info/code_info that this specific packing of T0f1, T0f2 and T0f3 is available in T0w0. When translating the following unification, the code generator will see this, and this will allow it to generate T[0] = T0w0 instead of T[0] = (T0f1 << ...) | (T0f2 << ...) | (T0f3 << ...) However, by this time the assignments to T0f1, T0f2 and T0f3 have already been generated. Whether or not they are dead assignments depends on whether other code needs the values of those fields of T0. 
Deciding this requires knowledge that the code generator can't have when translating the deconstruction of T0. Hence the need for a new MLDS-to-MLDS optimization. compiler/ml_unused_assign.m: A new compiler module implementing the new optimization. It is not part of ml_optimize.m because ml_optimize.m traverses the MLDS forwards, while this optimization requires a backwards traversal: you cannot know whether an assignment is dead unless you know that the following code does not need the value of the variable it assigns to. compiler/ml_backend.m: compiler/notes/compiler_design.html: Include the new module. compiler/mlds.m: The new optimization needs extra information about loops. When it enters into the loop body, it knows which variables are needed after the loop, but it does not know which variables the loop body first reads and then writes. Without this knowledge, it would optimize away assignments to loop control variables, such as the increment of i in the loop i = 0; while (...) { ...; i = i+1; ... } Traditionally, compilers have solved this problem by doing fixpoint iteration, adding to the live set at each program point until no more additions are possible. We can do better, because we generate loops in the MLDS in only two kinds of cases: - loops implementing tail recursion, in which case the only extra variables that we need to preserve assignments to in the loop body are the input arguments of the procedure, and - loops created by the compiler itself to loop over a set of alternatives, for which the only extra variables that we need to preserve assignments to in the loop body are the variables the compiler uses to control the loop. To make it possible for ml_unused_assign.m to do its job without a fixpoint iteration, include in the MLDS representation of every while loop a list of these variables. Add a type to represent the identity of an MLDS local var, for use by some of the modules below. 
They used to store this info in the form of mlds_lvals, but that is not specific enough to be used to fill in the new field in while loops. compiler/ml_proc_gen.m: Compute the information needed by the new pass, and invoke it if the relevant option is set. compiler/options.m: Add this option. It is for developers only, so it is undocumented. compiler/ml_util.m: Add a utility function needed in several places. compiler/ml_accurate_gc.m: compiler/ml_disj_gen.m: compiler/ml_elim_nested.m: compiler/ml_lookup_switch.m: compiler/ml_optimize.m: compiler/ml_rename_classes.m: compiler/ml_string_switch.m: compiler/mlds_to_c.m: compiler/mlds_to_cs.m: compiler/mlds_to_java.m: compiler/mlds_to_target_util.m: Conform to the changes in mlds.m.	2018-05-09 23:56:28 +02:00
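The reason the optimization above needs a backwards traversal can be seen in a minimal C sketch over straight-line code. This is the textbook liveness-based formulation, not the algorithm in ml_unused_assign.m (whose seen_set handling, loops and nested blocks are not modelled here); the assign struct, NO_VAR, MAX_VARS and mark_dead_assigns are all names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VARS 32     /* variables are numbered 0 .. MAX_VARS - 1 */
#define NO_VAR   (-1)   /* marks an unused source operand */

/* One assignment "dst = f(src[0], src[1])" in straight-line code. */
typedef struct {
    int dst;
    int src[2];
    bool dead;   /* set by mark_dead_assigns */
} assign;

/* Return the bit that represents a variable in the live set. */
static uint32_t var_bit(int var)
{
    assert(var >= 0 && var < MAX_VARS);
    return (uint32_t) 1 << var;
}

/* Walk the code from the last assignment to the first. An assignment is
   dead if its target is not needed by anything after it. live_after is
   the set of variables needed once the whole sequence has executed.
   Returns the number of assignments marked dead. */
size_t mark_dead_assigns(assign *code, size_t n, uint32_t live_after)
{
    uint32_t live = live_after;
    size_t num_dead = 0;

    for (size_t i = n; i-- > 0; ) {
        assign *a = &code[i];
        if ((live & var_bit(a->dst)) == 0) {
            /* Nothing later reads dst, so its operands are not needed. */
            a->dead = true;
            num_dead++;
            continue;
        }
        a->dead = false;
        /* Earlier code need not supply dst, but must supply the operands. */
        live &= ~var_bit(a->dst);
        for (int k = 0; k < 2; k++) {
            if (a->src[k] != NO_VAR) {
                live |= var_bit(a->src[k]);
            }
        }
    }
    return num_dead;
}
```

Applied to the sequence T0w0 = T0[0]; T0f1 = ...; T0f2 = ...; T0f3 = ...; T[0] = T0w0 with only T needed afterwards, the three field extractions are marked dead while the whole-word copy survives, which is the effect the commit message describes.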
Zoltan Somogyi	dc4196e5af	Separate breaks from loops and breaks from switches in MLDS. compiler/mlds.m: Replace goto_break with goto_break_loop and goto_break_switch, each intended to break from a particular construct. It was confusion between the two kinds of breaks that led to the earlier bug that broke --prefer-while-loop-over-jump-mutual; this separation should make such bugs easy to detect. Rename goto_continue as goto_continue_loop to match the new naming scheme. compiler/mlds_to_c.m: compiler/mlds_to_cs.m: compiler/mlds_to_java.m: When emitting goto_break_loop and goto_break_switch, check whether the nearest enclosing break-able scope is a loop or switch respectively. To make this check possible, record the nearest break-able scope. While these additions make the compiler do extra work, the performance impact is negligible. compiler/mlds_to_target_util.m: Add the type that mlds_to_{c,cs,java}.m all use to identify break-able scopes. compiler/ml_call_gen.m: compiler/ml_proc_gen.m: compiler/ml_string_switch.m: Update the code that generates gotos.	2017-11-11 12:35:54 +11:00
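The confusion that the commit above guards against comes from C using one keyword for both constructs: a break inside a switch that is itself inside a loop leaves only the switch. The following C sketch shows that behaviour together with the kind of check the commit describes. The enum and function names are hypothetical and do not reproduce the types in mlds_to_target_util.m.

```c
#include <assert.h>
#include <stdbool.h>

/* The nearest enclosing construct that a "break" would leave. */
typedef enum { SCOPE_NONE, SCOPE_LOOP, SCOPE_SWITCH } break_scope;

/* The two kinds of break that the MLDS now distinguishes. */
typedef enum { GOTO_BREAK_LOOP, GOTO_BREAK_SWITCH } break_kind;

/* A break may be emitted as a plain "break;" only if the nearest
   break-able scope is the kind of construct it is meant to leave. */
bool break_matches_scope(break_kind kind, break_scope nearest)
{
    switch (kind) {
    case GOTO_BREAK_LOOP:
        return nearest == SCOPE_LOOP;
    case GOTO_BREAK_SWITCH:
        return nearest == SCOPE_SWITCH;
    }
    return false;
}

/* Demonstrates why the distinction matters in C: the breaks below leave
   only the switch, so the loop still runs all n iterations. */
int iterations_despite_break(int n)
{
    assert(n >= 0);
    int iterations = 0;
    for (int i = 0; i < n; i++) {
        switch (i) {
        case 0:
            break;      /* exits the switch, not the for loop */
        default:
            break;
        }
        iterations++;
    }
    return iterations;
}
```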
Zoltan Somogyi	034cb97988	Don't module- or type-qualify MLDS local variables. Some global variables generated by the MLDS backend need to be visible across module boundaries, and therefore mlds_data definitions, which contained global as well as other variables, used to have their names qualified; usually module-qualified, though sometimes type-qualified. However, since the diff that partitioned mlds_data_defns into the definitions of local variables, global variables and field variables, the qualification of local variables has not been necessary, so this diff removes such qualifications. This makes the MLDS code generating references to local variables simpler, more readable, and slightly faster. The generated code is also shorter and easier to read. There are two exceptional cases in which local variables did need qualification, both of which stretch the meaning of "local". One such case is the "local" variable dummy_var, which (by definition) is only ever assigned to, and never used. It is also never defined in MLDS-generated code; instead, it is defined in private_builtin.m (for the Java and C# backends) or the runtime (for C). All three backends currently require references to this variable in the runtime to be module qualified. There are three possible fixes to this problem, which is caused by the fact that this "local" variable is in fact global. - Fix 1a would be to make dummy_vars global, not local. - Fix 1b is to special-case dummy_vars in mlds_to_{c,cs,java}.m, and put the fixed "private_builtin" qualifier in front of it. - Fix 1c would be to modify the compiler to never generate any references to dummy vars at all. This diff uses fix 1b, because it is simple. I (zs) will explore fix 1c in the future, and see if it is viable. The second such case occurs when generating code for unifications involving function symbols represented by the addresses of reserved objects. 
These addresses used to be represented as the addresses of mlds_data definitions, then as addresses of field variables cast as qualified local variables. Since this diff makes all local variables unqualified, this can't continue. Two possible fixes are - Fix 2a: introduce an mlds const rval representing the address of a field variable, which solves the problem because unlike local variables, field variables can still be either module- or type-qualified. - Fix 2b: prohibit the use of the addresses of reserved objects as tags. After a (short) discussion on m-dev, this diff uses fix 2b. compiler/mlds.m: Delete the qual_local_var_name type, and replace all its uses with the mlds_local_var_name type. Delete the module qualifier field in mlds_data_addr_local_var consts. compiler/ml_code_util.m: Simplify the predicates and functions whose task is to build references to local variables. Delete the arguments that they don't need anymore. Delete one function entirely, since calling it now takes both more characters and more code than its shortened body does. compiler/ml_accurate_gc.m: compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_code_gen.m: compiler/ml_commit_gen.m: compiler/ml_disj_gen.m: compiler/ml_elim_nested.m: compiler/ml_foreign_proc_gen.m: compiler/ml_lookup_switch.m: compiler/ml_optimize.m: compiler/ml_rename_classes.m: compiler/ml_string_switch.m: compiler/ml_tailcall.m: compiler/ml_type_gen.m: compiler/ml_unify_gen.m: compiler/ml_util.m: compiler/mlds_to_target_util.m: compiler/rtti_to_mlds.m: Conform to the changes above. Stop qualifying local variable names, and stop passing the parameters that used to be used only for qualifying local variable names. compiler/mlds_to_c.m: compiler/mlds_to_cs.m: compiler/mlds_to_java.m: Conform to the changes above, and implement fix 1b. NEWS: compiler/options.m: compiler/make_tags.m: Implement fix 2b by disabling the --num-reserved-objects option. 
This ensures that we don't use the addresses of reserved objects as tags. library/private_builtin.m: Move the C# definition of dummy_var next to the Java definition, and fix the comments on them.	2017-08-09 18:23:53 +02:00
Zoltan Somogyi	1c01ed85eb	Fix lines.	2017-07-29 14:15:15 +02:00
Zoltan Somogyi	91790794f1	Define the MLDS "succeeded" variable only if needed. This makes the generated MLDS code less cluttered and easier to work on. compiler/ml_gen_info.m: Add a field for recording whether the succeeded variable has been used. compiler/ml_code_util.m: Change the predicates that return references to the succeeded variable to record that it has been used. compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_code_gen.m: compiler/ml_commit_gen.m: compiler/ml_disj_gen.m: compiler/ml_foreign_proc_gen.m: compiler/ml_lookup_switch.m: compiler/ml_string_switch.m: compiler/ml_unify_gen.m: Use the updated forms of the predicates in ml_code_util.m. compiler/ml_proc_gen.m: Define the succeeded variable only if the new slot says it has been used. compiler/ml_optimize.m: Fix a bug triggered by the above change: when a tail recursive call was the entire body of a MLDS function, ml_optimize.m did not find it, and thus did not do the setup needed to prepare for the tail recursion. Previously, the always-present declaration of "succeeded" made it impossible for the tail call to be the only thing in the body.	2017-07-29 01:40:56 +02:00
Zoltan Somogyi	b390231f22	Use mlds_target_lang in the MLDS backend. The overall compilation target language (which is recorded in the globals) can be C, Java, C# or Erlang. The target language of the MLDS backend can only be the first three. Use the mlds_target_lang type (which has three functors) instead of the compilation_target type (which has four) to make target-specific decisions in the MLDS backend. compiler/mercury_compile_mlds_back_end.m: Compute the MLDS target (which can be C, Java or C#) from the compilation target (which can also be Erlang). compiler/ml_closure_gen.m: compiler/ml_disj_gen.m: compiler/ml_elim_nested.m: compiler/ml_foreign_proc_gen.m: compiler/ml_gen_info.m: compiler/ml_global_data.m: compiler/ml_proc_gen.m: compiler/ml_string_switch.m: compiler/ml_tag_switch.m: compiler/ml_type_gen.m: compiler/ml_unify_gen.m: compiler/mlds.m: compiler/rtti_to_mlds.m: Use the mlds_target_lang value computed in mercury_compile_mlds_back_end.m to make decisions. Code in most modules gets this from the ml_gen_info; in some others, it is passed around, usually instead of the globals. compiler/ml_code_util.m: Unify two separate copies of a comment.	2017-07-27 03:33:20 +02:00
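The narrowing from a four-way to a three-way type described in the commit above might look roughly like this in C. The enum constants and both function names are invented for illustration; the Mercury compiler's own types are discriminated unions, which C enums only approximate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All compilation targets; the last one is not an MLDS target. */
typedef enum {
    TARGET_C, TARGET_JAVA, TARGET_CSHARP, TARGET_ERLANG
} compilation_target;

/* Only the targets that the MLDS backend can generate code for. */
typedef enum {
    ML_TARGET_C, ML_TARGET_JAVA, ML_TARGET_CSHARP
} mlds_target_lang;

/* Narrow the four-way target once, at the entry to the MLDS backend.
   Returns false (leaving *out untouched) for the Erlang target. */
bool to_mlds_target(compilation_target target, mlds_target_lang *out)
{
    assert(out != NULL);
    switch (target) {
    case TARGET_C:
        *out = ML_TARGET_C;
        return true;
    case TARGET_JAVA:
        *out = ML_TARGET_JAVA;
        return true;
    case TARGET_CSHARP:
        *out = ML_TARGET_CSHARP;
        return true;
    case TARGET_ERLANG:
        return false;
    }
    return false;
}

/* Code inside the backend then switches over three cases only, with no
   "cannot happen" arm for Erlang. */
const char *source_suffix(mlds_target_lang lang)
{
    switch (lang) {
    case ML_TARGET_C:
        return ".c";
    case ML_TARGET_JAVA:
        return ".java";
    case ML_TARGET_CSHARP:
        return ".cs";
    }
    return "";
}
```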
Zoltan Somogyi	11c232f060	Store different kinds of definitions in blocks separately. An ml_stmt_block contains some definitions and some statements. The definitions were traditionally stored in a single list of mlds_defns, but lots of code knew that some kinds of mlds_defns just couldn't occur in blocks. This diff, by storing the definitions of (a) local variables and (b) continuation functions in separate fields in ml_stmt_blocks, gets the type system to enforce the invariant that other kinds of definitions can't occur in blocks. This also allows the compiler to do less work, since definitions don't have to be wrapped and then later unwrapped, and code that wants to look at only e.g. the function definitions in a block doesn't have to traverse the definitions of local variables (of which there are many more). compiler/mlds.m: Make the change described above. compiler/ml_accurate_gc.m: compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_code_gen.m: compiler/ml_code_util.m: compiler/ml_commit_gen.m: compiler/ml_disj_gen.m: compiler/ml_elim_nested.m: compiler/ml_lookup_switch.m: compiler/ml_optimize.m: compiler/ml_proc_gen.m: compiler/ml_simplify_switch.m: compiler/ml_string_switch.m: compiler/ml_switch_gen.m: compiler/ml_tailcall.m: compiler/ml_type_gen.m: compiler/ml_unify_gen.m: compiler/ml_util.m: compiler/mlds_to_c.m: compiler/mlds_to_cs.m: compiler/mlds_to_java.m: compiler/mlds_to_target_util.m: Conform to the change above. This allows us to avoid lots of wrapping up definitions. In some cases, after this change, we don't need to process mlds_defns in general, which leaves the predicates that used to do that, and some of the predicates that they used to call, unused. Delete these. In code that generated MLDS code, consistently use names containing the word "Defn", instead of "Decl", for variables that contain mlds_local_var_defns or mlds_function_defns. 
Some such predicates generate lists of both local variable definitions and function definitions, but most generate only one, and some generate neither.	2017-07-26 00:57:13 +02:00
Zoltan Somogyi	47f1df4a0a	Split mlds_data_defn into three separate types. We used to use mlds_data_defns to represent three related but nevertheless distinct kinds of entities: global variables, local variables, and fields in classes. This diff replaces the mlds_data_defn type with three separate types: mlds_global_var_defn, mlds_local_var_defn and mlds_field_var_defn respectively, with corresponding changes to related types, such as mlds_data_name. The global variables are completely separate from the other two kinds. Local and field variables are mostly separate from each other, but they are related in one way. When we flatten out nested functions, the child nested function can no longer access its parent function's local variables, so we pass those variables to it as fields of an environment structure. This requires turning local variables to fields of that structure, and the code in the flattened previously-nested function that accesses those fields naturally wants to treat them as if they were local variables (as indeed they sort-of were before the flattening). There are therefore ways to convert each of local and fields vars into the other. This restructuring makes clear several invariants of the MLDS we generate that were previously hidden. For example, variables with certain kinds of names (in the before-this-diff, general version of the mlds_var_name type) could appear only as function arguments or as locals in ml_stmt_blocks, not in ml_global_data, while for some other names the opposite was the case. And in several cases, functions used to take a general mlds_data_defn as argument but aborted if given the "wrong kind" of mlds_data_defn. This diff also makes possible further simplifications. For example, local vars should not need some flags (since e.g. 
they are never per-instance), and should never need either module or type qualification, while global variables (which are also never per-instance) should never need type qualification (since they are not fields of a type). The definitions in blocks should consist of local variables and (before flattening) functions, not global variables, field variables or classes, while the members in classes should be only field variables and functions (and maybe classes), not global or local variables. Those changes will be in future diffs; this is already large enough. compiler/mlds.m: Make the changes described above. Use tighter types where possible. Use (a generalized version of) the mlconst_named_const functor to represent values of enum types defined in the runtimes of the target platforms. compiler/ml_global_data.m: Store only global variables in fields that previously stored general mlds_datas (that by design were always global). Store only closure wrapper functions in the previous non-flat-defns field. Before this diff, the code generator only put closure wrapper functions in this field, but then ml_elim_nested.m put everything resulting from the expansion of those functions back into those fields as well, some of which were not functions. It now puts those non-function things into the MLDS data structure directly. compiler/ml_code_util.m: compiler/ml_util.m: Conform to the changes above. Use tighter types where possible. If appropriate, change the name of the function or predicate accordingly. Represent references to enum constants defined in the runtime of the target language as named constants (since that is what they are), instead of representing them as MLDS "variables", which required the code of mlds_to_cs.m to special-case the treatment of those "variables". compiler/ml_elim_nested.m: Conform to the changes above. Use tighter types where possible. 
Don't put the environment types resulting from flattening nested scopes back into the non-flat-defns slot of the ml_elim_info; instead, return them separately to code that puts them directly in the MLDS. compiler/rtti.m: When returning the names of enum constants in the C runtime, return also the prefixes that you need to place in front of these to obtain their names in the Java and C# runtimes. compiler/mercury_compile_mlds_back_end.m: compiler/ml_accurate_gc.m: compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_code_gen.m: compiler/ml_commit_gen.m: compiler/ml_disj_gen.m: compiler/ml_foreign_proc_gen.m: compiler/ml_gen_info.m: compiler/ml_lookup_switch.m: compiler/ml_optimize.m: compiler/ml_proc_gen.m: compiler/ml_string_switch.m: compiler/ml_switch_gen.m: compiler/ml_tailcall.m: compiler/ml_type_gen.m: compiler/ml_unify_gen.m: compiler/mlds_to_c.m: compiler/mlds_to_cs.m: compiler/mlds_to_java.m: compiler/mlds_to_target_util.m: compiler/rtti_out.m: compiler/rtti_to_mlds.m: Conform to the changes above. Move a utility function from ml_util.m to mlds_to_target_util.m, since it is used only in mlds_to_*.m.	2017-07-22 00:20:40 +02:00
Julien Fischer	8a240ba3f0	Add builtin 8, 16 and 32 bit integer types -- Part 1. Add the new builtin types: int8, uint8, int16, uint16, int32 and uint32. Support for these new types will need to be bootstrapped over several changes. This is the first such change and does the following: - Extends the compiler to recognise 'int8', 'uint8', 'int16', 'uint16', 'int32' and 'uint32' as builtin types. - Extends the set of builtin arithmetic, bitwise and relational operators to cover the new types. - Extends all of the code generators to handle new types. There are currently lots of limitations and placeholders marked by 'XXX FIXED SIZE INT'. These will be lifted in later changes. - Extends the runtimes to support the new types. - Adds new modules to the standard library intended to hold the basic operations on the new types. (These are currently empty and not documented.) This change does not introduce the two 64-bit types, 'int64' and 'uint64'. Their implementation is more complicated and is best left to a separate change. compiler/prog_type.m: compiler/prog_data.m: compiler/builtin_lib_types.m: Recognise int8, uint8, int16, uint16, int32 and uint32 as builtin types. Add a new type, int_type/0, that enumerates all the possible integer types. Extend the cons_id/0 type to cover the new types. compiler/builtin_ops.m: Parameterize the integer operations in the unary_op/0 and binary_op/0 types by the new int_type/0 type. Add builtin operations for all the new types. compiler/hlds_data.m: Add new tag types for the new types. compiler/hlds_pred.m: Parameterize integers in the table_trie_step/0 type. 
compiler/ctgc.selector.m: compiler/dead_proc_elim.m: compiler/export.m: compiler/foreign.m: compiler/goal_util.m: compiler/higher_order.m: compiler/hlds_code_util.m: compiler/hlds_dependency_graph.m: compiler/hlds_out_pred.m: compiler/hlds_out_util.m: compiler/implementation_defined_literals.m: compiler/inst_check.m: compiler/mercury_to_mercury.m: compiler/mode_util.m: compiler/module_qual.qualify_items.m: compiler/opt_debug.m: compiler/opt_util.m: compiler/parse_tree_out_info.m: compiler/parse_tree_to_term.m: compiler/parse_type_name.m: compiler/polymorphism.m: compiler/prog_out.m: compiler/prog_rep.m: compiler/prog_rep_tables.m: compiler/prog_util.m: compiler/rbmm.exection_path.m: compiler/rtti.m: compiler/rtti_to_mlds.m: compiler/switch_util.m: compiler/table_gen.m: compiler/type_constraints.m: compiler/type_ctor_info.m: compiler/type_util.m: compiler/typecheck.m: compiler/unify_gen.m: compiler/unify_proc.m: compiler/unused_imports.m: compiler/xml_documentation.m: Conform to the above changes to the parse tree and HLDS. compiler/c_util.m: Support generating the builtin operations for the new types. doc/reference_manual.texi: Add the new types to the list of reserved type names. Add the mapping from the new types to their target language types. These are commented out for now. compiler/llds.m: Replace the lt_integer/0 and lt_unsigned functors of the llds_type/0, with a single lt_int/1 functor that is parameterized by the int_type/0 type. Add a representations for constants of the new types to the LLDS. 
compiler/call_gen.m: compiler/dupproc.m: compiler/exprn_aux.m: compiler/global_data.m: compiler/jumpopt.m: compiler/llds_out_data.m: compiler/llds_out_global.m: compiler/llds_out_instr.m: compiler/lookup_switch.m: compiler/middle_rec.m: compiler/peephole.m: compiler/pragma_c_gen.m: compiler/stack_layout.m: compiler/string_switch.m: compiler/switch_gen.m: compiler/tag_switch.m: compiler/trace_gen.m: compiler/transform_llds.m: Support the new types in the LLDS code generator. compiler/mlds.m: Support constants of the new types in the MLDS. compiler/ml_accurate_gc.m: compiler/ml_call_gen.m: compiler/ml_code_util.m: compiler/ml_disj_gen.m: compiler/ml_foreign_proc_gen.m: compiler/ml_global_data.m: compiler/ml_lookup_switch.m: compiler/ml_simplify_switch.m: compiler/ml_string_switch.m: compiler/ml_switch_gen.m: compiler/ml_tailcall.m: compiler/ml_type_gen.m: compiler/ml_unify_gen.m: compiler/ml_util.m: compiler/mlds_to_target_util.m: Conform to the above changes to the MLDS. compiler/mlds_to_c.m: compiler/mlds_to_cs.m: compiler/mlds_to_java.m: Generate the appropriate target code for constants of the new types and operations involving them. compiler/bytecode.m: compiler/bytecode_gen.m: Handle the new types in the bytecode generator; we just abort if we encounter them for now. compiler/elds.m: compiler/elds_to_erlang.m: compiler/erl_call_gen.m: compiler/erl_code_util.m: compiler/erl_rtti.m: compiler/erl_unify_gen.m: Handle the new types in the Erlang code generator. library/private_builtin.m: Add placeholders for the builtin unify and compare operations for the new types. Since the bootstrapping compiler will not recognise the new types we give the polymorphic arguments. These can be replaced after this change has bootstrapped. Update the Java list of TypeCtorRep constants. library/int8.m: library/int16.m: library/int32.m: library/uint8.m: library/uint16.m: library/uint32.m: New modules that will eventually contain builtin operations on the new types. 
library/library.m: library/MODULES_UNDOC: Do not include the above modules in the library documentation for now. library/construct.m: library/erlang_rtti_implementation.m: library/rtti_implementation.m: deep_profiler/program_representation_utils.m: mdbcomp/program_representation.m: Handle the new types. runtime/mercury_dotnet.cs.in: java/runtime/TypeCtorRep.java: runtime/mercury_type_info.h: Update the list of TypeCtorReps. configure.ac: runtime/mercury_conf.h.in: Check for the header stdint.h. runtime/mercury_std.h: Include stdint.h; abort if that header is not present. runtime/mercury_builtin_types.[ch]: runtime/mercury_builtin_types_proc_layouts.h: runtime/mercury_construct.c: runtime/mercury_deconstruct.c: runtime/mercury_deep_copy_body.h: runtime/mercury_ml_expand_body.h: runtime/mercury_table_type_body.h: runtime/mercury_tabling_macros.h: runtime/mercury_tabling_preds.h: runtime/mercury_term_size.c: runtime/mercury_unify_compare_body.h: Add the new builtin types and handle them throughout the runtime.	2017-07-18 01:31:01 +10:00
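Parameterizing operations by a single integer-type enumeration, as the commit above does with int_type/0, lets properties such as width and signedness be computed in one place instead of once per operation. A rough C analogue follows; the enum constants and function names are hypothetical, and the word-sized width given for the plain int and uint types is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C analogue of the int_type/0 enumeration at the time of
   this commit (the 64-bit types were left to a later change). */
typedef enum {
    INT_TYPE_INT, INT_TYPE_UINT,
    INT_TYPE_INT8, INT_TYPE_UINT8,
    INT_TYPE_INT16, INT_TYPE_UINT16,
    INT_TYPE_INT32, INT_TYPE_UINT32
} int_type;

/* Signedness is a property of the type, not of each operation. */
bool int_type_is_signed(int_type t)
{
    switch (t) {
    case INT_TYPE_INT:
    case INT_TYPE_INT8:
    case INT_TYPE_INT16:
    case INT_TYPE_INT32:
        return true;
    case INT_TYPE_UINT:
    case INT_TYPE_UINT8:
    case INT_TYPE_UINT16:
    case INT_TYPE_UINT32:
        return false;
    }
    assert(0 && "unknown int_type");
    return false;
}

/* Width in bits. The plain int and uint types are assumed here to be
   word sized, which is modelled with intptr_t. */
unsigned int_type_bits(int_type t)
{
    switch (t) {
    case INT_TYPE_INT:
    case INT_TYPE_UINT:
        return (unsigned) (sizeof(intptr_t) * CHAR_BIT);
    case INT_TYPE_INT8:
    case INT_TYPE_UINT8:
        return 8;
    case INT_TYPE_INT16:
    case INT_TYPE_UINT16:
        return 16;
    case INT_TYPE_INT32:
    case INT_TYPE_UINT32:
        return 32;
    }
    assert(0 && "unknown int_type");
    return 0;
}
```

With this shape, a code generator can switch once on the operation and consult the int_type only where the width or signedness actually matters.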
Zoltan Somogyi	30ec420984	Fix an anomaly in how the MLDS treats scalar commons. compiler/mlds.m: The MLDS used to have two different ways to refer to scalar common data structures. It had an rval for the name of the scalar common, and an mlds_name for its address. The name could then be wrapped up inside a mlconst_data_addr function symbol to convert it to an rval. An mlds_name is intended to be used for the names of data definitions in the MLDS, but scalar commons were never defined in this way. And the name and address of a scalar common differ in C only by the addition of an "&" operator in front, so the fact that they had to be processed by different code (due to them having different types) required double maintenance. This diff fixes this anomaly by making both the name and the address of a scalar common its own specific function symbol in the mlds_rval type. They differ in the presence or absence of an "_addr" suffix. Since all references to a vector common are to its address, give the existing mlds_rval function symbol for vector commons the "_addr" suffix as well, for consistency. Replace the general mlconst_data_addr function symbol in the mlds_rval_const with its remaining instances. This allows the code constructing them to be smaller and simpler, and enables them to be treated differently in the future, if needed. compiler/mlds_to_c.m: compiler/mlds_to_cs.m: compiler/mlds_to_java.m: Conform to the changes in mlds.m. Put the code translating the various common structures next to each other, where they weren't before. Add XXXs about the differences between them that are probably unnecessary and may possibly be latent problems. compiler/ml_util.m: Conform to the changes in mlds.m. Change the interface to a set of predicates that looks for variables inside various MLDS constructs to take a variable name, not a data name, as the thing being looked for. 
compiler/ml_closure_gen.m: compiler/ml_code_util.m: compiler/ml_elim_nested.m: compiler/ml_global_data.m: compiler/ml_optimize.m: compiler/ml_proc_gen.m: compiler/ml_string_switch.m: compiler/ml_tailcall.m: compiler/ml_unify_gen.m: compiler/mlds_to_target_util.m: compiler/rtti_to_mlds.m: Conform to the changes in mlds.m, and maybe ml_util.m. In ml_proc_gen.m, put related arguments of some predicates and functions next to each other.	2017-07-13 13:36:51 +02:00
Zoltan Somogyi	0d5dac8018	Delete output args that always return the same value.	2017-07-10 00:51:41 +02:00
Zoltan Somogyi	083f990dbb	Simplify the use of contexts in the MLDS. compiler/mlds.m: This diff fixes two minor annoyances imposed by the old use of the mlds_context type in the MLDS. The first annoyance was that the mlds_context type used to be an abstract type that was privately defined to be a trivial wrapper around a prog_context. It had the exact same information content as a prog_context, but you had to go through translation functions to translate prog_contexts to mlds_contexts and vice versa. I think Fergus's idea was that we may want to add other information to the mlds_context type. However, since we haven't felt the need to do anything like that in the 18 years (almost to the day) that the mlds_context type existed, I think this turned out to be a classic case of YAGNI (you ain't gonna need it). This diff deletes the mlds_context type, and replaces its uses with prog_context. The second annoyance was that actual MLDS code, i.e. values of the mlds_stmt type, always had to be wrapped up inside a term of the statement type, a term which paired a context with the mlds_stmt. This diff moves the context information (now prog_context, not mlds_context) into each function symbol of the mlds_stmt type, deletes the statement type, and replaces its uses with the now-expanded mlds_stmt type. This simplifies most code that deals with MLDS code. compiler/ml_util.m: Add a function, get_mlds_stmt_context, for the (very rare) occasions where we want to know the context of an mlds_stmt before testing to see what function symbol it is bound to. 
compiler/ml_accurate_gc.m: compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_code_gen.m: compiler/ml_code_util.m: compiler/ml_commit_gen.m: compiler/ml_disj_gen.m: compiler/ml_elim_nested.m: compiler/ml_foreign_proc_gen.m: compiler/ml_global_data.m: compiler/ml_lookup_switch.m: compiler/ml_optimize.m: compiler/ml_proc_gen.m: compiler/ml_simplify_switch.m: compiler/ml_string_switch.m: compiler/ml_switch_gen.m: compiler/ml_tag_switch.m: compiler/ml_tailcall.m: compiler/ml_type_gen.m: compiler/ml_unify_gen.m: compiler/mlds_to_c.m: compiler/mlds_to_cs.m: compiler/mlds_to_java.m: compiler/rtti_to_mlds.m: Conform to the changes above. In some cases, a function was given two separate contexts, sometimes from two separate sources; a prog_context and an mlds_context. In such cases, keep only one source. Standardize on Stmt as the variable name for "statement". Delete redundant $module references from unexpected and other abort predicates. In one case, delete a function that was a duplicate of another function. Give some predicates and functions more meaningful names.	2017-07-09 18:44:05 +02:00
Zoltan Somogyi	869605956c	Make MLDS definitions self-contained. Until now, we used a single type, mlds_defn, to contain both - generic information that we need for all MLDS definitions, such as name and context, and - information that is specific to each different kind of MLDS definition, such as a variable's initializer or a function's list of parameter types. The former were contained in three fields in the mlds_defns directly, while the latter were contained in a fourth field that was a discriminated union of mlds_data_defn, mlds_function_defn and mlds_class_defn. While seemingly parsimonious, this design meant that if we had e.g. a list of variable definitions, we would have to wrap the mlds_defn/4 wrapper around them to give them their names, and thereafter, any code that processed that list would have to be prepared to process not just variables but also functions and classes. This diff moves the three generic fields into each of the mlds_data_defn, mlds_function_defn and mlds_class_defn types, making each of those types self-contained, and leaving mlds_defn as nothing more than a discriminated union of those types. In the few places that want to look at the generic fields without caring about what kind of entity is being defined, this design requires a bit of extra work compared to the old design, but in many other places, the new design allows us to return mlds_data_defns, mlds_function_defns or mlds_class_defns instead of just mlds_defns. compiler/mlds.m: Make the change described above. Store type definitions (for high level data) and table structure definitions separately from other definitions in the MLDS type, since we can now give them tighter types. compiler/ml_global_data.m: Change the fields that store flat cells from storing mlds_defns to storing mlds_data_defns, since we can now do so. Add an XXX about an obsolete comment. 
compiler/mercury_compile_mlds_back_end.m: compiler/ml_accurate_gc.m: compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_code_gen.m: compiler/ml_code_util.m: compiler/ml_commit_gen.m: compiler/ml_disj_gen.m: compiler/ml_elim_nested.m: compiler/ml_foreign_proc_gen.m: compiler/ml_gen_info.m: compiler/ml_lookup_switch.m: compiler/ml_optimize.m: compiler/ml_proc_gen.m: compiler/ml_string_switch.m: compiler/ml_switch_gen.m: compiler/ml_tailcall.m: compiler/ml_type_gen.m: compiler/ml_util.m: compiler/mlds_to_c.m: compiler/mlds_to_cs.m: compiler/mlds_to_java.m: compiler/rtti_to_mlds.m: Conform to the changes above. Where possible with only local changes, return mlds_data_defns, mlds_function_defns or mlds_class_defns instead of just mlds_defns. Put the mlds_data(_), mlds_function(_) or mlds_class(_) wrapper around those definitions as late as possible (typically, when our current code wants to put it into the same list as some other kind of definition), in the hope that in the future, that wrapping can be delayed even later, or even avoided altogether. Mark the places where such improvements may be possible with "XXX MLDS_DEFN". In some places, the tighter data representation allows us to delete "XXX MLDS_DEFN" markers. Move some common code from mlds_to_{cs,java}.m to ml_util.m. In mlds_to_{cs,java}.m, add prefixes to the function symbols in a type to reduce ambiguity.	2017-05-24 09:45:21 +02:00
Zoltan Somogyi	8dbea9f096	Use a structured representation for MLDS variables. compiler/mlds.m: Replace the old definition of mlds_var_name, which was a string with an optional integer. The integer was intended to be the number of a HLDS variable, while auxiliary variables created by the compiler, which do not correspond to a HLDS variable, would not have the optional integer. This design has a couple of minor problems. The first is that there is no place in the compiler where all the variable names are visible at once, and without such a place, we cannot be sure that two names constructed for different purposes don't accidentally end up with the same name. The probability of such a clash used to be astronomically small (which is why this hasn't been a problem), but it was not zero. The second problem is that several kinds of compiler-created MLDS variables want to have numerical suffixes too, usually with the suffix being a unique sequence number used as a means of disambiguation. Most of the places where these were created put the numerical suffix into the name string itself, while some put the sequence number as the optional integer. As it happens, neither of those actions is good when one wants to take the independently generated MLDS code of several procedures in an SCC and merge them into a single piece of MLDS code. For this, we want to rename apart both the HLDS variable numbers and the sequence numbers. Having the sequence number baked into the strings themselves obviously makes such renumbering unnecessarily hard, while having sequence numbers in the slots intended for HLDS variable numbers makes the job impossible to do safely. This diff switches to a new representation of mlds_var_names that has a separate function symbol for each different "origin story" that is possible for MLDS variables. This addresses both problems. 
The single predicate that converts this structured representation to a string is the place where we can ensure that two semantically different MLDS variables never get translated to the same string. The current version of this predicate does not offer this guarantee, but later versions could. And having all the integers used in mlds_var_names for different purposes stored as arguments of different function symbols (that clearly indicate their meaning) makes it possible to rename apart different sets of MLDS variables easily and reliably. Move the code for converting mlds_var_names from ml_code_util.m to here, to make it easier to maintain it together with the mlds_var_name type. compiler/ml_code_util.m: Conform to the above change by generating structured MLDS var names. Delete a predicate that is not needed with structured var names. Delete the code moved to mlds.m. Delete a predicate that has been unused since we deleted the IL backend. Add ml_make_boxed_type as a version of ml_make_boxed_types that returns exactly one type. This simplifies some code elsewhere. Add "hld" to some predicate names to make clear that they are intended for use only with --high-level-data. compiler/ml_type_gen.m: Conform to the above change by generating structured MLDS var names. Add "hld" to the names of the (many) predicates here that are used only with --high-level-data to make clear that fact. compiler/mlds_to_cs.m: compiler/mlds_to_java.m: Conform to the above change by generating structured MLDS var names. Add a "for_csharp" or "for_java" suffix to some predicate names to avoid ambiguities. 
compiler/ml_accurate_gc.m: compiler/ml_call_gen.m: compiler/ml_closure_gen.m: compiler/ml_commit_gen.m: compiler/ml_disj_gen.m: compiler/ml_elim_nested.m: compiler/ml_foreign_proc_gen.m: compiler/ml_gen_info.m: compiler/ml_global_data.m: compiler/ml_lookup_switch.m: compiler/ml_optimize.m: compiler/ml_string_switch.m: compiler/ml_unify_gen.m: compiler/ml_util.m: compiler/mlds_to_c.m: Conform to the above change by generating structured MLDS var names. compiler/prog_type.m: Add var_to_type, as a version of var_list_to_type_list that returns exactly one type. This simplifies some code elsewhere. compiler/java_names.m: Give some predicates and functions better names. compiler/ml_code_gen.m: Fix typo.	2017-04-24 15:16:36 +10:00
Zoltan Somogyi	5de235065d	Fix too-long lines.	2015-11-16 00:09:26 +11:00
Zoltan Somogyi	cc9912faa8	Don't import anything in packages. Packages are modules whose only job is to serve as a container for submodules. Modules like top_level.m, hlds.m, parse_tree.m and ll_backend.m are packages in this (informal) sense. Besides the include_module declarations for their submodules, most of the packages in the compiler used to import some modules, mostly other packages whose component modules their submodules may need. For example, ll_backend.m used to import parse_tree.m. This meant that modules in the ll_backend package did not have to import parse_tree.m before importing modules in the parse_tree package. However, this had a price. When we add a new module to the parse_tree package, parse_tree.int would change, and this would require the recompilation of ALL the modules in the ll_backend package, even the ones that did NOT import ANY of the modules in the parse_tree package. This happened even at one remove. Pretty much all modules in every one of the backends have to import one or more modules in the hlds package, and they therefore have to import hlds.m. Since hlds.m imported transform_hlds.m, any addition of a new middle pass to the transform_hlds package required the recompilation of all backend modules, even in the usual case of the two having nothing to do with each other. This diff removes all import_module declarations from the packages, and replaces them with import_module declarations in the modules that need them. This includes only a SUBSET of their child modules and of the non-child modules that import them.	2015-11-13 15:03:20 +11:00
Zoltan Somogyi	7654ec847e	Convert (C->T;E) to (if C then T else E).	2015-09-18 09:37:29 +10:00
Zoltan Somogyi	c1e0499140	Fix the fail code for model_non trie string switches. This was Mantis bug #383. compiler/ml_string_switch.m: For model_non switches in MLDS grades, a failure is indicated by a fall through. This can be represented by an empty sequence of MLDS statements, but the code that generated string trie switches took such an empty sequence to mean that the switch could not fail. Fix this incorrect assumption. tests/hard_coded/bug383.{m,inp,exp}: A regression test for the bug. tests/hard_coded/Mmakefile: Enable the new test case.	2015-03-25 19:51:08 +11:00
Peter Wang	9979764072	Build string switch tries in the target string encoding. The compiler should work in code units of the TARGET string encoding when building tries for string switches. Using its own string encoding would be incorrect if it differs from the target encoding. Currently that would only occur if the compiler is built in a java/csharp grade (uses UTF-16 internally) and invoked to target high-level C (uses UTF-8). Another motivation for this change is to remove a place where the compiler behaviour depends on the setting of `--cross-compiling'. As of now, the `--cross-compiling' option has no effect. compiler/backend_libs.m: compiler/string_encoding.m: Add new module with helper predicates. compiler/ml_string_switch.m: Convert strings to/from code units in the target string encoding. compiler/ml_switch_gen.m: Remove restriction on compiling string switches using tries when `--cross-compiling' is enabled. compiler/notes/compiler_design.html: Document the new module.	2015-03-23 14:16:20 +11:00
Zoltan Somogyi	d041b83943	Implement string switches via tries for the MLDS backend. The code we emit to decide which arm of the switch is selected looks like this: case_num = -1; switch (MR_nth_code_unit(switchvar, 0)) { case '98': switch (MR_nth_code_unit(switchvar, 1)) { case '99': if (MR_offset_streq(2, switchvar, "abc")) case_num = 0; break; case '100': if (MR_offset_streq(2, switchvar, "aceg")) case_num = 1; break; } break; case '99': if (MR_offset_streq(2, switchvar, "bbb")) case_num = 2; break; } The part that acts on this will look like this for lookup switches: if (case_num < 0) succeeded = MR_FALSE; else { outvar1 = vector_common[case_num].f1; ... outvarn = vector_common[case_num].fn; succeeded = MR_TRUE; } and like this for non-lookup switches: switch (case_num) { case 0: <code for case 0> break; ... case n: <code for case n> break; default: /* if the switch is can_fail */ <code for failure> break; } compiler/ml_string_switch.m: Implement both non-lookup and lookup string switches via tries, along the lines shown above. compiler/ml_switch_gen.m: Invoke the predicates that implement string switches via tries in the circumstances in which option values call for them. For now, we generate tries only for the C backend. Once the problems identified for mlds_to_{cs,java,managed} below are fixed, we can enable them on those backends as well. compiler/options.m: doc/user_guide.texi: Add an option that governs the minimum size of trie switches. compiler/ml_lookup_switch.m: Factor out the code common to the implementation of all model-non lookup switches, both in ml_lookup_switch.m and ml_string_switch.m, and put it all into a new exported predicate. The previously existing MLDS implementation methods for lookup switches all build their lookup tables from maps that map each cons_id in the switch cases to the values of the output arguments of those cases. For switch cases that apply to more than one cons_id, this map had one entry for each of those cons_ids. 
For tries, we need a map from case ids, not cons ids, to the outputs. Since it is easier to convert the one-to-one case_id->outputs map to the many-to-one cons_id->outputs map than vice versa, change the main data structure from which lookup tables are built to store data in a case_id->outputs format, and provide predicates for its conversion to the other (previously the only) format. Rename ml_gen_lookup_switch to ml_gen_atomic_lookup_swith to distinguish it from other predicates that also generate (other kinds of) lookup switches. compiler/switch_util.m: Have the types representing lookup tables represent their contents as a map, not as the assoc list derived from the map. Previously, we didn't do anything with the map other than flatten it to the assoc list, but for the MLDS backend, we may now also need to convert it to another form of map (see immediately above). compiler/builtin_ops.m: Add two new builtin ops. The first, string_unsafe_index_code_unit, returns the nth code unit in a string; the second, offset_str_eq, does a string equality test on the nth and later code units of two strings. They are used in the implementation of tries. compiler/c_util.m: Add a new binop category for each new binop, since they are not like existing binops. Put some existing binops into their own categories as well, since bundling them with the other ops they were bundled with seems like a bad idea. compiler/hlds_goal.m: Make the identifier of switch arms in tagged_cases a separate type from int. compiler/mlds_to_c.m: compiler/llds_out_data.m: Handle the new kinds of binops. When writing out binop expressions, we used to do a switch on the binop to get its category, and then another switch on the category. We now switch on the binop directly, since this makes it much harder to write out code using new binops badly, and should be faster to boot. In mlds_to_c.m, also make some cosmetic changes to the output to make it easier to read, and thus to debug. 
compiler/mlds_to_il.m: Handle the new kinds of binops. compiler/mlds_to_cs.m: compiler/mlds_to_java.m: compiler/mlds_to_managed.m: Do not handle the new kinds of binops, since doing so would require changing the whole approach of how these modules handle binops. Clean up some predicates. compiler/bytecode.m: compiler/erl_call_gen.m: compiler/lookup_switch.m: compiler/ml_global_data.m: compiler/ml_optimize.m: compiler/ml_tag_switch.m: compiler/opt_debug.m: compiler/string_switch.m: Conform to the changes above. compiler/ml_code_gen.m: Put the predicates of this module into a consistent order. library/string.m: Fix white space. runtime/mercury_string.h: Add a macro for each of the two new builtin operations.	2015-02-24 16:03:30 +11:00
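The two-stage scheme described in commit d041b83943 can be sketched in plain C as follows. This is a minimal illustration, not the code the compiler emits: `nth_code_unit` and `offset_streq` are hypothetical stand-ins for the runtime macros MR_nth_code_unit and MR_offset_streq, and the three switch strings and the `outputs` table are assumptions made up for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for MR_nth_code_unit: the n-th code unit of s. */
static int nth_code_unit(const char *s, size_t n)
{
    return (unsigned char) s[n];
}

/* Hypothetical stand-in for MR_offset_streq: compare s and t from
   offset n onwards. Both strings must have at least n code units. */
static int offset_streq(size_t n, const char *s, const char *t)
{
    return strcmp(s + n, t + n) == 0;
}

/* Stage 1: walk a trie for the strings "bcx", "bdy" and "cz".
   Returns the number of the selected case, or -1 if none matches. */
static int select_case(const char *switchvar)
{
    int case_num = -1;

    assert(switchvar != NULL);
    switch (nth_code_unit(switchvar, 0)) {
    case 'b':
        switch (nth_code_unit(switchvar, 1)) {
        case 'c':
            if (offset_streq(2, switchvar, "bcx")) case_num = 0;
            break;
        case 'd':
            if (offset_streq(2, switchvar, "bdy")) case_num = 1;
            break;
        }
        break;
    case 'c':
        if (offset_streq(1, switchvar, "cz")) case_num = 2;
        break;
    }
    return case_num;
}

/* Stage 2, lookup-switch form: index a table of outputs by case number. */
static const int outputs[] = { 10, 20, 30 };

static int lookup(const char *switchvar, int *outvar)
{
    int case_num = select_case(switchvar);

    if (case_num < 0) {
        return 0;               /* the switch fails */
    }
    *outvar = outputs[case_num];
    return 1;                   /* the switch succeeds */
}
```

Each nested switch is only reached after the earlier code units were found to be non-NUL, so the sketch never reads past the end of a shorter input string.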
Zoltan Somogyi	7ca1a07296	Allow the MLDS backend to generate indexing switches (switches implemented Estimated hours taken: 16 Branches: main Allow the MLDS backend to generate indexing switches (switches implemented more efficiently than just an if-then-else chain) for strings even if the target language does not support gotos. Previously, we always used gotos to break out of search loops after we found a match: do { if (we have a match) { ... handle the match ... goto end } else { ... handle nonmatches ... } } while (loop should continue); maybe some code to handle the failure of the search end: Now, if the "maybe some code" is empty, we prefer to use break statements if the target language supports this: do { if (we have a match) { ... handle the match ... break; } else { ... handle nonmatches ... } } while (loop should continue) If we cannot use either gotos or break statements, we instead use a boolean variable named "stop_loop": stop_loop = 0; do { if (we have a match) { ... handle the match ... stop_loop = 1; } else { ... handle nonmatches ... } } while (stop_loop == 0 && loop should continue) if (stop_loop == 0) { maybe some code to handle the failure of the search } We omit the final if statement if the then-part would be empty. The break method generates the smallest code, followed by the goto code. I don't have information on speed, since we don't have a benchmark that runs long enough, and the compiler itself does not spend any significant amount of time on string switches. Probably the break method is also the fastest, simply because it leaves the code looking most like normal C code. (Some optimizations are harder to apply to code containing gotos, and some optimizer writers do not bother.) For C, we now normally prefer to generate code using the second method (breaks), if we can, though normally "maybe some code" is not empty, in which case we use the first method (goto). 
However, if the value of the --experiment option is set to "use_stop_loop", we always use the third method, and if it is set to "use_end_label", we always use the first, even when we could use the second. This allows us to test all three approaches using the C back end. With backends that support neither gotos nor break, we always use the third method (stop_loop). With backends that don't support gotos but do support breaks, we also always use the third method. This is because trying to use the second method would require us to commit to not creating the stop_loop variable BEFORE we know that the "maybe some code to handle the failure of the search" is empty, and if it isn't empty, then we don't have the goto method to fall back on. compiler/ml_string_switch.m: Make the change described above. Where possible, make the required change not to the original code, but to a version in which common parts have been factored out. (Previously, the duplicated code was small; now, it would be big.) compiler/ml_target_util.m: A new module containing existing functions that test various properties of the target language. Keeping some of those functions in their original modules would have introduced a circular dependency. compiler/ml_switch_gen.m: Enable the new functionality by removing the tests that previously prevented the compiler from using indexing switches on strings if the target language did not support gotos. Remove the code moved to ml_target_util.m. compiler/ml_optimize.m: compiler/ml_unify_gen.m: Remove the code moved to ml_target_util.m. compiler/ml_backend.m: compiler/notes/compiler_design.m: Add the new module. compiler/ml_proc_gen.m: Delete a predicate that hasn't been used for a long time. tools/makebatch: Fix an old pair of typos.	2011-08-15 06:23:20 +00:00
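The three ways of leaving a search loop that commit 7ca1a07296 describes can be compared side by side in C. The sketch below is an illustration under assumptions: the linear search over a string table, the function names, and the use of -1 as the "search failed" result are all made up for the example and are not taken from the compiler.

```c
#include <assert.h>
#include <string.h>

/* Method 1: jump to an end label placed after the failure code. */
static int find_goto(const char *const *table, int n, const char *key)
{
    int i = 0;
    int result = -2;            /* overwritten on every path */

    assert(n > 0);
    do {
        if (strcmp(table[i], key) == 0) {
            result = i;         /* handle the match */
            goto end;
        } else {
            i++;                /* handle the nonmatch */
        }
    } while (i < n);
    result = -1;                /* code to handle the failure of the search */
end:
    return result;
}

/* Method 2: break out of the loop. This only works when there is
   no failure code after the loop, so the default result is set up front. */
static int find_break(const char *const *table, int n, const char *key)
{
    int i = 0;
    int result = -1;

    assert(n > 0);
    do {
        if (strcmp(table[i], key) == 0) {
            result = i;
            break;
        } else {
            i++;
        }
    } while (i < n);
    return result;
}

/* Method 3: a stop_loop flag, for targets with neither goto nor break. */
static int find_stop_loop(const char *const *table, int n, const char *key)
{
    int i = 0;
    int result = -2;            /* overwritten on every path */
    int stop_loop = 0;

    assert(n > 0);
    do {
        if (strcmp(table[i], key) == 0) {
            result = i;
            stop_loop = 1;
        } else {
            i++;
        }
    } while (stop_loop == 0 && i < n);
    if (stop_loop == 0) {
        result = -1;            /* code to handle the failure of the search */
    }
    return result;
}
```

All three functions return the same results; they differ only in the control flow used to leave the loop, which is the property the commit message is concerned with.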
Zoltan Somogyi	de56f9b77c	Implement lookup table versions of hash and binary search switches for strings Estimated hours taken: 24 Branches: main Implement lookup table versions of hash and binary search switches for strings in the MLDS backend (those versions already exist in the LLDS backend). compiler/ml_string_switch.m: Make the above change. Where possible, factor out and reuse existing code. compiler/ml_lookup_switch.m: Break up the predicate that used to both test whether a switch is a lookup switch and also generate code for it if it was, into two parts, each doing just one job. The first part is now useful for switches on strings as well. Group auxiliary predicates with the main predicates they support. Factor out some code into new predicates, and export them for use by the new code in ml_string_switch.m. Make some predicates tail recursive. Remove some predicates made unnecessary by changes to lookup_switch.m. compiler/ml_switch_gen.m: Invoke the new code when appropriate, and conform to the updated interface of ml_lookup_switch.m. compiler/switch_util.m: Change some types, and the predicates that operate on them, to make them useful for lookup switches for the MLDS backend as well as the LLDS backend. Add some utility predicates. compiler/lookup_switch.m: Change the interface of some of the predicates in this module to allow us to factor out some common code from the higher order values passed by callers. Conform to the changes in switch_util.m. compiler/string_switch.m: Conform to changes in switch_util.m. compiler/switch_gen.m: Conform to changes in lookup_switch.m.	2011-08-09 05:34:35 +00:00
Zoltan Somogyi	6dabcc0aa1	Implement binary search switches for strings in the MLDS backend (they already Estimated hours taken: 16 Branches: main Implement binary search switches for strings in the MLDS backend (they already exist in the LLDS backend). Binary search switches have higher big-O complexity than hash table search switches, but lower startup costs, and so are appropriate for switches involving smaller tables of strings. compiler/ml_string_switch.m: Implement binary search switches. Where possible, factor out and reuse code that already existed for implementing hash switches. compiler/ml_switch_gen.m: Invoke the new code when appropriate. compiler/switch_gen.m: Avoid executing the same test (NumArms > 1) more than once. compiler/mlds.m: Fix a typo in a comment. compiler/string_switch.m: Delete stray text from a comment.	2011-08-02 03:02:05 +00:00
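A binary search switch on strings, as introduced in commit 6dabcc0aa1, amounts to searching a sorted table of the switch strings and using the index found as the arm to execute. The C sketch below shows the idea only; the function name and the convention of returning -1 for "no arm matches" are assumptions, not the code generated by ml_string_switch.m.

```c
#include <assert.h>
#include <string.h>

/* Search a table of strings sorted in strcmp order.
   Returns the index of key, or -1 if key is not in the table. */
static int binary_search_case(const char *const *sorted, int n,
    const char *key)
{
    int lo = 0;
    int hi = n - 1;

    assert(key != NULL);
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int cmp = strcmp(key, sorted[mid]);

        if (cmp == 0) {
            return mid;         /* found: this is the arm to execute */
        } else if (cmp < 0) {
            hi = mid - 1;       /* key sorts before the middle entry */
        } else {
            lo = mid + 1;       /* key sorts after the middle entry */
        }
    }
    return -1;                  /* the switch fails */
}
```

The search needs no hash computation before the first comparison, which is one way to read the "lower startup costs" the commit message mentions.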
Zoltan Somogyi	b4092d2e4e	Further improvements in the implementation of string switches, along with Estimated hours taken: 12 Branches: main Further improvements in the implementation of string switches, along with some bug fixes. If the chosen hash function does not yield any collisions for the strings in the switch arms, then we can optimize away the table column that we would otherwise need for open addressing. This was implemented in a previous diff. For an ordinary (non-lookup) string switch, the hash table has two columns in the presence of collisions and one column in their absence. Therefore if doubling the size of the table allows us to eliminate collisions, the table size is unaffected, though the corresponding array of labels we have to put into the computed_goto instruction we generate has to double as well. Thus the only cost of such doubling is an increase in "code" size, and for small tables, the elimination of the open addressing loop may compensate for this, at least partially. For lookup string switches, doubling the table size this way has a bigger space cost, but the elimination of the open addressing loop still brings a useful speed boost. We therefore now DO double the table size if this eliminates collisions. In the library, compiler etc directories, this eliminates collisions in 19 out of 47 string switches that had collisions with the standard table size. compiler/switch_util.m: Replace the separate sets of predicates we used to have for computing hash maps (one for lookup switches and one for non-lookup switches) with a single set that works for both. Change this set to double the table size if this eliminates collisions. This requires it to decide the table size, a task previously done separately by each of its callers. One version of this set had an old bug, which caused it to effectively ignore the second and third string hash functions. This diff fixes it. 
There were two bugs in my previous diff: the unneeded table column was not being optimized away from several_soln lookup switches, and the lookup code for one_soln lookup switches used the wrong column offset. This diff fixes these too. Since doubling the table size requires recalculating all the hash values, decouple the computation of the hash values from generating code for each switch arm, since the latter shouldn't be done more than once. Add a note on an old problem. compiler/ml_string_switch.m: compiler/string_switch.m: Bring the code for generating code for the arms of string switches here from switch_util.m. tests/hard_coded/Mmakefile: Fix the reason why the bugs mentioned above were not detected: the relevant test cases weren't enabled. tests/hard_coded/string_hash.m: Update this test case to test the correspondence of the compiler's and the runtime's versions of not just the first hash function, but also the second and third. runtime/mercury_string.h: Fix a typo in a comment.	2011-08-02 00:05:44 +00:00
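The table-size decision described in commit b4092d2e4e (double the table if, and only if, doing so removes all collisions) can be sketched in C as below. The hash function, the MAX_SIZE bound and the function names are assumptions made for the illustration; in particular, `hash_str` is not one of the three string hash functions the Mercury compiler and runtime share.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SIZE 64             /* arbitrary bound for this sketch */

/* Hypothetical hash function, NOT the one Mercury uses. */
static unsigned hash_str(const char *s)
{
    unsigned h = 0;

    while (*s != '\0') {
        h = h * 31u + (unsigned char) *s++;
    }
    return h;
}

/* Count the keys whose home bucket is already occupied. */
static int count_collisions(const char *const *keys, int n,
    unsigned table_size)
{
    int used[MAX_SIZE] = { 0 };
    int collisions = 0;

    assert(table_size > 0 && table_size <= MAX_SIZE);
    for (int i = 0; i < n; i++) {
        unsigned bucket = hash_str(keys[i]) % table_size;

        if (used[bucket]) {
            collisions++;
        } else {
            used[bucket] = 1;
        }
    }
    return collisions;
}

/* Double the standard size only if that eliminates every collision. */
static unsigned choose_table_size(const char *const *keys, int n,
    unsigned standard_size)
{
    unsigned doubled = 2 * standard_size;

    if (count_collisions(keys, n, standard_size) > 0
        && doubled <= MAX_SIZE
        && count_collisions(keys, n, doubled) == 0)
    {
        return doubled;
    }
    return standard_size;
}
```

Because the bucket of every key depends on the table size, changing the size means recomputing all the hash values, which is why the commit decouples hashing from generating the code of each switch arm.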
Zoltan Somogyi	065a440492	Simplify some code. Estimated hours taken: 0.1 Branches: main compiler/ml_string_switch.m: Simplify some code.	2011-07-27 01:16:15 +00:00
Zoltan Somogyi	fe566dbf42	When doing hash table lookup as part of the implementation of switches on Estimated hours taken: 8 Branches: main When doing hash table lookup as part of the implementation of switches on strings, we use open addressing to handle collisions. However, if the chosen hash function does not yield any collisions for the strings in the switch arms, then open addressing is unnecessary: if a lookup does not find the string bound to the switch variable in its home bucket, it won't be in the hash table at all. This diff optimizes such cases, by not generating for them the loop we would otherwise use for open addressing, and optimizing away the table column telling that loop where to check next. compiler/string_switch.m: Implement the above optimization both for ordinary switches on strings, and for lookup table switches (both one_soln and several_soln) on strings. compiler/ml_string_switch.m: Implement the above optimization for ordinary switches on strings. This module does not (yet) implement lookup table switches on strings. compiler/switch_util.m: When deciding what hash function to use, return the number of collisions for string_switch and ml_string_switch to use. Rename the other_switch category to float_switch, since the only type category it covers is switches on floats. compiler/switch_gen.m: compiler/ml_switch_gen.m: Make the module header comments more organized, and use the same template for both, so one can see the differences more easily. Put the switch arms for the smart indexing methods into the same order in both files. Fix an old problem in ml_switch_gen.m: the test to see whether we can apply a smart indexing method that uses switches on integers was testing not the availability of int switches in the target, but the availability of computed gotos. While ml_simplify_switch would transform the int-switch-using code to computed-goto-using code or an if-then-else chain in some cases, it would not do so in all cases. 
In ml_switch_gen.m, remove a test that could not succeed, and a procedure that was used only in that test. Conform to the changes in switch_util.m. compiler/lookup_switch.m: compiler/ml_simplify_switch.m: Update comments.	2011-07-26 00:25:22 +00:00
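The optimization in commit fe566dbf42 can be illustrated by contrasting the two lookup shapes in C. The table layout below (a string column plus a "next slot" column, with -1 ending a chain) is an assumption chosen for the sketch, not the layout the compiler actually generates; the hash value is passed in so that the example does not depend on any particular hash function.

```c
#include <assert.h>
#include <string.h>

/* With collisions: each slot has a string and a "where to check next"
   column. str == NULL marks an empty slot, next == -1 ends the chain. */
struct slot {
    const char *str;
    int next;
};

static int lookup_open_addressing(const struct slot *table, unsigned size,
    unsigned hash, const char *key)
{
    int i = (int) (hash % size);

    assert(size > 0);
    do {
        if (table[i].str != NULL && strcmp(table[i].str, key) == 0) {
            return i;
        }
        i = table[i].next;      /* follow the second column */
    } while (i >= 0);
    return -1;
}

/* Without collisions: a key is either in its home bucket or absent,
   so neither the loop nor the second column is needed. */
static int lookup_no_collisions(const char *const *table, unsigned size,
    unsigned hash, const char *key)
{
    unsigned i = hash % size;

    assert(size > 0);
    if (table[i] != NULL && strcmp(table[i], key) == 0) {
        return (int) i;
    }
    return -1;
}
```

The second function is what remains once the loop and the extra column are optimized away: a single probe of the home bucket decides the outcome.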
Zoltan Somogyi	295415090e	Convert almost all remaining modules in the compiler to use Estimated hours taken: 6 Branches: main compiler/*.m: Convert almost all remaining modules in the compiler to use "$module, $pred" instead of "this_file" in error messages. In a few cases, the old error message was misleading, since it contained an incorrect, out-of-date or cut-and-pasted predicate name. tests/invalid/unresolved_overloading.err_exp: Update an expected output containing an updated error message.	2011-05-23 05:08:24 +00:00
Julien Fischer	9f68c330f0	Change the argument order of many of the predicates in the map, bimap, and Branches: main Change the argument order of many of the predicates in the map, bimap, and multi_map modules so they are more conducive to the use of state variable notation, i.e. make the order the same as in the sv* modules. Prepare for the deprecation of the sv{bimap,map,multi_map} modules by removing their use throughout the system. library/bimap.m: library/map.m: library/multi_map.m: As above. NEWS: Announce the change. Separate out the "highlights" from the "detailed listing" for the post-11.01 NEWS. Reorganise the announcement of the Unicode support. benchmarks/*/*.m: browser/*.m: compiler/*.m: deep_profiler/*.m: extras/*/*.m: mdbcomp/*.m: profiler/*.m: tests/*/*.m: ssdb/*.m: samples/*/*.m slice/*.m: Conform to the above change. Remove any dependencies on the sv{bimap,map,multi_map} modules.	2011-05-03 04:35:04 +00:00
Zoltan Somogyi	022b559584	Make error messages for require_complete_switch scopes report the missing Estimated hours taken: 8 Branches: main Make error messages for require_complete_switch scopes report the missing functors. Knowing which functors are missing requires knowing not only the set of functors in the switched-on variable's type, but also which of these functors have been eliminated by earlier tests, which requires having the instmap at the point of entry to the switch. Simplification, which initially detected unmet require_complete_switch requirements, does not have the instmap, and threading the instmap through it would make it significantly less efficient. So instead we now detect any problems with require_complete_switch scopes (and require_detism scopes, which are similar) during determinism checking. compiler/det_report.m: Factor out the code for finding the missing functors in conventional determinism errors, to allow it to be used for this new purpose. Check whether the requirements of require_complete_switch and require_detism scopes are met IF the predicate has any such scopes. compiler/det_analysis.m: compiler/det_util.m: Record whether the predicate has any such scopes. compiler/hlds_pred.m: Add a predicate marker that allows this recording. compiler/simplify.m: Delete the code that checks the require_complete_switch and require_detism scopes. Keep the code that deletes those scopes. (We have to do that here because determinism error reporting never updates the goal). compiler/prog_out.m: Delete an unused predicate. compiler/*.m: Remove unnecessary imports as flagged by --warn-unused-imports.	2011-01-02 14:38:08 +00:00
Zoltan Somogyi	8a28e40c9b	Add the predicates sorry, unexpected and expect to library/error.m. Estimated hours taken: 2 Branches: main Add the predicates sorry, unexpected and expect to library/error.m. compiler/compiler_util.m: library/error.m: Move the predicates sorry, unexpected and expect from compiler_util to error. Put the predicates in error.m into the same order as their declarations. compiler/*.m: Change imports as needed. compiler/lp.m: compiler/lp_rational.m: Change imports as needed, and some minor cleanups. deep_profiler/*.m: Switch to using the new library predicates, instead of calling error directly. Some other minor cleanups. NEWS: Mention the new predicates in the standard library.	2010-12-15 06:30:36 +00:00


96 Commits