mirror of
https://github.com/Mercury-Language/mercury.git
synced 2026-04-15 09:23:44 +00:00
7dccb03be114fbbc0b6c514b7f7fc94696e42487
96 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
386160f937 |
s/dont/do_not/ in the compiler directory.
compiler/*.m:
Standardize on the do_not spelling over the dont contraction
in the compiler directory. (We used to have a lot of both spellings.)
|
||
|
|
28ab8c2ade |
Group together related builtin operations.
compiler/builtin_ops.m:
Replace six individual builtin comparison ops for str_{eq,ne,lt,le,gt,ge}
with a single str_cmp/1 function symbol, whose *argument*
is one of {eq,ne,lt,le,gt,ge}. Do the same with comparison operations
on integers (including the operations that compare signed integers
as if they were unsigned) and floats. The eq and ne operations
on integers had names that did not fit into the scheme used by the
other binops; this diff fixes that.
Replace five individual builtin arithmetic ops for int_{add,sub,mul,div,mod}
with a single int_arith/2 function symbol, one of whose arguments
is one of {add,sub,mul,div,rem}. (This diff renames the "mod" (modulus)
op to "rem" (remainder), as an XXX has been asking for a long time.)
The other argument specifies *which* integer type the operation is on.
Do a similar change for float arithmetic ops, with the exception that
floats don't support the remainder op.
The points of the above changes are
- to allow us to factor out commonalities between operations,
both between e.g. all comparison operations on integers,
and between e.g. lt comparisons on values of different types.
- to stop forcing switches on binops to make distinctions that
they do not actually care about.
Rename the old str_cmp op, which returns a negative, zero or positive
result (as does strcmp in C) to str_nzp, since the str_cmp name
is now used for something else.
Add some utility functions here, to allow the deletion of the
many existing copies of the bodies of those functions elsewhere
in the compiler.
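The grouping described above can be pictured in C. This is only a sketch under assumptions: the enum and function names below are hypothetical stand-ins, not the Mercury definitions in builtin_ops.m. It shows how one comparison argument lets the three-way-to-boolean logic be written once and shared across operand types, and how a strcmp-style three-way op (the renamed str_nzp) differs from a boolean comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical C stand-ins for the Mercury types; names are illustrative. */
typedef enum { CMP_EQ, CMP_NE, CMP_LT, CMP_LE, CMP_GT, CMP_GE } cmp_op;

/* Convert a three-way ordering (negative, zero, positive) into the
   boolean answer for one comparison. Written once, shared by all types. */
bool ordering_satisfies(cmp_op op, int ordering)
{
    switch (op) {
        case CMP_EQ: return ordering == 0;
        case CMP_NE: return ordering != 0;
        case CMP_LT: return ordering < 0;
        case CMP_LE: return ordering <= 0;
        case CMP_GT: return ordering > 0;
        case CMP_GE: return ordering >= 0;
    }
    assert(0 && "unknown cmp_op");
    return false;
}

/* Analogue of the renamed str_nzp: a strcmp-style three-way result. */
int str_nzp(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* Analogue of the new str_cmp/1: the comparison is an argument. */
bool str_cmp(cmp_op op, const char *a, const char *b)
{
    return ordering_satisfies(op, str_nzp(a, b));
}

/* The same argument type serves integers without any duplication. */
bool int_cmp(cmp_op op, long a, long b)
{
    return ordering_satisfies(op, (a > b) - (a < b));
}
```

Code that only needs to know "this is a string comparison" can now match on the family and ignore the cmp_op argument entirely.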
compiler/closure_gen.m:
compiler/code_util.m:
compiler/dense_switch.m:
compiler/disj_gen.m:
compiler/ite_gen.m:
compiler/jumpopt.m:
compiler/llds.m:
compiler/llds_out_data.m:
compiler/lookup_switch.m:
compiler/middle_rec.m:
compiler/ml_disj_gen.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_global_data.m:
compiler/ml_lookup_switch.m:
compiler/ml_optimize.m:
compiler/ml_simplify_switch.m:
compiler/ml_string_switch.m:
compiler/ml_unify_gen.m:
compiler/ml_unify_gen_test.m:
compiler/mlds_dump.m:
compiler/mlds_to_c_data.m:
compiler/mlds_to_cs_data.m:
compiler/mlds_to_java_data.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/peephole.m:
compiler/pragma_c_gen.m:
compiler/string_switch.m:
compiler/tag_switch.m:
compiler/trace_gen.m:
compiler/transform_llds.m:
compiler/unify_gen.m:
compiler/unify_gen_test.m:
Conform to the changes above, by either generating or consuming
binops in their new form.
|
||
|
|
2a63738b8e |
Implement det/semidet string trie lookup switches.
compiler/string_switch.m:
Implement single-solution string trie lookup switches.
The code managing the lookup table is new, while the code managing
the trie search generalizes existing code. The latter required
some redrawing of the predicate boundaries within that existing code,
as well as adjusting some types and variable names.
Include "jump" in the name of the non-lookup versions of string switches.
Put state var arguments last in some predicate signatures.
compiler/switch_gen.m:
Enable single-solution string trie lookup switches.
compiler/string_switch_util.m:
Delete the call to build_str_case_id_list from the create_trie predicate,
since it is needed only by its old caller, the implementation of string
trie JUMP switches (which now does it itself), and not by its new caller,
the implementation of string trie LOOKUP switches.
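As a rough illustration of the difference between a jump switch and a lookup switch, the C sketch below separates the two steps: a search that maps the string to a case id, and a table that holds the result for each case. Everything here is assumed for illustration (the colour strings, the table contents, and the function names); the search is a hand-written stand-in for a trie, not the code the compiler generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NO_CASE (-1)

/* Stand-in for the trie search: branch on the first code unit, then
   check the rest. Returns the case id, or NO_CASE if nothing matches. */
int find_case_id(const char *s)
{
    switch (s[0]) {
        case 'r': return strcmp(s + 1, "ed") == 0 ? 0 : NO_CASE;
        case 'g': return strcmp(s + 1, "reen") == 0 ? 1 : NO_CASE;
        case 'b': return strcmp(s + 1, "lue") == 0 ? 2 : NO_CASE;
        default:  return NO_CASE;
    }
}

/* The lookup table: one row of output values per case id. */
static const int rgb_table[] = { 0xFF0000, 0x00FF00, 0x0000FF };

/* Semidet lookup switch: returns 1 and sets *out on a match, else 0.
   No per-arm code is executed; the answer comes straight from the table. */
int colour_lookup(const char *s, int *out)
{
    assert(s != NULL && out != NULL);
    int id = find_case_id(s);
    if (id == NO_CASE) {
        return 0;
    }
    *out = rgb_table[id];
    return 1;
}
```

A jump switch would instead use the case id (or the trie leaf itself) to branch to the code of the selected arm.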
compiler/lookup_util.m:
compiler/code_util.m:
Give some predicates more expressive names.
compiler/code_loc_dep.m:
compiler/disj_gen.m:
compiler/jumpopt.m:
compiler/lookup_switch.m:
compiler/middle_rec.m:
compiler/ml_string_switch.m:
compiler/tag_switch.m:
compiler/unify_gen_test.m:
Conform to the changes above.
compiler/hlds_goal.m:
Fix a comment.
tests/hard_coded/space.m:
This test case caught a bug in an early version of this diff.
Document this fact.
Make the code more readable by
- aligning the columns in some tables,
- renaming some function symbols to avoid ambiguity,
- replacing the remnants of calls to Prolog's "is" predicate
with idiomatic Mercury code, and
- deleting commented-out dead code that duplicated the body of a predicate.
tests/hard_coded/Mercury.options:
Make space.m's role as a test case for string trie switches official
by compiling it with options that force trie switches.
|
||
|
|
9dbee8bdb4 |
Implement trie string switches for the LLDS backend.
For now, the implementation covers only non-lookup switches.
compiler/builtin_ops.m:
Generalize the existing offset_str_eq binary op by adding an optional
size parameter, which, if present, restricts the equality test to look at
the given number of code units at most.
compiler/llds_out_data.m:
compiler/mlds_to_c_data.m:
Generalize the output of binop rvals whose operation is offset_str_eq.
In llds_out_data.m, fix a bug in the original code. (This bug did not
lead to problems because before this diff, we never generated this op.)
compiler/string_switch_util.m:
Add a predicate that recognizes when a trie node that is NOT a leaf
nevertheless represents the top of a stick, which means that it has
only one possible next code unit, which itself may have only one
possible next code unit, and so on, until we reach a node that *does*
have two or more next code units. (One of those may be the code unit
of the string-ending NUL character.)
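The notion of a stick can be sketched in C as follows. The node layout is a simplifying assumption made for this example (it records only the child count and the first outgoing edge), and the function name is hypothetical; the real predicate works on the compiler's own trie representation.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical trie node: enough to follow single-child chains. */
typedef struct trie_node {
    size_t                  num_children; /* 0 for a leaf */
    unsigned char           first_unit;   /* code unit on the first child edge */
    const struct trie_node *first_child;  /* first child, NULL for a leaf */
} trie_node;

/* Follow the chain of single-child nodes starting at `node`, copying the
   code units on the way into `buf` (at most `cap` of them). Returns the
   length of the stick and sets *end to the node where the chain stops,
   i.e. the first node that does not have exactly one child. */
size_t stick_from(const trie_node *node, unsigned char *buf, size_t cap,
    const trie_node **end)
{
    assert(node != NULL && buf != NULL && end != NULL);
    size_t len = 0;
    while (node->num_children == 1 && len < cap) {
        buf[len++] = node->first_unit;
        node = node->first_child;
    }
    *end = node;
    return len;
}
```

A returned length of zero means the node is not the top of a stick, so the ordinary one-code-unit-at-a-time branching applies.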
compiler/ml_string_switch.m:
Use the new predicate in string_switch_util.m to generate better code
for sticks. Instead of comparing each character in the stick individually
against the relevant code unit of the string being switched on, compare
them all at once using the new binary op.
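The gain from comparing a whole stick at once can be seen in a small C sketch. The helper name offset_strn_eq below is hypothetical and merely models the sized form of the offset_str_eq op with strncmp; it is not the runtime macro. The second function shows the per-code-unit comparison it replaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the sized op: do the `n` code units of `s`
   starting at `offset` equal the first `n` code units of `stick`?
   Assumes `s` has at least `offset` code units before its terminator. */
bool offset_strn_eq(const char *s, size_t offset, const char *stick, size_t n)
{
    assert(s != NULL && stick != NULL);
    return strncmp(s + offset, stick, n) == 0;
}

/* What the generated code would otherwise do: one test per code unit.
   Assumes `stick` contains no NUL among its first `n` code units. */
bool stick_eq_per_unit(const char *s, size_t offset, const char *stick,
    size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (s[offset + i] != stick[i]) {
            return false;
        }
    }
    return true;
}
```

Both functions give the same answer; the first expresses the test as a single operation that the C library can implement efficiently.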
compiler/ml_switch_gen.m:
Insist on both the host machine and the target machine
using the C backend.
compiler/string_switch.m:
Implement non-lookup trie switches. The code follows the approach used
in ml_string_switch.m as much as possible, but there are plenty of
differences caused by targeting the LLDS.
Rename some predicates to specify which switch implementation method
they belong to.
Write a comment just once, and refer to it from elsewhere instead of
duplicating it at each reference site.
compiler/switch_gen.m:
Enable the use of trie switches when the option values call for it,
and when the switch is not a lookup switch.
compiler/cse_detection.m:
Do not flood the output of mmc -V with messages that have nothing to do
with the module being compiled.
compiler/options.m:
Add a way to specify --no-allow-inlining on the command line.
This can help debug code generator changes like this, by disallowing
a transform that can modify the Mercury code whose compilation process
you are trying to debug. (The documentation of the --inlining option
implies that --no-inlining should do the same job, but it does not.)
The option is not documented for users.
compiler/string_encoding.m:
Provide a version of from_code_unit_list_in_encoding that allows
non-well-formed code unit sequences as input, and provide det versions
of both versions. This is for use by both string_switch.m and
ml_string_switch.m.
compiler/hlds_goal.m:
Document the properties of case_ids.
compiler/llds.m:
Document the possibility that string constants are not well formed.
compiler/bytecode.m:
compiler/code_util.m:
compiler/mlds_dump.m:
compiler/ml_global_data.m:
compiler/mlds_to_cs_data.m:
compiler/mlds_to_java_data.m:
compiler/opt_debug.m:
Conform to the changes above.
library/string.m:
Replace the non-exported test predicate internal_encoding_is_utf8 with
an exported function that returns an enum specifying the string encoding.
NEWS.md:
Announce the new function.
runtime/mercury_string.h:
Add the C macro that implements the new form of the offset_str_eq
binary op.
tests/hard_coded/string_switch4.{m,exp}:
We have long had three copies of the exact same code, in string_switch.m,
string_switch2.m and string_switch3.m, which were compiled with
- no smart switch implementation
- smart switch implementation forced to use the hash table method
- smart switch implementation forced to use binary search method
Add this new copy, which is compiled with
- smart switch implementation forced to use the new trie method
tests/hard_coded/Mmakefile:
Add the new test case.
tests/hard_coded/Mercury.options:
Update the options of the test cases, and specify them for the new one.
tests/hard_coded/string_switch.m:
tests/hard_coded/string_switch2.m:
tests/hard_coded/string_switch3.m:
Update the top-of-module comment block to be identical in all four copies
of this module.
|
||
|
|
d5190e93c5 |
Fix LLDS/MLDS diffs in control of string switches.
Some of these diffs involve string lookup switches.
compiler/ml_lookup_switch.m:
When testing whether a switch is a lookup switch for the MLDS,
we sometimes need to update the code generator's state. We used to
return the updated state whether or not the switch is a lookup switch,
which is incorrect if the switch is NOT a lookup switch.
(The incorrectness used to show up as allocated but unused entities,
such as slots in the global const data table, which are harmless enough
not to lead to crashes.)
Fix this by putting the updated code generator state as a new argument
into the function symbol that we return only when the switch *is*
a lookup switch.
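The shape of this fix can be sketched in C. The types and names are hypothetical, and the "state" is reduced to a single slot counter; the point is only that the updated state travels inside the positive result, so a caller cannot pick it up by accident when the test fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal code generator state. */
typedef struct { int next_const_slot; } gen_state;

/* Result of the test; `state` is meaningful only when is_lookup is true. */
typedef struct {
    bool      is_lookup;
    gen_state state;
} lookup_test;

/* Test whether every arm is a constant. Slots are allocated on a scratch
   copy of the state, which is returned only if the test succeeds. */
lookup_test test_lookup_switch(gen_state in, const bool *arm_is_const,
    size_t num_arms)
{
    assert(arm_is_const != NULL || num_arms == 0);
    gen_state work = in;
    for (size_t i = 0; i < num_arms; i++) {
        if (!arm_is_const[i]) {
            lookup_test no = { false, { 0 } };
            return no;
        }
        work.next_const_slot++;
    }
    lookup_test yes = { true, work };
    return yes;
}

/* The caller keeps its own state unless the switch is a lookup switch. */
gen_state state_after_test(gen_state before, lookup_test t)
{
    return t.is_lookup ? t.state : before;
}
```

With the old arrangement, the slots allocated before the failing arm was found would have stayed allocated but unused.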
compiler/lookup_switch.m:
The predicate we used to test whether a switch is a lookup switch
for the LLDS used to be semidet, so it did not have the above problem.
It did have the problem that calls to it returned only the info
appropriate for success; they did not return an indication about
*whether* they succeed in a way that could be stored. This meant
every method of implementing string switches had to repeat the call.
Fix this by changing the predicate to return a success/failure indication,
making it det. Make it use the new technique in ml_lookup_switch.m
to avoid using inappropriate code generator states. In this case,
that also means moving the code that remembers the branch start position
of the code generator state here from our caller. This move allows us
to delete a reset to the just-remembered position, which was never needed.
compiler/ml_string_switch.m:
compiler/string_switch.m:
Conform to the changes in ml_lookup_switch.m/lookup_switch.m.
compiler/ml_switch_gen.m:
Move producers of variables to just the code branches that need
the value of that variable.
Conform to the changes in ml_lookup_switch.m.
compiler/switch_gen.m:
Move the code that decides how to implement a smart switch
on a string value to a predicate of its own, to match
ml_switch_gen.m. Change the structure of the moved code
to follow the structure in ml_switch_gen.m.
Conform to the changes in ml_lookup_switch.m.
|
||
|
|
d385cdca37 |
Replace reversed lists with cords.
Add ml_ prefixes to some predicate names. Put a piece of code into a predicate of its own, to prevent it from distracting readers of the original predicate with its not-very-important detail. |
||
|
|
0ed6e2d0d4 |
Prepare for string trie switches in the LLDS.
compiler/ml_string_switch.m:
compiler/string_switch_util.m:
Move the backend-agnostic part of the existing MLDS implementation
of string trie switches from ml_string_switch.m to string_switch_util.m.
Clean it up a bit for more general use.
compiler/string_encoding.m:
Document the exported predicates and functions.
|
||
|
|
a6d81a3bb9 |
Carve three new modules out of switch_util.m.
compiler/lookup_switch_util.m:
compiler/string_switch_util.m:
compiler/tag_switch_util.m:
Carve these three new modules out of switch_util.m. As their names imply,
they contain the parts of the old switch_util.m that are concerned with
lookup switches, switches on strings, and switches on tags respectively.
compiler/switch_util.m:
Delete the code moved to the new modules.
compiler/backend_libs.m:
Include the new modules in the backend_libs package.
compiler/notes/compiler_design.html:
Document the new modules.
compiler/dense_switch.m:
compiler/lookup_switch.m:
compiler/ml_lookup_switch.m:
compiler/ml_simplify_switch.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
compiler/simplify_goal_switch.m:
compiler/string_switch.m:
compiler/switch_case.m:
compiler/switch_gen.m:
compiler/tag_switch.m:
Conform to the changes above by importing one, or sometimes two, of
the new modules, usually instead of switch_util.m, sometimes
in addition to switch_util.m.
In a few cases, delete explicit module qualifications that
this diff has made incorrect.
|
||
|
|
f765494ec9 |
Fix spelling.
compiler/ml_string_switch.m:
As above.
|
||
|
|
c138bbb632 |
Fix a bug in string trie jump switches.
compiler/ml_string_switch.m:
Fix a too-strong sanity check. It insisted on a semidet switch
containing code to handle the failure of the switch, but a switch
on strings can be both semidet and cannot_fail if
- the switched-on variable's inst is known to contain only the strings
handled by the arms of the switch, and
- one or more of the switch arms contain semidet code.
In that case, the switch does not need a default case, since it would be
unreachable.
compiler/options.m:
Provide a way to test for the presence of this fix.
|
||
|
|
72e0014003 | Rename more predicates to avoid ambiguities. | ||
|
|
9012395ec2 |
Don't let ml_tag_switch.m generate duplicate fields.
This fixes the second problem identified by Mantis bug #548.
compiler/ml_tag_switch.m:
Detect the circumstances in which this problem would arise.
In such cases, simply fail, and let ml_switch_gen.m fall back
to implementing the switch as an if-then-else chain.
compiler/ml_switch_gen.m:
Implement that fallback.
compiler/switch_util.m:
The new code in ml_tag_switch.m needs to thread a fourth piece of state
through the predicate it passes to group_cases_by_ptag, so change
its argument list to accommodate such predicates. And since some other
modules pass the same predicates to group_cases_by_ptag and
string_binary_cases, make the same change in the argument list
of that predicate as well.
Delete one stray comment, and note that another comment seems misplaced.
compiler/ml_string_switch.m:
compiler/string_switch.m:
compiler/switch_case.m:
compiler/tag_switch.m:
Conform to the changes in switch_util.m.
tests/hard_coded/bug548.exp:
tests/hard_coded/Mmakefile:
Enable the previously-added test case for Mantis #548, after adding
an .exp file for it.
|
||
|
|
9ddb180757 |
Handle const_var_maps left by add_trail_ops.m.
This fixes Mantis bug #544.
The code of add_trail_ops.m can transform
<code that adds an entry to const_var_map>
into
(
...
<code that adds an entry to const_var_map>
...
;
...,
fail
)
where the const_var_map in the MLDS code generator records which
variables' values are available as ground terms.
The MLDS code generator used to reset the const_var_map in its main
data structure, the ml_gen_info, at the end of every disjunction
(actually, at the end of every branched control structure) to the value
it had at the start. This was intended to prevent the code following
the branched structure from relying on const_var_map entries that were
added to the const_var_map on *some* branches, but not others.
However, in this case, it has the effect of forgetting the entry added
by the first disjunct, even though
- the code after the disjunction can be reached *only* via the first
disjunct, and
- the code after the disjunction (legitimately, until add_trail_ops)
depended on that entry being available.
The fix is to allow the code after a branched control structure
to depend on any const_var_map entry that is present in the final
const_var_map in every branch of the branched control structure
whose end is reachable.
The LLDS code generator was not affected by the bug, because it uses
totally separate systems both for implementing trailing, and for
keeping track of what variables' values are available statically.
In particular, it does not rely on the operations inserted, or on the
annotations left on unifications, by the add_trail_ops and
mark_static_terms passes, having been written long before either
module existed.
compiler/hlds_goal.m:
Document the update above to what may be marked static.
compiler/ml_gen_info.m:
Document the updated protocol for handling the const_var_map field.
Use a named type instead of its expansion.
compiler/ml_code_gen.m:
Make the predicates that generate code for a branch in a branched
control structure return the final const_var_maps from the branches
whose endpoints are reachable. Add a predicate that computes the
consensus of all the gathered const_var_maps.
Compute consensus const_var_maps for if-then-elses and negations.
Fix some inconsistencies in variable naming.
Simplify some code.
compiler/ml_disj_gen.m:
Compute consensus const_var_maps for disjunctions.
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
Compute consensus const_var_maps for various kinds of switches.
In some predicates, put related arguments next to each other.
compiler/ml_unify_gen_construct.m:
Delete "dynamic" from the names of several predicates that also handled
non-dynamic construction unifications.
Fix an out-of-date comment.
compiler/mark_static_terms.m:
Fix grammar in a comment.
library/map.m:
Fix a careless bug: when doing a merge in map.common_subset_loop,
we threw away an entry from the wrong list in one of three cases.
Make such bugs harder to overlook by
- deleting the common parts from variable names, leaving the differences
easier to see, and
- replacing numeric suffixes for completely separate data structures
with A and B suffixes.
tests/valid/bug544.m:
A new test case for the bug.
tests/valid/Mercury.options:
tests/valid/Mmakefile:
Enable the new test case, and run it with -O5.
|
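The consensus rule in this fix can be sketched in C. The representation is an assumption made for the example (a fixed-size array indexed by variable number, with ABSENT marking a variable that has no known ground term), as are all the names; the compiler itself uses Mercury maps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_VARS 4
#define ABSENT   (-1)

/* Hypothetical summary of one branch of a branched control structure. */
typedef struct {
    bool end_reachable;           /* false if the branch ends in e.g. fail */
    int  const_of_var[NUM_VARS];  /* ABSENT, or an id for the ground term */
} branch_result;

/* Keep an entry only if every branch whose end is reachable has it,
   with the same value. Branches with unreachable ends are ignored.
   If no branch end is reachable, every entry comes out as ABSENT. */
void consensus(const branch_result *branches, size_t n, int out[NUM_VARS])
{
    assert(branches != NULL || n == 0);
    for (size_t v = 0; v < NUM_VARS; v++) {
        bool seen = false;
        int agreed = ABSENT;
        for (size_t b = 0; b < n; b++) {
            if (!branches[b].end_reachable) {
                continue;
            }
            int c = branches[b].const_of_var[v];
            if (!seen) {
                agreed = c;
                seen = true;
            } else if (c != agreed) {
                agreed = ABSENT;
            }
        }
        out[v] = agreed;
    }
}
```

In the bug544 situation, the second disjunct ends in fail, so only the first disjunct contributes, and its entry survives to the code after the disjunction.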
||
|
|
95f59cf7c9 |
Fix lookup switches on subtype enums.
compiler/switch_util.m:
Rename dont_need_bit_vec_check variant of need_bit_vec_check to
dont_need_bit_vec_check_no_gaps.
Add dont_need_bit_vec_check_with_gaps (see below).
Make type_range return the correct min and max values used by a
subtype enum type. For now, it fails unless the range of values
is contiguous.
Make find_int_lookup_switch_params use the min and max values for a
type returned by type_range, not assuming 0 to the max value.
Make find_int_lookup_switch_params return
dont_need_bit_vec_check_with_gaps when a bit vector check is not
required before a table lookup, yet the table is expected to contain
dummy rows. This is the case for a cannot_fail switch on a subtype
enum type, where the subtype does not use some values between
the min and max values.
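A C sketch may help picture the two situations. The subtype, its values, and the table contents are invented for the example. The table is indexed by the value minus the minimum rather than by the value itself, and it has a dummy row for the value the subtype does not use; a cannot_fail switch can skip the presence check, while a switch that can fail cannot.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subtype enum using only values 2, 3 and 5 of its base type. */
enum { MIN_VAL = 2, MAX_VAL = 5, NUM_ROWS = MAX_VAL - MIN_VAL + 1 };

/* Row i holds the result for value MIN_VAL + i; value 4 gets a dummy row. */
static const int  table[NUM_ROWS]   = { 20, 30, 0 /* dummy row */, 50 };
static const bool present[NUM_ROWS] = { true, true, false, true };

/* cannot_fail switch: the inst guarantees v is 2, 3 or 5, so no check
   is needed even though the table has a gap. */
int lookup_no_check(int v)
{
    assert(v >= MIN_VAL && v <= MAX_VAL);
    return table[v - MIN_VAL];
}

/* Switch that can fail: range check, then the analogue of the bit vector
   check, and only then the table lookup. */
bool lookup_checked(int v, int *out)
{
    if (v < MIN_VAL || v > MAX_VAL || !present[v - MIN_VAL]) {
        return false;
    }
    *out = table[v - MIN_VAL];
    return true;
}
```

Assuming the range starts at 0, as the old code did, would index the wrong rows for a subtype whose smallest value is not 0.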
compiler/dense_switch.m:
Make tagged_case_list_is_dense_switch use the min and max values for
a type returned by type_range, not assuming 0 to the max value.
compiler/ml_lookup_switch.m:
Expect the generated lookup table to contain dummy rows or not
depending on dont_need_bit_vec_check_{with_gaps,no_gaps}.
Conform to change to need_bit_vec_check.
compiler/lookup_switch.m:
compiler/ml_string_switch.m:
Conform to change to need_bit_vec_check.
tests/hard_coded/Mmakefile:
tests/hard_coded/dense_lookup_switch4.exp:
tests/hard_coded/dense_lookup_switch4.m:
tests/hard_coded/dense_lookup_switch_non2.exp:
tests/hard_coded/dense_lookup_switch_non2.m:
Add test cases.
|
||
|
|
b66f45e4db |
Tighten the mlds_type type.
compiler/mlds.m:
Make two changes to mlds_type.
The simpler change is the deletion of the maybe(foreign_type_assertions)
field from the MLDS representations of Mercury types. It was never used,
because Mercury types that are defined in a foreign language that is
acceptable for the current MLDS target platform are represented
as mlds_foreign_type, not as mercury_type.
The more involved change is to change the representation of builtin types.
Until now, we had separate function symbols in mlds_type to represent
ints, uints, floats and chars, but not strings or values of the sized
types {int,uint}{8,16,32,64}; those had to be represented as Mercury types.
This is an unnecessary inconsistency. It also had two allowed
representations for ints, uints, floats and chars, which meant that
some of the code handling those conceptual types had to be duplicated
to handle both representations.
This diff provides mlds_builtin_type_{int(_),float,string,char} function
symbols to represent every builtin type, and changes mercury_type
to mercury_nb_type to make clear that it is NOT to be used for builtins
(the nb is short for "not builtin").
compiler/ml_code_util.m:
compiler/ml_util.m:
Delete functions that used to construct MLDS representations of builtin
types. The new representation of those types is so simple that using
such functions is no less cumbersome than writing down the representations
directly.
compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_global_data.m:
compiler/ml_lookup_switch.m:
compiler/ml_proc_gen.m:
compiler/ml_rename_classes.m:
compiler/ml_simplify_switch.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen_construct.m:
compiler/ml_unify_gen_deconstruct.m:
compiler/ml_unify_gen_util.m:
compiler/mlds_dump.m:
compiler/mlds_to_c_data.m:
compiler/mlds_to_c_export.m:
compiler/mlds_to_c_func.m:
compiler/mlds_to_c_global.m:
compiler/mlds_to_c_stmt.m:
compiler/mlds_to_c_type.m:
compiler/mlds_to_cs_data.m:
compiler/mlds_to_cs_stmt.m:
compiler/mlds_to_cs_type.m:
compiler/mlds_to_java_data.m:
compiler/mlds_to_java_stmt.m:
compiler/mlds_to_java_type.m:
compiler/mlds_to_java_wrap.m:
compiler/rtti_to_mlds.m:
Conform to the changes above.
|
||
|
|
6a915eef05 |
Optimize field updates inside packed arg words.
Since June, we have been copying words containing packed-together
sub-word-sized arguments all in one piece if possible, for hlc grades.
This means that given a type such as
:- type t
---> f1(int8, bool, int8, int, bool, int8, bool).
whose first three and last three arguments are packed into one word each,
and a predicate such as
p(T0, T) :-
T0 = f1(A, B, C, _, E, F, G),
D = 42,
T = f1(A, B, C, D, E, F, G).
we generated code such as
MR_Integer D_12 = (MR_Integer) 42;
MR_Unsigned packed_args_0 =
(MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 0)));
MR_Unsigned packed_args_1 =
(MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 2)));
base = (MR_Word) MR_new_object(MR_Word,
((MR_Integer) 3 * sizeof(MR_Word)), NULL, NULL);
*T_4 = base;
MR_hl_field(MR_mktag(0), base, 0) = (MR_Box) (packed_args_0);
MR_hl_field(MR_mktag(0), base, 1) = ((MR_Box) (D_12));
MR_hl_field(MR_mktag(0), base, 2) = (MR_Box) (packed_args_1);
which does NOT pick up the values A, B, C, E, F and G individually.
However, until now, we could reuse packed-together words only in their
unchanged form.
This diff lifts that limitation, which means that now, we can *also*
optimize code such as
p(T0, T) :-
T0 = f1(A, B, _, D, E, _, G),
C = 42i8,
F = 43i8,
T = f1(A, B, C, D, E, F, G).
by generating code like this:
base = (MR_Word) MR_new_object(MR_Word,
(3 * sizeof(MR_Word)), NULL, NULL);
*T_4 = base;
MR_hl_field(MR_mktag(0), base, 0) = (MR_Box)
((((packed_word_0 & (~((MR_Unsigned) 255U)))) |
(MR_Unsigned) ((uint8_t) (C_12))));
MR_hl_field(MR_mktag(0), base, 1) = ((MR_Box) (D_8));
MR_hl_field(MR_mktag(0), base, 2) = (MR_Box)
((((packed_word_1 & (~((MR_Unsigned) 510U)))) |
(((MR_Unsigned) ((uint8_t) (F_13)) << 1))));
The general scheme when reusing *part* of a word is: first set the bits
not being reused to zero, and then OR in new values of those bits.
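The clear-then-OR scheme can be written as a small C helper. This is a sketch with a hypothetical function name, not the code the compiler emits; it computes the mask from a shift and a width, where the generated code above uses precomputed constants such as 255U and 510U.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Replace the `num_bits`-wide bitfield starting at bit `shift` in `word`
   with `value`, leaving every other bit of `word` unchanged. */
uintptr_t replace_bitfield(uintptr_t word, unsigned shift, unsigned num_bits,
    uintptr_t value)
{
    const unsigned word_bits = (unsigned) (sizeof(uintptr_t) * CHAR_BIT);
    assert(num_bits > 0 && num_bits < word_bits);
    assert(shift + num_bits <= word_bits);

    uintptr_t mask = (((uintptr_t) 1 << num_bits) - 1) << shift;
    /* Step 1: zero the bits not being reused. Step 2: OR in the new value. */
    return (word & ~mask) | ((value << shift) & mask);
}
```

For an 8-bit field at shift 0 the mask is 255, and at shift 1 it is 510, which matches the constants in the generated code shown above.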
Make this optimization as general as possible by making it work
not just for
- words in memory cells containing only arguments,
but also for
- words in memory cells containing a remote sectag as well as arguments, and
- words in registers containing a ptag and a local sectag as well as
arguments.
compiler/ml_gen_info.m:
Generalize the data structure we use to represent information about
packed words to make possible approximate as well as exact lookups.
The key in the old map was "these bitfields with the values of these
variables in them", while the key in the new map is just "these bitfields",
with the associated value being a list, each element of which says
"the word with these values in those bitfields is available in this rval".
This makes it possible to look for matching words that have some, but not
all, of the right values in the bitfields.
Since the packed words may now contain tags as well as arguments,
rename "packed args" to "packed word".
compiler/ml_unify_gen_deconstruct.m:
When deconstructing a term containing packed words, add them to the
packed word map even when one of the bitfields inside the packed word
contains tag information.
Move the code that adds a packed word to the map into a separate predicate,
now that it is needed from more than one place.
compiler/ml_unify_gen_construct.m:
Change the code that handles packed words to work in terms of filled
bitfields. Use this not only to implement the optimization described
at the top, but also to make the handling of bitfields more systematic.
At least one previous bug was caused by doing sign extension differently
for the bitfield containing the first packed argument in a word than for
the later packed arguments in that word; with the new design, such
inconsistencies should not happen.
compiler/ml_unify_gen_util.m:
Add utility predicates now needed for both construct and deconstruct
unifications.
compiler/mlds.m:
Document the new use of lvnc_packed_word (renamed from lvnc_packed_args).
compiler/ml_code_gen.m:
compiler/ml_code_util.m:
compiler/ml_commit_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
Conform to the changes above (mostly the packed_word rename).
compiler/mlds_to_c_data.m:
compiler/mlds_to_c_stmt.m:
Omit unneeded casts from the output. Specifically, don't put (MR_Integer)
casts in front of integer constants being used either as shift amounts,
or as the number of words that a new_object MLDS operation should allocate.
The casts only cluttered the output, making it harder to read, and
therefore to judge its correctness.
|
||
|
|
b06b2621b3 |
Move towards packing args with secondary tags.
compiler/hlds_data.m:
Add bespoke types to record information about local and remote secondary
tags. The one for local secondary tags includes the value of the
primary and secondary tag together, since construct unifications
need to assign this value, and it is better to compute this once,
instead of leaving the target language compiler to do it, potentially
many times.
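The idea of wrapping tag values and precomputing the combined value can be sketched in C. The bit layout used here (primary tag in the low 3 bits, secondary tag shifted above it) is an assumption made for illustration, as are all the type and function names; the actual layout is decided by the compiler and runtime.

```c
#include <assert.h>
#include <stdint.h>

#define PTAG_BITS 3u  /* assumption: primary tags fit in 3 bits (0..7) */

/* Wrapper structs keep tag values from being mixed up with other integers. */
typedef struct { uint8_t  value; } ptag;
typedef struct { unsigned value; } local_sectag;

/* Record both tags and their combined value, computed once up front. */
typedef struct {
    ptag         primary;
    local_sectag secondary;
    unsigned     combined;
} local_sectag_info;

local_sectag_info make_local_sectag_info(ptag p, local_sectag s)
{
    assert(p.value < (1u << PTAG_BITS));
    local_sectag_info info = { p, s, (s.value << PTAG_BITS) | p.value };
    return info;
}
```

A construct unification can then assign info.combined directly, instead of emitting the shift and OR for the target language compiler to evaluate each time.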
Use a wrapped uint8 to record primary tag values, and wrapped uints
to record secondary tag values. The wrap is to prevent any accidental
confusion with other values. The use of uint8 and uint has two purposes.
First, using the tightest possible representation. Tags are never negative,
and primary tags cannot exceed 7. Second, using these types in the compiler
helps us eat our own dogfood; if a change causes a problem affecting
these types, its bootcheck should fail, alerting us to the problem.
Add commented-out types and fields that will be needed for packing
sub-word-sized arguments together with both local and remote secondary
tags.
compiler/du_type_layout.m:
Generate references to tags in the new format.
compiler/ml_unify_gen.m:
compiler/unify_gen.m:
compiler/modecheck_goal.m:
Conform to the changes above.
Fix an old bug: the inst corresponding to a constant with a primary
and a local secondary tag is not the secondary tag alone, but both tags
together.
compiler/bytecode.m:
compiler/bytecode_gen.m:
compiler/closure_gen.m:
compiler/disj_gen.m:
compiler/export.m:
compiler/hlds_code_util.m:
compiler/jumpopt.m:
compiler/lco.m:
compiler/llds_out_data.m:
compiler/llds_out_instr.m:
compiler/lookup_switch.m:
compiler/lookup_util.m:
compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_code_util.m:
compiler/ml_elim_nested.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
compiler/ml_type_gen.m:
compiler/mlds_dump.m:
compiler/mlds_to_c_data.m:
compiler/mlds_to_c_stmt.m:
compiler/opt_debug.m:
compiler/peephole.m:
compiler/rtti.m:
compiler/rtti_out.m:
compiler/rtti_to_mlds.m:
compiler/string_switch.m:
compiler/switch_util.m:
compiler/tag_switch.m:
compiler/type_ctor_info.m:
Conform to the change to hlds_data.m.
In two places, in rtti_out.m and rtti_to_mlds.m, delete old code
that was needed only to implement reserved tags, which we
stopped supporting a few months ago.
library/uint8.m:
library/uint16.m:
library/uint32.m:
library/uint64.m:
Add predicates to cast from each of these types to uint.
|
||
|
|
ec6a40ed85 |
Put related args of ml_field next to each other.
compiler/mlds.m:
Put the *type* of the pointer next to the *value* of the pointer.
compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_code_util.m:
compiler/ml_elim_nested.m:
compiler/ml_optimize.m:
compiler/ml_rename_classes.m:
compiler/ml_string_switch.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/ml_unused_assign.m:
compiler/ml_util.m:
compiler/mlds_dump.m:
compiler/mlds_to_c_data.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_target_util.m:
compiler/rtti_to_mlds.m:
Conform to the change above.
|
||
|
|
bbe0f28f3b |
Copy packed arguments all at once.
Copy words containing packed-together sub-word-sized arguments all
in one piece if possible, for hlc grades.
Given a type such as
:- type t
---> f1(int8, bool, int8, int, bool, int8, bool).
whose first three and last three arguments are packed into one word each,
and a predicate such as
p(T0, T) :-
T0 = f1(A, B, C, _, E, F, G),
D = 42,
T = f1(A, B, C, D, E, F, G).
we used to generate code that picked up each of the six named arguments
from T0, and used them to construct T. With this diff, we now translate
the above to
MR_Integer D_12 = (MR_Integer) 42;
MR_Unsigned packed_args_0 =
(MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 0)));
MR_Unsigned packed_args_1 =
(MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 2)));
base = (MR_Word) MR_new_object(MR_Word,
((MR_Integer) 3 * sizeof(MR_Word)), NULL, NULL);
*T_4 = base;
MR_hl_field(MR_mktag(0), base, 0) = (MR_Box) (packed_args_0);
MR_hl_field(MR_mktag(0), base, 1) = ((MR_Box) (D_12));
MR_hl_field(MR_mktag(0), base, 2) = (MR_Box) (packed_args_1);
compiler/ml_unify_gen.m:
Implement the two main parts of this optimization.
Part one is the change to deconstruction unifications. When we generate
assignments from all the fields packed together into a word to their
corresponding argument variables (such as A/B/C or E/F/G above),
create a fresh variable (such as packed_args_0 above), assign to it
the value of the whole word, and record in a new data structure (the
packed_args_map) that these argument variables, in these positions
within the word, are now available in the newly created variable.
(We still define the argument variables as well, since they may be needed;
deleting them if they are *not* needed is the job of ml_unused_assign.m.)
Part two is the change to construction unifications. When we generate code
to OR together the shifted and/or masked values of two or more variables
to fill in one word in a new heap cell, we search the packed_args_map
to see whether those variables, in the positions we need, are available
in one of the variables created in part one. If yes, we discard
the whole OR-ing together operation and we use that variable instead.
Since part one can now create local variable definitions, return these
upwards as needed.
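Why the whole-word copy is safe can be checked with a small C sketch. The bit layout below (C in bits 0..7, B in bit 8, A in bits 9..16) is an assumption chosen for the example, and the function names are hypothetical. Unpacking the three fields and OR-ing them back together yields exactly the original word, so reusing the word skips that work without changing the result.

```c
#include <assert.h>
#include <stdint.h>

/* Pack A (int8), B (bool) and C (int8) into one word by shifting and OR-ing.
   Assumed layout: C in bits 0..7, B in bit 8, A in bits 9..16. */
uintptr_t pack_abc(int8_t a, int b, int8_t c)
{
    assert(b == 0 || b == 1);
    return ((uintptr_t) (uint8_t) a << 9)
        | ((uintptr_t) b << 8)
        | (uintptr_t) (uint8_t) c;
}

/* Field extraction, as a deconstruction unification would do it.
   The uint8_t to int8_t casts assume the usual two's complement wraparound. */
int8_t unpack_a(uintptr_t w) { return (int8_t) (uint8_t) ((w >> 9) & 0xFF); }
int    unpack_b(uintptr_t w) { return (int) ((w >> 8) & 1); }
int8_t unpack_c(uintptr_t w) { return (int8_t) (uint8_t) (w & 0xFF); }
```

Given a word w taken from the old cell, pack_abc(unpack_a(w), unpack_b(w), unpack_c(w)) equals w whenever the bits of w outside the three fields are zero, which is the identity the optimization relies on.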
compiler/ml_gen_info.m:
Add two fields to the ml_gen_info structure (actually, to one of its
substructures). One is the packed_args_map described above, the other
is a counter we use to give a unique name to all the fresh variables.
When creating ml_gen_infos, put the code defining each field of a
substructure next to the creation of that substructure.
compiler/mlds.m:
Add a kind of compiler-generated variable holding packed argument words.
It is used in part one above.
compiler/ml_code_gen.m:
compiler/ml_code_util.m:
compiler/ml_commit_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
Save, reset and restore the packed_args_map as necessary to ensure that
a construction unification sees an entry in that map only if the
deconstruction unification that created that entry *had* to be executed
before execution reaches the construction unification.
This means that when we process a branched control structure, we have to
make sure that (a) entries created by one branch are not seen when
we generate code for the other branches, and (b) that code *after* the
branched control structure sees only the entries created *before* the
branched control structure, since the code that follows cannot use an entry
that was created by a branch that may or may NOT have been executed
on the way there.
We also reset the packed_args_map to empty when generating code
that will end up inside a nested function, for two reasons. First,
I am not sure whether the code in ml_elim_nested.m that flattens out
nested functions is general enough to handle the new kind of compiler
generated variable correctly. And second, even if it is, the additional
memory traffic for putting those variables into environments, and later
pulling them out again, would definitely reduce and maybe completely
eliminate the speedup from optimizing constructions.
compiler/ml_closure_gen.m:
Conform to the change in ml_unify_gen.m.
compiler/ml_proc_gen.m:
Invoke ml_unused_assign.m in both branches of an if-then-else.
Previously, it was invoked in only the rarely executed branch,
which is what hid its bugs.
Fix one bug: for model_semi procedures, include the succeeded variable
in the set of variables whose values are needed after the generated
function body.
Work around another bug: ml_unused_assign.m cannot yet handle
nested functions properly, so throw away its output in their presence.
compiler/ml_unused_assign.m:
As part of the same workaround, if a block contains nested functions,
tell ml_proc_gen.m to use the original code.
Fix several other bugs.
Don't delete variables from the seen_set when the backwards traversal
finds an assignment to them, because the variable's absence from
the seen_set would lead to the declaration of the variable being deleted.
Delete a sanity check that made sense only in the presence of such deletions.
Never delete assignments to compiler-generated variables; we generate
such assignments only when their results *will* be needed.
When exiting the traversal of a block, *do* delete the variables
declared locally in that block from the seen_set; being undeclared there,
they cannot possibly be seen before that block. Leaving them in
does not compromise correctness, but does reduce performance
by making operations on the seen_set slower than necessary.
If deleting unused assignments makes the else part of an if-then-else
empty, then delete the whole else part.
compiler/mlds_to_c_stmt.m:
Generate a valid C statement even for an MLDS comment. When a buggy
version of ml_unused_assign.m (incorrectly) deleted assignments to
succeeded, it sometimes left an else part containing only a comment,
which led gcc to report syntax errors.
|
||
|
|
b9afc8b78e |
Delete the mlds_unary_op type.
compiler/mlds.m:
We used to have a function symbol ml_unop in the mlds_rval type
that applied one of four kinds of operations to an argument mlds_rval:
boxing, unboxing, casting or a standard unary operation, with a value
of type mlds_unary_op selecting between the four. Replace this system
with four separate function symbols in the mlds_rval type directly,
and delete the mlds_unary_op type.
The new arrangement requires fewer memory cells to be allocated,
and less indirection; it also leads to shorter and somewhat
more readable code.
compiler/ml_optimize.m:
Conform to the change above.
Recognize that a cast has negligible cost.
compiler/ml_code_util.m:
Conform to the change above.
Keep private a predicate that is not used by any other module,
after merging it with another previously-exported predicate
that only *it* uses.
Delete some other predicates that are not used anywhere.
compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_code_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_global_data.m:
compiler/ml_lookup_switch.m:
compiler/ml_rename_classes.m:
compiler/ml_string_switch.m:
compiler/ml_tag_switch.m:
compiler/ml_unify_gen.m:
compiler/ml_unused_assign.m:
compiler/ml_util.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_target_util.m:
compiler/rtti_to_mlds.m:
Conform to the change above.
|
||
|
|
fcefbb948d |
Delete assignments to dead variables in the MLDS.
At the moment, we tend not to generate such assignments, with the exception
of assignments to the MLDS versions of HLDS variables of dummy types.
The reason I am nevertheless adding this optimization is that I intend
to soon add code to ml_unify_gen.m that *will* generate assignments
to dead variables.
The idea is to optimize field updates involving packed arguments.
Given a type such as
:- type t
---> f(
f1 :: bool,
f2 :: bool,
f3 :: enum1,
f4 :: int
).
we currently implement a field update such as "T = T0 ^ f4 := 42",
whose HLDS representation is the two unifications
T0 = f(T0f1, T0f2, T0f3, _),
T = f(T0f1, T0f2, T0f3, 42)
using code that looks like this:
T0f1 = (T0[0] >> ...) & ...
T0f2 = (T0[0] >> ...) & ...
T0f3 = (T0[0] >> ...) & ...
T = allocate memory for new memory cell, put on primary tag
T[0] = (T0f1 << ...) | (T0f2 << ...) | (T0f3 << ...)
T[1] = 42
I want to implement it using code that looks like this:
T0w0 = T0[0]
T = allocate memory for new memory cell, put on primary tag
T[0] = T0w0
T[1] = 42
where T0w0 contains the entire first word of the memory cell of T0.
This code avoids a bunch of shifts, ORs and ANDs.
I propose to translate the T0 = f(T0f1, T0f2, T0f3, _) unification into
T0w0 = T0[0]
T0f1 = (T0[0] >> ...) & ...
T0f2 = (T0[0] >> ...) & ...
T0f3 = (T0[0] >> ...) & ...
while recording in the ml_gen_info/code_info that this *specific* packing of
T0f1, T0f2 and T0f3 is available in T0w0. When translating the following
unification, the code generator will see this, and this will allow it to
generate
T[0] = T0w0
instead of
T[0] = (T0f1 << ...) | (T0f2 << ...) | (T0f3 << ...)
However, by this time the assignments to T0f1, T0f2 and T0f3 have already
been generated. Whether or not they are dead assignments depends on whether
other code needs the values of those fields of T0. Deciding this
requires knowledge that the code generator can't have when translating
the deconstruction of T0. Hence the need for a new MLDS-to-MLDS optimization.
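The two code shapes above can be compared in a small C sketch. The bit layout (f1 in bit 0, f2 in bit 1, f3 in bits 2-3), the function names and the use of malloc are all assumptions made for illustration; the real layout is chosen by the compiler and allocation goes through the Mercury runtime.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical layout for type t: word 0 packs f1 (bit 0), f2 (bit 1)
 * and f3 (bits 2-3); word 1 holds f4. The shifts and masks are assumptions.
 */
enum { F1_SHIFT = 0, F2_SHIFT = 1, F3_SHIFT = 2 };
enum { F1_MASK = 0x1, F2_MASK = 0x1, F3_MASK = 0x3 };

/* Old scheme: unpack each field of T0, then repack them into T. */
uintptr_t *update_f4_repack(const uintptr_t *t0, uintptr_t new_f4)
{
    uintptr_t t0f1 = (t0[0] >> F1_SHIFT) & F1_MASK;
    uintptr_t t0f2 = (t0[0] >> F2_SHIFT) & F2_MASK;
    uintptr_t t0f3 = (t0[0] >> F3_SHIFT) & F3_MASK;
    uintptr_t *t = malloc(2 * sizeof *t);
    if (t == NULL) {
        return NULL;
    }
    t[0] = (t0f1 << F1_SHIFT) | (t0f2 << F2_SHIFT) | (t0f3 << F3_SHIFT);
    t[1] = new_f4;
    return t;
}

/* Proposed scheme: copy the whole packed word, with no shifts or masks. */
uintptr_t *update_f4_copy_word(const uintptr_t *t0, uintptr_t new_f4)
{
    uintptr_t t0w0 = t0[0];
    uintptr_t *t = malloc(2 * sizeof *t);
    if (t == NULL) {
        return NULL;
    }
    t[0] = t0w0;
    t[1] = new_f4;
    return t;
}
```

The two functions build the same cell only when every bit of word 0 belongs to one of the packed fields (or is zero), which is why the availability of a *specific* packing has to be recorded.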
compiler/ml_unused_assign.m:
A new compiler module implementing the new optimization.
It is not part of ml_optimize.m because ml_optimize.m traverses
the MLDS forwards, while this optimization requires a backwards traversal:
you cannot know whether an assignment is dead unless you know that the
following code does not need the value of the variable it assigns to.
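As an illustration of why the traversal has to run backwards, here is a textbook dead-assignment pass over straight-line code, written in C. It is a sketch under assumptions, not the algorithm of ml_unused_assign.m: the assign type, the function name and the bitmask live set (limited to variables numbered 0 to 31) are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* One assignment "dst = f(src1, src2)"; a source of -1 means "unused". */
typedef struct {
    int dst;
    int src1;
    int src2;
} assign;

/*
 * Walk the code from the last assignment to the first. live_after has
 * bit v set if variable v is needed after the sequence. On return,
 * keep[i] says whether assignment i must stay; the result is the number
 * of assignments kept. Variable numbers must be in the range 0..31.
 */
size_t mark_live_assigns(const assign *code, size_t n,
    unsigned live_after, bool *keep)
{
    unsigned live = live_after;
    size_t kept = 0;

    for (size_t i = n; i-- > 0; ) {
        unsigned dst_bit = 1u << code[i].dst;
        if ((live & dst_bit) == 0) {
            /* Nothing later reads dst, so this assignment is dead. */
            keep[i] = false;
            continue;
        }
        keep[i] = true;
        kept++;
        /* dst is defined here, so it is not needed before this point. */
        live &= ~dst_bit;
        if (code[i].src1 >= 0) {
            live |= 1u << code[i].src1;
        }
        if (code[i].src2 >= 0) {
            live |= 1u << code[i].src2;
        }
    }
    return kept;
}
```

Applied to the sequence T0w0 = T0[0]; T0f1 = ...; T0f2 = ...; T[0] = T0w0, with only T needed afterwards, the pass keeps the first and last assignments and marks the two field extractions as dead.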
compiler/ml_backend.m:
compiler/notes/compiler_design.html:
Include the new module.
compiler/mlds.m:
The new optimization needs extra information about loops.
When it enters into the loop body, it knows which variables
are needed *after* the loop, but it does not know which variables
the loop body first reads and then writes. Without this knowledge,
it would optimize away assignments to loop control variables,
such as the increment of i in the loop
i = 0;
while (...) {
...; i = i+1; ...
}
Traditionally, compilers have solved this problem by doing fixpoint
iteration, adding to the live set at each program point until
no more additions are possible. We can do better, because we generate
loops in the MLDS in only two kinds of cases:
- loops implementing tail recursion, in which case the only extra
variables that we need to preserve assignments to in the loop body
are the input arguments of the procedure, and
- loops created by the compiler itself to loop over a set of alternatives,
for which the only extra variables that we need to preserve assignments
to in the loop body are the variables the compiler uses to control
the loop.
To make it possible for ml_unused_assign.m to do its job without
a fixpoint iteration, include in the MLDS representation of every
while loop a list of these variables.
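A minimal C sketch of the idea, assuming a bitmask representation of variable sets: when the backwards traversal enters a loop body, it starts from the variables needed after the loop plus the variables listed on the loop. The function names and the representation are hypothetical.

```c
#include <stdbool.h>

/*
 * Bit v of each mask stands for variable v (0..31).
 * The live set at the end of the loop body is the set needed after the
 * loop, plus the variables recorded on the while loop itself.
 */
unsigned loop_body_live_out(unsigned live_after_loop, unsigned loop_vars)
{
    return live_after_loop | loop_vars;
}

/* An assignment is dead if nothing after it needs the assigned variable. */
bool assignment_is_dead(unsigned live_out, int dst)
{
    return (live_out & (1u << dst)) == 0;
}
```

With the loop control variable i recorded on the loop, the increment of i at the end of the body is no longer classified as dead, and no fixpoint iteration is needed to discover that.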
Add a type to represent the identity of an MLDS local var,
for use by some of the modules below. They used to store this info
in the form of mlds_lvals, but that is not specific enough
to be used to fill in the new field in while loops.
compiler/ml_proc_gen.m:
Compute the information needed by the new pass, and invoke it
if the relevant option is set.
compiler/options.m:
Add this option. It is for developers only, so it is undocumented.
compiler/ml_util.m:
Add a utility function needed in several places.
compiler/ml_accurate_gc.m:
compiler/ml_disj_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_lookup_switch.m:
compiler/ml_optimize.m:
compiler/ml_rename_classes.m:
compiler/ml_string_switch.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_target_util.m:
Conform to the changes in mlds.m.
|
||
|
|
dc4196e5af |
Separate breaks from loops and breaks from switches in MLDS.
compiler/mlds.m:
Replace goto_break with goto_break_loop and goto_break_switch, each
intended to break from a particular construct. It was confusion between
the two kinds of breaks that led to the earlier bug that broke
--prefer-while-loop-over-jump-mutual; this separation should make
such bugs easy to detect.
Rename goto_continue as goto_continue_loop to match the new naming scheme.
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
When emitting goto_break_loop and goto_break_switch, check whether
the nearest enclosing break-able scope is a loop or switch respectively.
To make this check possible, record the nearest break-able scope.
While these additions make the compiler do extra work, the performance
impact is negligible.
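The check described above can be sketched in C as follows. The enum and function names are hypothetical stand-ins, not the types used by mlds_to_{c,cs,java}.m.

```c
#include <stdbool.h>

/* The nearest enclosing construct that a break statement would exit. */
typedef enum { SCOPE_NONE, SCOPE_LOOP, SCOPE_SWITCH } break_scope;

/* The two kinds of break that replace the old goto_break. */
typedef enum { BREAK_LOOP, BREAK_SWITCH } break_kind;

/*
 * A break is emitted safely only if the construct it is meant to exit
 * is also the nearest break-able scope in the target language code.
 */
bool break_matches_scope(break_kind kind, break_scope nearest)
{
    switch (kind) {
    case BREAK_LOOP:
        return nearest == SCOPE_LOOP;
    case BREAK_SWITCH:
        return nearest == SCOPE_SWITCH;
    }
    return false;
}
```

A loop break nested directly inside a switch fails this check, because in C, Java and C# a plain break there would exit the switch rather than the loop.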
compiler/mlds_to_target_util.m:
Add the type that mlds_to_{c,cs,java}.m all use to identify
break-able scopes.
compiler/ml_call_gen.m:
compiler/ml_proc_gen.m:
compiler/ml_string_switch.m:
Update the code that generates gotos.
|
||
|
|
034cb97988 |
Don't module- or type-qualify MLDS local variables.
Some global variables generated by the MLDS backend need to be visible
across module boundaries, and therefore mlds_data definitions, which
contained global as well as other variables, used to have their names
qualified; usually module-qualified, though sometimes type-qualified.
However, since the diff that partitioned mlds_data_defns into the
definitions of local variables, global variables and field variables,
the qualification of local variables has *not* been necessary, so this diff
removes such qualifications. This makes the MLDS code generating references
to local variables simpler, more readable, and slightly faster.
The generated code is also shorter and easier to read.
There are two exceptional cases in which local variables *did* need
qualification, both of which stretch the meaning of "local".
One such case is the "local" variable dummy_var, which (by definition)
is only ever assigned to, and never used. It is also never defined
in MLDS-generated code; instead, it is defined in private_builtin.m
(for the Java and C# backends) or the runtime (for C). All three backends
currently require references to this variable in the runtime to be module
qualified. There are three possible fixes to this problem, which is caused
by the fact that this "local" variable is in fact global.
- Fix 1a would be to make dummy_vars global, not local.
- Fix 1b is to special-case dummy_vars in mlds_to_{c,cs,java}.m, and put
the fixed "private_builtin" qualifier in front of it.
- Fix 1c would be to modify the compiler to never generate any references
to dummy vars at all.
This diff uses fix 1b, because it is simple. I (zs) will explore fix 1c
in the future, and see if it is viable.
The second such case occurs when generating code for unifications
involving function symbols represented by the addresses of reserved objects.
These addresses used to be represented as the addresses of mlds_data
definitions, then as addresses of field variables cast as qualified
local variables. Since this diff makes all local variables unqualified,
this can't continue. Two possible fixes are
- Fix 2a: introduce an mlds const rval representing the address of a field
variable, which solves the problem because unlike local variables,
field variables can still be either module- or type-qualified.
- Fix 2b: prohibit the use of the addresses of reserved objects as tags.
After a (short) discussion on m-dev, this diff uses fix 2b.
compiler/mlds.m:
Delete the qual_local_var_name type, and replace all its uses
with the mlds_local_var_name type. Delete the module qualifier field
in mlds_data_addr_local_var consts.
compiler/ml_code_util.m:
Simplify the predicates and functions whose task is to build references
to local variables. Delete the arguments that they don't need anymore.
Delete one function entirely, since calling it now takes both more
characters and more code than its shortened body does.
compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_code_gen.m:
compiler/ml_commit_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_lookup_switch.m:
compiler/ml_optimize.m:
compiler/ml_rename_classes.m:
compiler/ml_string_switch.m:
compiler/ml_tailcall.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/ml_util.m:
compiler/mlds_to_target_util.m:
compiler/rtti_to_mlds.m:
Conform to the changes above. Stop qualifying local variable names,
and stop passing the parameters that used to be used *only* for
qualifying local variable names.
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
Conform to the changes above, and implement fix 1b.
NEWS:
compiler/options.m:
compiler/make_tags.m:
Implement fix 2b by disabling the --num-reserved-objects option.
This ensures that we don't use the addresses of reserved objects as tags.
library/private_builtin.m:
Move the C# definition of dummy_var next to the Java definition,
and fix the comments on them.
|
||
|
|
1c01ed85eb | Fix lines. | ||
|
|
91790794f1 |
Define the MLDS "succeeded" variable only if needed.
This makes the generated MLDS code less cluttered and easier to work on.
compiler/ml_gen_info.m:
Add a field for recording whether the succeeded variable has been used.
compiler/ml_code_util.m:
Change the predicates that return references to the succeeded variable
to record that it has been used.
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_code_gen.m:
compiler/ml_commit_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_lookup_switch.m:
compiler/ml_string_switch.m:
compiler/ml_unify_gen.m:
Use the updated forms of the predicates in ml_code_util.m.
compiler/ml_proc_gen.m:
Define the succeeded variable only if the new slot says it has been used.
compiler/ml_optimize.m:
Fix a bug triggered by the above change: when a tail recursive call
was the *entire body* of a MLDS function, ml_optimize.m did not find it,
and thus did not do the setup needed to prepare for the tail recursion.
Previously, the always-present declaration of "succeeded" made it
impossible for the tail call to be the only thing in the body.
|
||
|
|
b390231f22 |
Use mlds_target_lang in the MLDS backend.
The overall compilation target language (which is recorded in the globals)
can be C, Java, C# or Erlang. The target language of the MLDS backend
can only be the first three. Use the mlds_target_lang type (which has
three functors) instead of the compilation_target type (which has four)
to make target-specific decisions in the MLDS backend.
compiler/mercury_compile_mlds_back_end.m:
Compute the MLDS target (which can be C, Java or C#) from the compilation
target (which can also be Erlang).
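The narrowing from four targets to three can be pictured with a C sketch. The enum and function names below are hypothetical; the compiler itself uses the Mercury types compilation_target and mlds_target_lang.

```c
#include <stdbool.h>

/* All four compilation targets recorded in the globals. */
typedef enum {
    TARGET_C, TARGET_JAVA, TARGET_CSHARP, TARGET_ERLANG
} compilation_target;

/* Only the three targets that the MLDS backend can generate. */
typedef enum {
    ML_TARGET_C, ML_TARGET_JAVA, ML_TARGET_CSHARP
} mlds_target_lang;

/*
 * Convert once, at the entry to the MLDS backend. Returns false for
 * Erlang, which has no MLDS counterpart; *out is then left untouched.
 */
bool mlds_target_of(compilation_target target, mlds_target_lang *out)
{
    switch (target) {
    case TARGET_C:
        *out = ML_TARGET_C;
        return true;
    case TARGET_JAVA:
        *out = ML_TARGET_JAVA;
        return true;
    case TARGET_CSHARP:
        *out = ML_TARGET_CSHARP;
        return true;
    case TARGET_ERLANG:
        return false;
    }
    return false;
}
```

Code that switches on the narrower type afterwards does not need an Erlang case at all.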
compiler/ml_closure_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_gen_info.m:
compiler/ml_global_data.m:
compiler/ml_proc_gen.m:
compiler/ml_string_switch.m:
compiler/ml_tag_switch.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/mlds.m:
compiler/rtti_to_mlds.m:
Use the mlds_target_lang value computed in mercury_compile_mlds_back_end.m
to make decisions. Code in most modules gets this from the ml_gen_info;
in some others, it is passed around, usually instead of the globals.
compiler/ml_code_util.m:
Unify two separate copies of a comment.
|
||
|
|
11c232f060 |
Store different kinds of definitions in blocks separately.
An ml_stmt_block contains some definitions and some statements.
The definitions were traditionally stored in a single list of mlds_defns,
but lots of code knew that some kinds of mlds_defns just couldn't occur
in blocks. This diff, by storing the definitions of (a) local variables
and (b) continuation functions in separate fields in ml_stmt_blocks,
gets the type system to enforce the invariant that other kinds of definitions
can't occur in blocks.
This also allows the compiler to do less work, since definitions
don't have to be wrapped and then later unwrapped, and code that wants to look
at only e.g. the function definitions in a block doesn't have to traverse
the definitions of local variables (of which there are many more).
compiler/mlds.m:
Make the change described above.
compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_code_gen.m:
compiler/ml_code_util.m:
compiler/ml_commit_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_lookup_switch.m:
compiler/ml_optimize.m:
compiler/ml_proc_gen.m:
compiler/ml_simplify_switch.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tailcall.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/ml_util.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_target_util.m:
Conform to the change above. This allows us to avoid lots of wrapping
up definitions.
In some cases, after this change, we don't need to process mlds_defns
*in general*, which leaves the predicates that used to do that,
and some of the predicates that they used to call, unused. Delete these.
In code that generated MLDS code, consistently use names containing
the word "Defn", instead of "Decl", for variables that contain
mlds_local_var_defns or mlds_function_defns. Some such predicates
generate lists of both local var definitions and function definitions,
but most generate only one, and some generate neither.
|
||
|
|
47f1df4a0a |
Split mlds_data_defn into three separate types.
We used to use mlds_data_defns to represent three related but nevertheless
distinct kinds of entities: global variables, local variables, and fields
in classes. This diff replaces the mlds_data_defn type with three separate
types: mlds_global_var_defn, mlds_local_var_defn and mlds_field_var_defn
respectively, with corresponding changes to related types, such as
mlds_data_name.
The global variables are completely separate from the other two kinds.
Local and field variables are *mostly* separate from each other, but they
are related in one way. When we flatten out nested functions, the child
nested function can no longer access its parent function's local variables,
so we pass those variables to it as fields of an environment structure.
This requires turning local variables to fields of that structure,
and the code in the flattened previously-nested function that accesses
those fields naturally wants to treat them as if they were local variables
(as indeed they sort-of were before the flattening). There are therefore
ways to convert each of local and field vars into the other.
This restructuring makes clear several invariants of the MLDS we generate
that were previously hidden. For example, variables with certain kinds of
names (in the before-this-diff, general version of the mlds_var_name type)
could appear only as function arguments or as locals in ml_stmt_blocks,
not in ml_global_data, while for some other names the opposite was the case.
And in several cases, functions used to take a general mlds_data_defn
as argument but aborted if given the "wrong kind" of mlds_data_defn.
This diff also makes possible further simplifications. For example,
local vars should not need some flags (since e.g. they are never per-instance),
and should never need either module or type qualification, while global
variables (which are also never per-instance) should never need type
qualification (since they are not fields of a type). The definitions
in blocks should consist of local variables and (before flattening) functions,
not global variables, field variables or classes, while the members in classes
should be only field variables and functions (and maybe classes), not
global or local variables. Those changes will be in future diffs;
this is already large enough.
compiler/mlds.m:
Make the changes described above.
Use tighter types where possible.
Use (a generalized version of) the mlconst_named_const functor
to represent values of enum types defined in the runtimes
of the target platforms.
compiler/ml_global_data.m:
Store *only* global variables in fields that previously stored general
mlds_datas (that by design were always global).
Store *only* closure wrapper functions in the previous non-flat-defns
field. Before this diff, the code generator only put closure wrapper
functions in this field, but then ml_elim_nested.m put everything
resulting from the expansion of those functions back into those fields
as well, some of which were not functions. It now puts those non-function
things into the MLDS data structure directly.
compiler/ml_code_util.m:
compiler/ml_util.m:
Conform to the changes above.
Use tighter types where possible. If appropriate, change the name
of the function or predicate accordingly.
Represent references to enum constants defined in the runtime of the
target language as named constants (since that is what they are),
instead of representing them as MLDS "variables", which required
the code of mlds_to_cs.m to special-case the treatment
of those "variables".
compiler/ml_elim_nested.m:
Conform to the changes above.
Use tighter types where possible.
Don't put the environment types resulting from flattening nested scopes
back into the non-flat-defns slot of the ml_elim_info; instead, return
them separately to code that puts them directly in the MLDS.
compiler/rtti.m:
When returning the names of enum constants in the C runtime, return also
the prefixes that you need to place in front of these to obtain their names
in the Java and C# runtimes.
compiler/mercury_compile_mlds_back_end.m:
compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_code_gen.m:
compiler/ml_commit_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_gen_info.m:
compiler/ml_lookup_switch.m:
compiler/ml_optimize.m:
compiler/ml_proc_gen.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tailcall.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_target_util.m:
compiler/rtti_out.m:
compiler/rtti_to_mlds.m:
Conform to the changes above.
Move a utility function from ml_util.m to mlds_to_target_util.m,
since it is used only in mlds_to_*.m.
|
||
|
|
8a240ba3f0 |
Add builtin 8, 16 and 32 bit integer types -- Part 1.
Add the new builtin types: int8, uint8, int16, uint16, int32 and uint32.
Support for these new types will need to be bootstrapped over several changes.
This is the first such change and does the following:
- Extends the compiler to recognise 'int8', 'uint8', 'int16', 'uint16', 'int32'
and 'uint32' as builtin types.
- Extends the set of builtin arithmetic, bitwise and relational operators to
cover the new types.
- Extends all of the code generators to handle new types. There are currently lots
of limitations and placeholders marked by 'XXX FIXED SIZE INT'. These will
be lifted in later changes.
- Extends the runtimes to support the new types.
- Adds new modules to the standard library intended to hold the basic
operations on the new types. (These are currently empty and not documented.)
This change does not introduce the two 64-bit types, 'int64' and 'uint64'.
Their implementation is more complicated and is best left to a separate change.
compiler/prog_type.m:
compiler/prog_data.m:
compiler/builtin_lib_types.m:
Recognise int8, uint8, int16, uint16, int32 and uint32 as builtin types.
Add a new type, int_type/0, that enumerates all the possible integer types.
Extend the cons_id/0 type to cover the new types.
compiler/builtin_ops.m:
Parameterize the integer operations in the unary_op/0 and binary_op/0
types by the new int_type/0 type.
Add builtin operations for all the new types.
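The effect of parameterizing operations by an integer type can be sketched in C. All the names below are hypothetical, and the list of types (the existing int and uint plus the six new fixed-size types) is an assumption based on the description above.

```c
#include <stdbool.h>

/* One value per builtin integer type. */
typedef enum {
    INT_TYPE_INT, INT_TYPE_UINT,
    INT_TYPE_INT8, INT_TYPE_UINT8,
    INT_TYPE_INT16, INT_TYPE_UINT16,
    INT_TYPE_INT32, INT_TYPE_UINT32
} int_type;

/* An operation names what to do; the int_type says on which type. */
typedef enum { INT_OP_ADD, INT_OP_SUB, INT_OP_MUL } int_op_kind;

typedef struct {
    int_op_kind kind;
    int_type type;
} int_binop;

/* Questions about the type are answered once, for every operation. */
bool int_type_is_signed(int_type t)
{
    switch (t) {
    case INT_TYPE_INT:
    case INT_TYPE_INT8:
    case INT_TYPE_INT16:
    case INT_TYPE_INT32:
        return true;
    default:
        return false;
    }
}

/* Width in bits of a fixed-size type; 0 for the word-sized int and uint. */
unsigned int_type_fixed_bits(int_type t)
{
    switch (t) {
    case INT_TYPE_INT8:
    case INT_TYPE_UINT8:
        return 8;
    case INT_TYPE_INT16:
    case INT_TYPE_UINT16:
        return 16;
    case INT_TYPE_INT32:
    case INT_TYPE_UINT32:
        return 32;
    default:
        return 0;
    }
}
```

With this shape, adding a new integer type means adding one enum value rather than one new operation per operator.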
compiler/hlds_data.m:
Add new tag types for the new types.
compiler/hlds_pred.m:
Parameterize integers in the table_trie_step/0 type.
compiler/ctgc.selector.m:
compiler/dead_proc_elim.m:
compiler/export.m:
compiler/foreign.m:
compiler/goal_util.m:
compiler/higher_order.m:
compiler/hlds_code_util.m:
compiler/hlds_dependency_graph.m:
compiler/hlds_out_pred.m:
compiler/hlds_out_util.m:
compiler/implementation_defined_literals.m:
compiler/inst_check.m:
compiler/mercury_to_mercury.m:
compiler/mode_util.m:
compiler/module_qual.qualify_items.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/parse_tree_out_info.m:
compiler/parse_tree_to_term.m:
compiler/parse_type_name.m:
compiler/polymorphism.m:
compiler/prog_out.m:
compiler/prog_rep.m:
compiler/prog_rep_tables.m:
compiler/prog_util.m:
compiler/rbmm.exection_path.m:
compiler/rtti.m:
compiler/rtti_to_mlds.m:
compiler/switch_util.m:
compiler/table_gen.m:
compiler/type_constraints.m:
compiler/type_ctor_info.m:
compiler/type_util.m:
compiler/typecheck.m:
compiler/unify_gen.m:
compiler/unify_proc.m:
compiler/unused_imports.m:
compiler/xml_documentation.m:
Conform to the above changes to the parse tree and HLDS.
compiler/c_util.m:
Support generating the builtin operations for the new types.
doc/reference_manual.texi:
Add the new types to the list of reserved type names.
Add the mapping from the new types to their target language types.
These are commented out for now.
compiler/llds.m:
Replace the lt_integer/0 and lt_unsigned functors of the llds_type/0 type
with a single lt_int/1 functor that is parameterized by the int_type/0
type.
Add representations for constants of the new types to the LLDS.
compiler/call_gen.m:
compiler/dupproc.m:
compiler/exprn_aux.m:
compiler/global_data.m:
compiler/jumpopt.m:
compiler/llds_out_data.m:
compiler/llds_out_global.m:
compiler/llds_out_instr.m:
compiler/lookup_switch.m:
compiler/middle_rec.m:
compiler/peephole.m:
compiler/pragma_c_gen.m:
compiler/stack_layout.m:
compiler/string_switch.m:
compiler/switch_gen.m:
compiler/tag_switch.m:
compiler/trace_gen.m:
compiler/transform_llds.m:
Support the new types in the LLDS code generator.
compiler/mlds.m:
Support constants of the new types in the MLDS.
compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_code_util.m:
compiler/ml_disj_gen.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_global_data.m:
compiler/ml_lookup_switch.m:
compiler/ml_simplify_switch.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tailcall.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/ml_util.m:
compiler/mlds_to_target_util.m:
Conform to the above changes to the MLDS.
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
Generate the appropriate target code for constants of the new
types and operations involving them.
compiler/bytecode.m:
compiler/bytecode_gen.m:
Handle the new types in the bytecode generator; we just abort if we
encounter them for now.
compiler/elds.m:
compiler/elds_to_erlang.m:
compiler/erl_call_gen.m:
compiler/erl_code_util.m:
compiler/erl_rtti.m:
compiler/erl_unify_gen.m:
Handle the new types in the Erlang code generator.
library/private_builtin.m:
Add placeholders for the builtin unify and compare operations for
the new types. Since the bootstrapping compiler will not recognise
the new types, we give them polymorphic arguments. These can be
replaced after this change has bootstrapped.
Update the Java list of TypeCtorRep constants.
library/int8.m:
library/int16.m:
library/int32.m:
library/uint8.m:
library/uint16.m:
library/uint32.m:
New modules that will eventually contain builtin operations
on the new types.
library/library.m:
library/MODULES_UNDOC:
Do not include the above modules in the library documentation
for now.
library/construct.m:
library/erlang_rtti_implementation.m:
library/rtti_implementation.m:
deep_profiler/program_representation_utils.m:
mdbcomp/program_representation.m:
Handle the new types.
runtime/mercury_dotnet.cs.in:
java/runtime/TypeCtorRep.java:
runtime/mercury_type_info.h:
Update the list of TypeCtorReps.
configure.ac:
runtime/mercury_conf.h.in:
Check for the header stdint.h.
runtime/mercury_std.h:
Include stdint.h; abort if that header is not present.
runtime/mercury_builtin_types.[ch]:
runtime/mercury_builtin_types_proc_layouts.h:
runtime/mercury_construct.c:
runtime/mercury_deconstruct.c:
runtime/mercury_deep_copy_body.h:
runtime/mercury_ml_expand_body.h:
runtime/mercury_table_type_body.h:
runtime/mercury_tabling_macros.h:
runtime/mercury_tabling_preds.h:
runtime/mercury_term_size.c:
runtime/mercury_unify_compare_body.h:
Add the new builtin types and handle them throughout the runtime.
|
||
|
|
30ec420984 |
Fix an anomaly in how the MLDS treats scalar commons.
compiler/mlds.m:
The MLDS used to have two different ways to refer to scalar common
data structures. It had an rval for the *name* of the scalar common,
and an mlds_name for its *address*. The name could then be wrapped up
inside a mlconst_data_addr function symbol to convert it to an rval.
An mlds_name is intended to be used for the names of data definitions
in the MLDS, but scalar commons were never defined in this way.
And the name and address of a scalar common differ in C only by
the addition of an "&" operator in front, so the fact that they
had to be processed by different code (due to them having different types)
*required* double maintenance.
This diff fixes this anomaly by making both the name and the address
of a scalar common its own specific function symbol in the mlds_rval type.
They differ in the presence or absence of an "_addr" suffix.
Since all references to a vector common are to its address, give
the existing mlds_rval function symbol for vector commons the "_addr"
suffix as well, for consistency.
Replace the general mlconst_data_addr function symbol in the
mlds_rval_const with its remaining instances. This allows the code
constructing them to be smaller and simpler, and enables them
to be treated differently in the future, if needed.
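The remark that the name and the address differ only by an "&" can be seen in a small C sketch. The cell type, the array common_0 and the function names are hypothetical and only mimic the general shape of generated code.

```c
#include <stdint.h>

/* A hypothetical type for the cells of one scalar common array. */
typedef struct {
    intptr_t first;
    intptr_t second;
} common_cell;

/* A hypothetical scalar common array with two constant cells. */
static const common_cell common_0[] = {
    { 1, 2 },
    { 3, 4 }
};

/* Referring to a scalar common by name yields the cell itself. */
common_cell scalar_common_name(void)
{
    return common_0[1];
}

/* Referring to it by address adds only the "&" in front. */
const common_cell *scalar_common_addr(void)
{
    return &common_0[1];
}
```

Since the two forms are this close in the target code, giving them sibling function symbols in the same type lets one piece of code handle both.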
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
Conform to the changes in mlds.m.
Put the code translating the various common structures next to each other,
where they weren't before. Add XXXs about the differences between them
that are probably unnecessary and may possibly be latent problems.
compiler/ml_util.m:
Conform to the changes in mlds.m.
Change the interface to a set of predicates that looks for variables
inside various MLDS constructs to take a variable name, not a data name,
as the thing being looked for.
compiler/ml_closure_gen.m:
compiler/ml_code_util.m:
compiler/ml_elim_nested.m:
compiler/ml_global_data.m:
compiler/ml_optimize.m:
compiler/ml_proc_gen.m:
compiler/ml_string_switch.m:
compiler/ml_tailcall.m:
compiler/ml_unify_gen.m:
compiler/mlds_to_target_util.m:
compiler/rtti_to_mlds.m:
Conform to the changes in mlds.m, and maybe ml_util.m.
In ml_proc_gen.m, put related arguments of some predicates and functions
next to each other.
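As a rough illustration of the point made above about scalar commons, the sketch below shows that in C the name and the address of a statically allocated structure differ only by a leading "&". The struct layout, the variable name scalar_common_1 and both functions are hypothetical and are not taken from the compiler's actual output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout of one scalar common cell; not real compiler output. */
struct scalar_common_cell {
    int f1;
    const char *f2;
};

/* A statically allocated constant structure, standing in for a scalar common. */
static const struct scalar_common_cell scalar_common_1 = { 42, "hello" };

/* Referring to the common by name yields the structure itself ... */
int read_by_name(void) {
    return scalar_common_1.f1;
}

/* ... while referring to it by address only adds "&" in front of that name. */
const struct scalar_common_cell *address_of_common(void) {
    const struct scalar_common_cell *p = &scalar_common_1;
    assert(p != NULL);
    return p;
}
```

Both routes reach the same data, which is why processing them with separate code paths amounted to double maintenance.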
|
||
|
|
0d5dac8018 | Delete output args that always return the same value. | ||
|
|
083f990dbb |
Simplify the use of contexts in the MLDS.
compiler/mlds.m:
This diff fixes two minor annoyances imposed by the old use of the
mlds_context type in the MLDS.
The first annoyance was that the mlds_context type used to be an
abstract type that was privately defined to be a trivial wrapper
around a prog_context. It had the exact same information content
as a prog_context, but you had to go through translation functions
to translate prog_contexts to mlds_contexts and vice versa.
I think Fergus's idea was that we may want to add other information
to the mlds_context type. However, since we haven't felt the need
to do anything like that in the 18 years (almost to the day) that the
mlds_context type existed, I think this turned out to be a classic
case of YAGNI (you ain't gonna need it).
This diff deletes the mlds_context type, and replaces its uses
with prog_context.
The second annoyance was that actual MLDS code, i.e. values of the
mlds_stmt type, always had to be wrapped up inside a term of the statement
type, a term which paired a context with the mlds_stmt.
This diff moves the context information (now prog_context, not
mlds_context) into each function symbol of the mlds_stmt type,
deletes the statement type, and replaces its uses with the now-expanded
mlds_stmt type. This simplifies most code that deals with MLDS code.
compiler/ml_util.m:
Add a function, get_mlds_stmt_context, for the (very rare) occasions
where we want to know the context of an mlds_stmt *before* testing
to see what function symbol it is bound to.
compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_code_gen.m:
compiler/ml_code_util.m:
compiler/ml_commit_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_global_data.m:
compiler/ml_lookup_switch.m:
compiler/ml_optimize.m:
compiler/ml_proc_gen.m:
compiler/ml_simplify_switch.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
compiler/ml_tailcall.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/rtti_to_mlds.m:
Conform to the changes above.
In some cases, a function was given two separate contexts, sometimes from
two separate sources; a prog_context and an mlds_context. In such cases,
keep only one source.
Standardize on Stmt as the variable name for "statement".
Delete redundant $module references from unexpected and other abort
predicates.
In one case, delete a function that was a duplicate of another function.
Give some predicates and functions more meaningful names.
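The shape of the second change can be sketched in C as a tagged union in which every alternative carries its own context, plus a helper in the spirit of get_mlds_stmt_context. This is only an analogy: struct context stands in for prog_context, and the two statement kinds and all field names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for prog_context: a file name and a line number. */
struct context {
    const char *file;
    int line;
};

enum stmt_kind { STMT_RETURN, STMT_GOTO };

/* Each alternative carries its own context; no separate wrapper is needed. */
struct stmt {
    enum stmt_kind kind;
    union {
        struct { struct context ctx; int value; } ret;
        struct { struct context ctx; const char *label; } go;
    } u;
};

/* Fetch the context without the caller first having to test
   which alternative the statement is. */
struct context get_stmt_context(const struct stmt *s) {
    assert(s != NULL);
    switch (s->kind) {
        case STMT_RETURN: return s->u.ret.ctx;
        case STMT_GOTO:   return s->u.go.ctx;
    }
    /* Not reached for valid kinds; return an empty context defensively. */
    struct context none = { "", 0 };
    return none;
}
```

Code that already switches on the statement kind reads the context directly from the matched alternative; the helper is only for the rare caller that wants the context first.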
|
||
|
|
869605956c |
Make MLDS definitions self-contained.
Until now, we used a single type, mlds_defn, to contain both
- generic information that we need for all MLDS definitions, such as
name and context, and
- information that is specific to each different kind of MLDS definition,
such as a variable's initializer or a function's list of parameter types.
The former were contained in three fields in the mlds_defns directly,
while the latter were contained in a fourth field that was a discriminated
union of mlds_data_defn, mlds_function_defn and mlds_class_defn.
While seemingly parsimonious, this design meant that if we had e.g. a list
of variable definitions, we would have to wrap the mlds_defn/4 wrapper around
them to give them their names, and thereafter, any code that processed
that list would have to be prepared to process not just variables but also
functions and classes.
This diff moves the three generic fields into each of the mlds_data_defn,
mlds_function_defn and mlds_class_defn types, making each of those types
self-contained, and leaving mlds_defn as nothing more than a discriminated
union of those types.
In the few places that want to look at the generic fields *without*
caring about what kind of entity is being defined, this design requires
a bit of extra work compared to the old design, but in many other places,
the new design allows us to return mlds_data_defns, mlds_function_defns
or mlds_class_defns instead of just mlds_defns.
compiler/mlds.m:
Make the change described above.
Store type definitions (for high level data) and table structure definitions
separately from other definitions in the MLDS type, since we can now
give them tighter types.
compiler/ml_global_data.m:
Change the fields that store flat cells from storing mlds_defns to
storing mlds_data_defns, since we can now do so.
Add an XXX about an obsolete comment.
compiler/mercury_compile_mlds_back_end.m:
compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_code_gen.m:
compiler/ml_code_util.m:
compiler/ml_commit_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_gen_info.m:
compiler/ml_lookup_switch.m:
compiler/ml_optimize.m:
compiler/ml_proc_gen.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tailcall.m:
compiler/ml_type_gen.m:
compiler/ml_util.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/rtti_to_mlds.m:
Conform to the changes above. Where possible with only local changes,
return mlds_data_defns, mlds_function_defns or mlds_class_defns instead
of just mlds_defns. Put the mlds_data(_), mlds_function(_) or mlds_class(_)
wrapper around those definitions as late as possible (typically, when
our current code wants to put it into the same list as some other kind
of definition), in the hope that in the future, that wrapping can be
delayed even later, or even avoided altogether. Mark the places where
such improvements may be possible with "XXX MLDS_DEFN".
In some places, the tighter data representation allows us to *delete*
"XXX MLDS_DEFN" markers.
Move some common code from mlds_to_{cs,java}.m to ml_util.m.
In mlds_to_{cs,java}.m, add prefixes to the function symbols in a type
to reduce ambiguity.
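The trade-off described above can be sketched in C, under the assumption that a tagged union is a fair analogue of a Mercury discriminated union. All type, field and function names below are illustrative only, and the class alternative is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Generic fields (name, line) now live inside each kind of definition. */
struct data_defn     { const char *name; int line; int initializer; };
struct function_defn { const char *name; int line; int num_params; };

enum defn_kind { DEFN_DATA, DEFN_FUNCTION };

/* The general type is nothing more than a tagged union of the specific kinds. */
struct defn {
    enum defn_kind kind;
    union { struct data_defn data; struct function_defn func; } u;
};

/* Code that only handles variables can take the tighter type and
   needs no cases for functions or classes. */
int sum_initializers(const struct data_defn *defns, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += defns[i].initializer;
    }
    return total;
}

/* Code that wants a generic field regardless of kind does a bit more work. */
const char *defn_name(const struct defn *d) {
    assert(d != NULL);
    return d->kind == DEFN_DATA ? d->u.data.name : d->u.func.name;
}
```

The first function is the common case that benefits from the new design; the second is the rarer case that pays a small price for it.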
|
||
|
|
8dbea9f096 |
Use a structured representation for MLDS variables.
compiler/mlds.m:
Replace the old definition of mlds_var_name, which was a string
with an optional integer. The integer was intended to be the number
of a HLDS variable, while auxiliary variables created by the compiler,
which do not correspond to a HLDS variable, would not have the optional
integer.
This design has a couple of minor problems. The first is that there is
no place in the compiler where all the variable names are visible at once,
and without such a place, we cannot be sure that two names constructed
for different purposes don't accidentally end up with the same name.
The probability of such a clash used to be astronomically small
(which is why this hasn't been a problem), but it was not zero.
The second problem is that several kinds of compiler-created MLDS variables
want to have numerical suffixes too, usually with the suffix being a
unique sequence number used as a means of disambiguation. Most of the
places where these were created put the numerical suffix into the name
string itself, while some put the sequence number as the optional integer.
As it happens, neither of those actions is good when one wants to take
the independently generated MLDS code of several procedures in an SCC
and merge them into a single piece of MLDS code. For this, we want to
rename apart both the HLDS variable numbers and the sequence numbers.
Having the sequence number baked into the strings themselves obviously
makes such renumbering unnecessarily hard, while having sequence numbers
in the slots intended for HLDS variable numbers makes the job impossible
to do safely.
This diff switches to a new representation of mlds_var_names that
has a separate function symbol for each different "origin story"
that is possible for MLDS variables. This addresses both problems.
The single predicate that converts this structured representation
to a string is the place where we can ensure that two semantically
different MLDS variables never get translated to the same string.
The current version of this predicate does *not* offer this guarantee,
but later versions could.
And having all the integers used in mlds_var_names for different purposes
stored as arguments of different function symbols (that clearly indicate
their meaning) makes it possible to rename apart different sets
of MLDS variables easily and reliably.
Move the code for converting mlds_var_names from ml_code_util.m to here,
to make it easier to maintain it together with the mlds_var_name type.
compiler/ml_code_util.m:
Conform to the above change by generating structured MLDS var names.
Delete a predicate that is not needed with structured var names.
Delete the code moved to mlds.m.
Delete a predicate that has been unused since we deleted the IL backend.
Add ml_make_boxed_type as a version of ml_make_boxed_types that returns
exactly one type. This simplifies some code elsewhere.
Add "hld" to some predicate names to make clear that they are intended
for use only with --high-level-data.
compiler/ml_type_gen.m:
Conform to the above change by generating structured MLDS var names.
Add "hld" to the names of the (many) predicates here that are used
only with --high-level-data to make clear that fact.
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
Conform to the above change by generating structured MLDS var names.
Add a "for_csharp" or "for_java" suffix to some predicate names
to avoid ambiguities.
compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_commit_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_gen_info.m:
compiler/ml_global_data.m:
compiler/ml_lookup_switch.m:
compiler/ml_optimize.m:
compiler/ml_string_switch.m:
compiler/ml_unify_gen.m:
compiler/ml_util.m:
compiler/mlds_to_c.m:
Conform to the above change by generating structured MLDS var names.
compiler/prog_type.m:
Add var_to_type, as a version of var_list_to_type_list that returns
exactly one type. This simplifies some code elsewhere.
compiler/java_names.m:
Give some predicates and functions better names.
compiler/ml_code_gen.m:
Fix typo.
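A minimal C sketch of the idea of structured variable names follows. It assumes just two hypothetical origins (the real mlds_var_name type has a separate function symbol for each origin), and the string formats are invented for illustration. Like the predicate described above, this sketch does not by itself guarantee that distinct names map to distinct strings.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical origins; the real type distinguishes many more. */
enum var_origin { ORIGIN_PROG_VAR, ORIGIN_COMPILER_AUX };

struct var_name {
    enum var_origin origin;
    const char *base;   /* descriptive part of the name */
    int num;            /* HLDS variable number or sequence number */
};

/* The single place that turns a structured name into a string. */
void var_name_to_string(const struct var_name *v, char *buf, size_t len) {
    assert(v != NULL && buf != NULL && len > 0);
    if (v->origin == ORIGIN_PROG_VAR) {
        snprintf(buf, len, "%s_%d", v->base, v->num);
    } else {
        snprintf(buf, len, "aux_%s_%d", v->base, v->num);
    }
}

/* Renaming apart: shift only the numbers that belong to the given origin,
   leaving numbers with a different meaning untouched. */
void renumber(struct var_name *v, enum var_origin origin, int offset) {
    assert(v != NULL);
    if (v->origin == origin) {
        v->num += offset;
    }
}
```

Because the number is a separate field tagged by its origin, renumbering HLDS variable numbers cannot accidentally change a sequence number, which is the safety property the string-based representation could not offer.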
|
||
|
|
5de235065d | Fix too-long lines. | ||
|
|
cc9912faa8 |
Don't import anything in packages.
Packages are modules whose only job is to serve as a container for submodules.
Modules like top_level.m, hlds.m, parse_tree.m and ll_backend.m are packages
in this (informal) sense.
Besides the include_module declarations for their submodules, most of the
packages in the compiler used to import some modules, mostly other packages
whose component modules their submodules may need. For example, ll_backend.m
used to import parse_tree.m. This meant that modules in the ll_backend package
did not have to import parse_tree.m before importing modules in the parse_tree
package.
However, this had a price. When we add a new module to the parse_tree package,
parse_tree.int would change, and this would require the recompilation of ALL
the modules in the ll_backend package, even the ones that did NOT import ANY
of the modules in the parse_tree package.
This happened even at one remove. Pretty much all modules in every one of the
backends have to import one or more modules in the hlds package, and they
therefore have to import hlds.m. Since hlds.m imported transform_hlds.m, any
addition of a new middle pass to the transform_hlds package required the
recompilation of all backend modules, even in the usual case of the two having
nothing to do with each other.
This diff removes all import_module declarations from the packages, and
replaces them with import_module declarations in the modules that need them.
This includes only a SUBSET of their child modules and of the non-child modules
that import them.
|
||
|
|
7654ec847e | Convert (C->T;E) to (if C then T else E). | ||
|
|
c1e0499140 |
Fix the fail code for model_non trie string switches.
This was Mantis bug #383.
compiler/ml_string_switch.m:
For model_non switches in MLDS grades, a failure is indicated by a fall
through. This can be represented by an empty sequence of MLDS statements,
but the code that generated string trie switches took such an empty sequence
to mean that the switch could not fail. Fix this incorrect assumption.
tests/hard_coded/bug383.{m,inp,exp}:
A regression test for the bug.
tests/hard_coded/Mmakefile:
Enable the new test case.
|
||
|
|
9979764072 |
Build string switch tries in the target string encoding.
The compiler should work in code units of the TARGET string encoding when
building tries for string switches. Using its own string encoding would be
incorrect if it differs from the target encoding. Currently that would only
occur if the compiler is built in a java/csharp grade (uses UTF-16 internally)
and invoked to target high-level C (uses UTF-8).
Another motivation for this change is to remove a place where the compiler
behaviour depends on the setting of `--cross-compiling'. As of now, the
`--cross-compiling' option has no effect.
compiler/backend_libs.m:
compiler/string_encoding.m:
Add new module with helper predicates.
compiler/ml_string_switch.m:
Convert strings to/from code units in the target string encoding.
compiler/ml_switch_gen.m:
Remove restriction on compiling string switches using tries when
`--cross-compiling' is enabled.
compiler/notes/compiler_design.html:
Document the new module.
|
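To see why the encoding matters for a trie, recall that a trie branches once per code unit, and the same character can occupy a different number of code units in UTF-8 and UTF-16. The two helper functions below are hypothetical and are not part of string_encoding.m; they only count code units per code point according to the standard encoding rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of UTF-8 code units (bytes) needed for one Unicode code point. */
size_t utf8_code_units(uint32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Number of UTF-16 code units (16-bit words) needed for one code point. */
size_t utf16_code_units(uint32_t cp) {
    assert(cp <= 0x10FFFF);
    return cp < 0x10000 ? 1 : 2;
}
```

For example, U+00E9 takes two code units in UTF-8 but one in UTF-16, so a trie built over the compiler's own UTF-16 code units would have the wrong depth and the wrong branch values for a UTF-8 target.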
||
|
|
d041b83943 |
Implement string switches via tries for the MLDS backend.
The code we emit to decide which arm of the switch is selected looks like this:
case_num = -1;
switch (MR_nth_code_unit(switchvar, 0)) {
    case '98':
        switch (MR_nth_code_unit(switchvar, 1)) {
            case '99':
                if (MR_offset_streq(2, switchvar, "abc"))
                    case_num = 0;
                break;
            case '100':
                if (MR_offset_streq(2, switchvar, "aceg"))
                    case_num = 1;
                break;
        }
        break;
    case '99':
        if (MR_offset_streq(2, switchvar, "bbb"))
            case_num = 2;
        break;
}
The part that acts on this will look like this for lookup switches:
if (case_num < 0)
    succeeded = MR_FALSE;
else {
    outvar1 = vector_common[case_num].f1;
    ...
    outvarn = vector_common[case_num].fn;
    succeeded = MR_TRUE;
}
and like this for non-lookup switches:
switch (case_num) {
    case 0:
        <code for case 0>
        break;
    ...
    case n:
        <code for case n>
        break;
    default: /* if the switch is can_fail */
        <code for failure>
        break;
}
compiler/ml_string_switch.m:
Implement both non-lookup and lookup string switches via tries,
along the lines shown above.
compiler/ml_switch_gen.m:
Invoke the predicates that implement string switches via tries
in the circumstances in which option values call for them.
For now, we generate tries only for the C backend. Once the
problems identified for mlds_to_{cs,java,managed} below are fixed,
we can enable them on those backends as well.
compiler/options.m:
doc/user_guide.texi:
Add an option that governs the minimum size of trie switches.
compiler/ml_lookup_switch.m:
Factor out the code common to the implementation of all model-non
lookup switches, both in ml_lookup_switch.m and ml_string_switch.m,
and put it all into a new exported predicate.
The previously existing MLDS implementation methods for lookup switches
all build their lookup tables from maps that map each cons_id
in the switch cases to the values of the output arguments of those cases.
For switch cases that apply to more than one cons_id, this map had
one entry for each of those cons_ids. For tries, we need a map
from *case ids*, not *cons ids* to the outputs. Since it is
easier to convert the one-to-one case_id->outputs map to the
many-to-one cons_id->outputs map than vice versa, change the
main data structure from which lookup tables are built to store data
in a case_id->outputs format, and provide predicates for its conversion
to the other (previously the only) format.
Rename ml_gen_lookup_switch to ml_gen_atomic_lookup_switch to distinguish
it from other predicates that also generate (other kinds of) lookup
switches.
compiler/switch_util.m:
Have the types representing lookup tables represent their contents
as a map, not as the assoc list derived from the map. Previously,
we didn't do anything with the map other than flatten it to the assoc list,
but for the MLDS backend, we may now also need to convert it to another
form of map (see immediately above).
compiler/builtin_ops.m:
Add two new builtin ops. The first, string_unsafe_index_code_unit,
returns the nth code unit in a string; the second, offset_str_eq,
does a string equality test on the nth and later code units of
two strings. They are used in the implementation of tries.
compiler/c_util.m:
Add a new binop category for each new binop, since they are not like
existing binops.
Put some existing binops into their own categories as well, since
bundling them with the other ops they were bundled with seems like
a bad idea.
compiler/hlds_goal.m:
Make the identifier of switch arms in tagged_cases a separate type
from int.
compiler/mlds_to_c.m:
compiler/llds_out_data.m:
Handle the new kinds of binops.
When writing out binop expressions, we used to do a switch on the binop
to get its category, and then another switch on the category. We now
switch on the binop directly, since this makes it much harder to write out
code using new binops badly, and should be faster to boot.
In mlds_to_c.m, also make some cosmetic changes to the output to make it
easier to read, and thus to debug.
compiler/mlds_to_il.m:
Handle the new kinds of binops.
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_managed.m:
Do not handle the new kinds of binops, since doing so would require
changing the whole approach of how these modules handle binops.
Clean up some predicates.
compiler/bytecode.m:
compiler/erl_call_gen.m:
compiler/lookup_switch.m:
compiler/ml_global_data.m:
compiler/ml_optimize.m:
compiler/ml_tag_switch.m:
compiler/opt_debug.m:
compiler/string_switch.m:
Conform to the changes above.
compiler/ml_code_gen.m:
Put the predicates of this module into a consistent order.
library/string.m:
Fix white space.
runtime/mercury_string.h:
Add a macro for each of the two new builtin operations.
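A compilable sketch of the trie dispatch for the three strings "abc", "aceg" and "bbb" might look as follows. It is not the compiler's output: NTH_CODE_UNIT and OFFSET_STREQ are hypothetical stand-ins for the two runtime macros, and the character literals and offsets were chosen here so that the example is self-consistent for these three strings.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the runtime macros named in the message. */
#define NTH_CODE_UNIT(s, n)      ((unsigned char) (s)[(n)])
#define OFFSET_STREQ(n, s1, s2)  (strcmp((s1) + (n), (s2) + (n)) == 0)

/* Returns the case number for "abc" (0), "aceg" (1) or "bbb" (2), else -1. */
int trie_case_num(const char *switchvar) {
    assert(switchvar != NULL);
    int case_num = -1;
    switch (NTH_CODE_UNIT(switchvar, 0)) {
        case 'a':
            /* "abc" and "aceg" share their first code unit. */
            switch (NTH_CODE_UNIT(switchvar, 1)) {
                case 'b':
                    if (OFFSET_STREQ(2, switchvar, "abc"))
                        case_num = 0;
                    break;
                case 'c':
                    if (OFFSET_STREQ(2, switchvar, "aceg"))
                        case_num = 1;
                    break;
            }
            break;
        case 'b':
            /* Only one candidate remains, so compare the rest at once. */
            if (OFFSET_STREQ(1, switchvar, "bbb"))
                case_num = 2;
            break;
    }
    return case_num;
}
```

Each level of the trie inspects exactly one code unit, and the remainder of the string is compared only once a single candidate is left. Reading code unit n is safe here because every earlier code unit was matched against a non-NUL value.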
|
||
|
|
7ca1a07296 |
Allow the MLDS backend to generate indexing switches (switches implemented
Estimated hours taken: 16
Branches: main
Allow the MLDS backend to generate indexing switches (switches implemented
more efficiently than just an if-then-else chain) for strings even if the target
language does not support gotos.
Previously, we always used gotos to break out of search loops
after we found a match:
do {
    if (we have a match) {
        ... handle the match ...
        goto end;
    } else {
        ... handle nonmatches ...
    }
} while (loop should continue);
maybe some code to handle the failure of the search
end:
Now, if the "maybe some code" is empty, we prefer to use break statements
if the target language supports this:
do {
    if (we have a match) {
        ... handle the match ...
        break;
    } else {
        ... handle nonmatches ...
    }
} while (loop should continue);
If we cannot use either gotos or break statements, we instead use
a boolean variable named "stop_loop":
stop_loop = 0;
do {
    if (we have a match) {
        ... handle the match ...
        stop_loop = 1;
    } else {
        ... handle nonmatches ...
    }
} while (stop_loop == 0 && loop should continue);
if (stop_loop == 0) {
    maybe some code to handle the failure of the search
}
We omit the final if statement if the then-part would be empty.
The break method generates the smallest code, followed by the goto code.
I don't have information on speed, since we don't have a benchmark that
runs long enough, and the compiler itself does not spend any significant
amount of time on string switches. Probably the break method is also the
fastest, simply because it leaves the code looking most like normal C code.
(Some optimizations are harder to apply to code containing gotos, and some
optimizer writers do not bother.)
For C, we now normally prefer to generate code using the second method
(breaks), if we can, though normally "maybe some code" is not empty,
in which case we use the first method (goto).
However, if the value of the --experiment option is set to "use_stop_loop",
we always use the third method, and if it is set to "use_end_label", we always
use the first, even when we could use the second. This allows us to test all
three approaches using the C back end.
With backends that support neither gotos nor break, we always use the third
method (stop_loop).
With backends that don't support gotos but do support breaks, we also always
use the third method. This is because trying to use the second method would
require us to commit to not creating the stop_loop variable BEFORE we know
that the "maybe some code to handle the failure of the search" is empty,
and if it isn't empty, then we don't have the goto method to fall back on.
compiler/ml_string_switch.m:
Make the change described above. Where possible, make the required
change not to the original code, but to a version in which common
parts have been factored out. (Previously, the duplicated code was
small; now, it would be big.)
compiler/ml_target_util.m:
A new module containing existing functions that test various properties
of the target language. Keeping some of those functions in their
original modules would have introduced a circular dependency.
compiler/ml_switch_gen.m:
Enable the new functionality by removing the tests that previously
prevented the compiler from using indexing switches on strings
if the target language did not support gotos.
Remove the code moved to ml_target_util.m.
compiler/ml_optimize.m:
compiler/ml_unify_gen.m:
Remove the code moved to ml_target_util.m.
compiler/ml_backend.m:
compiler/notes/compiler_design.html:
Add the new module.
compiler/ml_proc_gen.m:
Delete a predicate that hasn't been used for a long time.
tools/makebatch:
Fix an old pair of typos.
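The second and third methods described above can be compared side by side in a small compilable sketch. The linear search over a string table is only a placeholder for the real search loops, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Second method: leave the search loop with break. Returns index or -1. */
int find_with_break(const char *const *table, size_t n, const char *key) {
    assert(table != NULL && key != NULL && n > 0);
    int found = -1;
    size_t i = 0;
    do {
        if (strcmp(table[i], key) == 0) {
            found = (int) i;        /* handle the match */
            break;
        } else {
            i++;                    /* handle the nonmatch */
        }
    } while (i < n);
    return found;
}

/* Third method: neither goto nor break, only a stop_loop flag. */
int find_with_stop_loop(const char *const *table, size_t n, const char *key) {
    assert(table != NULL && key != NULL && n > 0);
    int found = -1;
    size_t i = 0;
    int stop_loop = 0;
    do {
        if (strcmp(table[i], key) == 0) {
            found = (int) i;        /* handle the match */
            stop_loop = 1;
        } else {
            i++;                    /* handle the nonmatch */
        }
    } while (stop_loop == 0 && i < n);
    if (stop_loop == 0) {
        /* Code handling the failure of the search would go here. */
        found = -1;
    }
    return found;
}
```

The flag-based version costs one extra variable and one extra test per iteration, but it needs nothing beyond structured loops and conditionals from the target language.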
|
||
|
|
de56f9b77c |
Implement lookup table versions of hash and binary search switches for strings
Estimated hours taken: 24
Branches: main
Implement lookup table versions of hash and binary search switches for strings
in the MLDS backend (those versions already exist in the LLDS backend).
compiler/ml_string_switch.m:
Make the above change. Where possible, factor out and reuse existing code.
compiler/ml_lookup_switch.m:
Break up the predicate that used to both test whether a switch is a lookup
switch and also generate code for it if it was, into two parts, each doing
just one job. The first part is now useful for switches on strings as well.
Group auxiliary predicates with the main predicates they support.
Factor out some code into new predicates, and export them for use by
the new code in ml_string_switch.m.
Make some predicates tail recursive.
Remove some predicates made unnecessary by changes to lookup_switch.m.
compiler/ml_switch_gen.m:
Invoke the new code when appropriate, and conform to the updated interface
of ml_lookup_switch.m.
compiler/switch_util.m:
Change some types, and the predicates that operate on them, to make them
useful for lookup switches for the MLDS backend as well as the LLDS backend.
Add some utility predicates.
compiler/lookup_switch.m:
Change the interface of some of the predicates in this module to allow
us to factor out some common code from the higher order values passed
by callers.
Conform to the changes in switch_util.m.
compiler/string_switch.m:
Conform to changes in switch_util.m.
compiler/switch_gen.m:
Conform to changes in lookup_switch.m.
|
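In a lookup switch, every arm only binds output variables to constants, so the arms can be replaced by a table indexed by case number. The sketch below assumes the case number has already been found by a hash or binary search; the row type, the table contents and the function name are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row type: one field per output variable of the switch. */
struct lookup_row {
    int f1;
    const char *f2;
};

/* One row per switch arm, indexed by case number (values are made up). */
static const struct lookup_row lookup_table[] = {
    { 1, "one" },
    { 2, "two" },
    { 3, "three" },
};

/* Returns 1 and fills the outputs on success, 0 if no arm matched. */
int lookup_outputs(int case_num, int *out1, const char **out2) {
    assert(out1 != NULL && out2 != NULL);
    int num_rows = (int) (sizeof lookup_table / sizeof lookup_table[0]);
    if (case_num < 0 || case_num >= num_rows) {
        return 0;
    }
    *out1 = lookup_table[case_num].f1;
    *out2 = lookup_table[case_num].f2;
    return 1;
}
```

No per-arm code is executed at all: the search yields an index and the outputs are copied out of static data.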
||
|
|
6dabcc0aa1 |
Implement binary search switches for strings in the MLDS backend (they already
Estimated hours taken: 16
Branches: main
Implement binary search switches for strings in the MLDS backend (they already
exist in the LLDS backend). Binary search switches have higher big-O complexity
than hash table search switches, but lower startup costs, and so are
appropriate for switches involving smaller tables of strings.
compiler/ml_string_switch.m:
Implement binary search switches. Where possible, factor out and reuse code
that already existed for implementing hash switches.
compiler/ml_switch_gen.m:
Invoke the new code when appropriate.
compiler/switch_gen.m:
Avoid executing the same test (NumArms > 1) more than once.
compiler/mlds.m:
Fix a typo in a comment.
compiler/string_switch.m:
Delete stray text from a comment.
|
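A binary search over the sorted strings of the switch arms can be sketched as below. This is a generic textbook binary search written for illustration, not the code the compiler generates; the function name is hypothetical, and the table index simply doubles as the case number.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Binary search over a sorted table of strings; returns the case number
   (here simply the table index) or -1 if the string matches no arm. */
int binary_search_case(const char *const *sorted, size_t n, const char *key) {
    assert(sorted != NULL && key != NULL);
    size_t lo = 0;
    size_t hi = n;                  /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(key, sorted[mid]);
        if (cmp == 0) {
            return (int) mid;
        } else if (cmp < 0) {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    return -1;
}
```

There is no hash to compute before the first comparison, which is the lower startup cost mentioned above; the price is a number of string comparisons that grows logarithmically with the table size.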
||
|
|
b4092d2e4e |
Further improvements in the implementation of string switches, along with
Estimated hours taken: 12
Branches: main
Further improvements in the implementation of string switches, along with
some bug fixes.
If the chosen hash function does not yield any collisions for the strings
in the switch arms, then we can optimize away the table column that we would
otherwise need for open addressing. This was implemented in a previous diff.
For an ordinary (non-lookup) string switch, the hash table has two columns
in the presence of collisions and one column in their absence. Therefore if
doubling the size of the table allows us to eliminate collisions, the table
size is unaffected, though the corresponding array of labels we have to put
into the computed_goto instruction we generate has to double as well. Thus
the only cost of such doubling is an increase in "code" size, and for small
tables, the elimination of the open addressing loop may compensate for this,
at least partially.
For lookup string switches, doubling the table size this way has a bigger
space cost, but the elimination of the open addressing loop still brings
a useful speed boost.
We therefore now DO double the table size if this eliminates collisions.
In the library, compiler etc directories, this eliminates collisions in
19 out of 47 string switches that had collisions with the standard table size.
compiler/switch_util.m:
Replace the separate sets of predicates we used to have for computing hash
maps (one for lookup switches and one for non-lookup switches) with a single
set that works for both.
Change this set to double the table size if this eliminates collisions.
This requires it to decide the table size, a task previously done separately
by each of its callers.
One version of this set had an old bug, which caused it to effectively ignore
the second and third string hash functions. This diff fixes it.
There were two bugs in my previous diff: the unneeded table column was not
being optimized away from several_soln lookup switches, and the lookup code
for one_soln lookup switches used the wrong column offset. This diff fixes
these too.
Since doubling the table size requires recalculating all the hash values,
decouple the computation of the hash values from generating code for each
switch arm, since the latter shouldn't be done more than once.
Add a note on an old problem.
compiler/ml_string_switch.m:
compiler/string_switch.m:
Bring the code for generating code for the arms of string switches here
from switch_util.m.
tests/hard_coded/Mmakefile:
Fix the reason why the bugs mentioned above were not detected: the relevant
test cases weren't enabled.
tests/hard_coded/string_hash.m:
Update this test case to test the correspondence of the compiler's and the
runtime's versions of not just the first hash function, but also the second
and third.
runtime/mercury_string.h:
Fix a typo in a comment.
|
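The decision to double the table only when that removes every collision can be sketched as follows. The hash function is a toy chosen for illustration and is not one of the compiler's three string hash functions; the 64-slot limit and all names are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative string hash; not one of the compiler's hash functions. */
unsigned long toy_hash(const char *s) {
    unsigned long h = 0;
    while (*s != '\0') {
        h = h * 31 + (unsigned char) *s++;
    }
    return h;
}

/* Counts strings whose home bucket is already taken, for a given table
   size. This sketch supports table sizes of at most 64 slots. */
int count_collisions(const char *const *strs, size_t n, size_t table_size) {
    assert(strs != NULL && table_size > 0 && table_size <= 64);
    int used[64] = { 0 };
    int collisions = 0;
    for (size_t i = 0; i < n; i++) {
        size_t slot = toy_hash(strs[i]) % table_size;
        if (used[slot]) {
            collisions++;
        } else {
            used[slot] = 1;
        }
    }
    return collisions;
}

/* Picks the doubled size only if it removes every collision. */
size_t choose_table_size(const char *const *strs, size_t n, size_t base_size) {
    if (count_collisions(strs, n, base_size) > 0
            && count_collisions(strs, n, 2 * base_size) == 0) {
        return 2 * base_size;
    }
    return base_size;
}
```

Note that the hash values have to be reduced again for the doubled size, which is why the message separates computing hash values from generating code for the switch arms.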
||
|
|
065a440492 |
Simplify some code.
Estimated hours taken: 0.1
Branches: main
compiler/ml_string_switch.m:
Simplify some code.
|
||
|
|
fe566dbf42 |
When doing hash table lookup as part of the implementation of switches on
Estimated hours taken: 8
Branches: main
When doing hash table lookup as part of the implementation of switches on
strings, we use open addressing to handle collisions. However, if the chosen
hash function does not yield any collisions for the strings in the switch arms,
then open addressing is unnecessary: if a lookup does not find the string
bound to the switch variable in its home bucket, it won't be in the hash table
at all. This diff optimizes such cases, by not generating for them the loop
we would otherwise use for open addressing, and optimizing away the table
column telling that loop where to check next.
compiler/string_switch.m:
Implement the above optimization both for ordinary switches on strings,
and for lookup table switches (both one_soln and several_soln) on strings.
compiler/ml_string_switch.m:
Implement the above optimization for ordinary switches on strings.
This module does not (yet) implement lookup table switches on strings.
compiler/switch_util.m:
When deciding what hash function to use, return the number of collisions
for string_switch and ml_string_switch to use.
Rename the other_switch category to float_switch, since the only type
category it covers is switches on floats.
compiler/switch_gen.m:
compiler/ml_switch_gen.m:
Make the module header comments more organized, and use the same template
for both, so one can see the differences more easily.
Put the switch arms for the smart indexing methods into the same order
in both files.
Fix an old problem in ml_switch_gen.m: the test to see whether we can apply
a smart indexing method that uses switches on integers was testing not the
availability of int switches in the target, but the availability of computed
gotos. While ml_simplify_switch would transform the int-switch-using code
to computed-goto-using code or an if-then-else chain in *some* cases,
it would not do so in *all* cases.
In ml_switch_gen.m, remove a test that could not succeed, and a procedure
that was used only in that test.
Conform to the changes in switch_util.m.
compiler/lookup_switch.m:
compiler/ml_simplify_switch.m:
Update comments.
|
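The difference between the two lookup strategies can be sketched in C. The slot layout with a "next" column follows the description above, but the field names, the toy hash function and both lookup functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct slot {
    const char *str;   /* NULL marks an empty bucket */
    int case_num;
    int next;          /* next slot to probe, or -1; needed only with collisions */
};

/* Illustrative string hash; not one of the compiler's hash functions. */
unsigned long hash_str(const char *s) {
    unsigned long h = 0;
    while (*s != '\0') {
        h = h * 31 + (unsigned char) *s++;
    }
    return h;
}

/* With collisions: follow the "next" column until a match or the chain ends. */
int lookup_with_chain(const struct slot *table, size_t size, const char *key) {
    assert(table != NULL && key != NULL && size > 0);
    int i = (int) (hash_str(key) % size);
    while (i >= 0 && table[i].str != NULL) {
        if (strcmp(table[i].str, key) == 0) {
            return table[i].case_num;
        }
        i = table[i].next;
    }
    return -1;
}

/* Without collisions: a miss in the home bucket means the key is absent,
   so neither the loop nor the "next" column is needed. */
int lookup_no_collisions(const struct slot *table, size_t size, const char *key) {
    assert(table != NULL && key != NULL && size > 0);
    const struct slot *home = &table[hash_str(key) % size];
    if (home->str != NULL && strcmp(home->str, key) == 0) {
        return home->case_num;
    }
    return -1;
}
```

When the table is known to be collision-free, the second function gives the same answers as the first with a single probe, and the next field could be dropped from the table entirely.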
||
|
|
295415090e |
Convert almost all remaining modules in the compiler to use
Estimated hours taken: 6
Branches: main
compiler/*.m:
Convert almost all remaining modules in the compiler to use
"$module, $pred" instead of "this_file" in error messages.
In a few cases, the old error message was misleading, since it
contained an incorrect, out-of-date or cut-and-pasted predicate name.
tests/invalid/unresolved_overloading.err_exp:
Update an expected output containing an updated error message.
|
||
|
|
9f68c330f0 |
Change the argument order of many of the predicates in the map, bimap, and
Branches: main
Change the argument order of many of the predicates in the map, bimap, and
multi_map modules so they are more conducive to the use of state variable
notation, i.e. make the order the same as in the sv* modules.
Prepare for the deprecation of the sv{bimap,map,multi_map} modules by
removing their use throughout the system.
library/bimap.m:
library/map.m:
library/multi_map.m:
As above.
NEWS:
Announce the change.
Separate out the "highlights" from the "detailed listing" for
the post-11.01 NEWS.
Reorganise the announcement of the Unicode support.
benchmarks/*/*.m:
browser/*.m:
compiler/*.m:
deep_profiler/*.m:
extras/*/*.m:
mdbcomp/*.m:
profiler/*.m:
tests/*/*.m:
ssdb/*.m:
samples/*/*.m:
slice/*.m:
Conform to the above change.
Remove any dependencies on the sv{bimap,map,multi_map} modules.
|
||
|
|
022b559584 |
Make error messages for require_complete_switch scopes report the missing
Estimated hours taken: 8
Branches: main
Make error messages for require_complete_switch scopes report the missing
functors.
Knowing which functors are missing requires knowing not only the set of
functors in the switched-on variable's type, but also which of these functors
have been eliminated by earlier tests, which requires having the instmap at
the point of entry to the switch. Simplification, which initially detected
unmet require_complete_switch requirements, does not have the instmap,
and threading the instmap through it would make it significantly less
efficient. So instead we now detect any problems with require_complete_switch
scopes (and require_detism scopes, which are similar) during determinism
checking.
compiler/det_report.m:
Factor out the code for finding the missing functors in conventional
determinism errors, to allow it to be used for this new purpose.
Check whether the requirements of require_complete_switch and
require_detism scopes are met IF the predicate has any such scopes.
compiler/det_analysis.m:
compiler/det_util.m:
Record whether the predicate has any such scopes.
compiler/hlds_pred.m:
Add a predicate marker that allows this recording.
compiler/simplify.m:
Delete the code that checks the require_complete_switch and require_detism
scopes. Keep the code that deletes those scopes. (We have to do that here
because determinism error reporting never updates the goal).
compiler/prog_out.m:
Delete an unused predicate.
compiler/*.m:
Remove unnecessary imports as flagged by --warn-unused-imports.
|
||
|
|
8a28e40c9b |
Add the predicates sorry, unexpected and expect to library/error.m.
Estimated hours taken: 2
Branches: main
Add the predicates sorry, unexpected and expect to library/error.m.
compiler/compiler_util.m:
library/error.m:
Move the predicates sorry, unexpected and expect from compiler_util
to error.
Put the predicates in error.m into the same order as their declarations.
compiler/*.m:
Change imports as needed.
compiler/lp.m:
compiler/lp_rational.m:
Change imports as needed, and some minor cleanups.
deep_profiler/*.m:
Switch to using the new library predicates, instead of calling error
directly. Some other minor cleanups.
NEWS:
Mention the new predicates in the standard library.
|