mercury/tests/hard_coded/string_hash.m
Zoltan Somogyi b4092d2e4e
Estimated hours taken: 12
Branches: main

Further improvements in the implementation of string switches, along with
some bug fixes.

If the chosen hash function does not yield any collisions for the strings
in the switch arms, then we can optimize away the table column that we would
otherwise need for open addressing. This was implemented in a previous diff.
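To illustrate the difference, here is a minimal C sketch of the two lookup shapes. It is not the code the compiler generates. It assumes that the extra column holds the index of the next slot to probe, with -1 marking the end of the probe sequence; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two-column slot: the key plus a (hypothetical) next-slot index. */
struct chained_slot {
    const char *key;    /* NULL if the slot is empty */
    int next;           /* next slot to probe, or -1 for none */
};

/* One-column slot: usable only when the hash yields no collisions. */
struct plain_slot {
    const char *key;    /* NULL if the slot is empty */
};

/* Lookup with collisions: follow the next-slot links in a loop.
   table_size is assumed to be a power of two. */
int lookup_chained(const struct chained_slot *table, size_t table_size,
    unsigned long hash, const char *s)
{
    assert(s != NULL);
    int i = (int) (hash & (table_size - 1));
    while (i >= 0) {
        if (table[i].key != NULL && strcmp(table[i].key, s) == 0) {
            return i;
        }
        i = table[i].next;
    }
    return -1;
}

/* Lookup without collisions: one probe, no loop, no second column. */
int lookup_plain(const struct plain_slot *table, size_t table_size,
    unsigned long hash, const char *s)
{
    assert(s != NULL);
    size_t i = hash & (table_size - 1);
    if (table[i].key != NULL && strcmp(table[i].key, s) == 0) {
        return (int) i;
    }
    return -1;
}
```

In the collision-free version, a mismatch in the single probed slot is enough to conclude that the string matches none of the switch arms.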

For an ordinary (non-lookup) string switch, the hash table has two columns
in the presence of collisions and one column in their absence. Therefore if
doubling the size of the table allows us to eliminate collisions, the table
size is unaffected, though the corresponding array of labels we have to put
into the computed_goto instruction we generate has to double as well.
Thus the only cost of such doubling is an increase in "code" size, and
for small tables, the elimination of the open addressing loop may compensate
for this, at least partially.

For lookup string switches, doubling the table size this way has a bigger
space cost, but the elimination of the open addressing loop still brings
a useful speed boost.

We therefore now DO double the table size if this eliminates collisions.
In the library, compiler, etc. directories, this eliminates collisions in
19 out of 47 string switches that had collisions with the standard table size.
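The size decision can be sketched in C as follows. This is only an illustration of the idea, not the code in switch_util.m: the hash function is a simple stand-in (djb2) rather than any of the Mercury string hash functions, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in hash (djb2), NOT MR_hash_string. */
static unsigned long sketch_hash(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0') {
        h = h * 33 + (unsigned char) *s++;
    }
    return h;
}

/* Return true if two of the n keys map to the same slot of a table
   with table_size slots. table_size must be a power of two. */
static bool has_collision(const char *const *keys, size_t n,
    size_t table_size)
{
    assert(table_size != 0 && (table_size & (table_size - 1)) == 0);
    bool *used = calloc(table_size, sizeof *used);
    bool collision = false;
    if (used == NULL) {
        return true;    /* be conservative if allocation fails */
    }
    for (size_t i = 0; i < n; i++) {
        size_t slot = sketch_hash(keys[i]) & (table_size - 1);
        if (used[slot]) {
            collision = true;
            break;
        }
        used[slot] = true;
    }
    free(used);
    return collision;
}

/* Keep the standard size unless doubling it removes all collisions. */
static size_t choose_table_size(const char *const *keys, size_t n,
    size_t standard_size)
{
    if (has_collision(keys, n, standard_size)
        && !has_collision(keys, n, 2 * standard_size))
    {
        return 2 * standard_size;
    }
    return standard_size;
}
```

Note that the hash slots have to be recomputed for the doubled size, since a slot number is the hash value reduced modulo the table size.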

compiler/switch_util.m:
	Replace the separate sets of predicates we used to have for computing
	hash maps (one for lookup switches and one for non-lookup switches)
	with a single set that works for both.

	Change this set to double the table size if this eliminates collisions.
	This requires it to decide the table size, a task previously done
	separately by each of its callers.

	One version of this set had an old bug, which caused it to effectively
	ignore the second and third string hash functions. This diff fixes it.

	There were two bugs in my previous diff: the unneeded table column
	was not being optimized away from several_soln lookup switches, and the
	lookup code for one_soln lookup switches used the wrong column offset.
	This diff fixes these too.

	Since doubling the table size requires recalculating all the hash
	values, decouple the computation of the hash values from generating
	code for each switch arm, since the latter shouldn't be done more than
	once.

	Add a note on an old problem.

compiler/ml_string_switch.m:
compiler/string_switch.m:
	Move the code that generates code for the arms of string switches
	here from switch_util.m.

tests/hard_coded/Mmakefile:
	Fix the reason why the bugs mentioned above were not detected:
	the relevant test cases weren't enabled.

tests/hard_coded/string_hash.m:
	Update this test case to test the correspondence of the compiler's
	and the runtime's versions of not just the first hash function,
	but also the second and third.

runtime/mercury_string.h:
	Fix a typo in a comment.
2011-08-02 00:05:44 +00:00


% vim: ts=4 sw=4 et ft=mercury
% Test that string.hash and MR_hash_string return the same value.
% Do the same for string.hash2 and MR_hash_string2, and for
% string.hash3 and MR_hash_string3.
:- module string_hash.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.
:- implementation.
:- import_module bool.
:- import_module char.
:- import_module int.
:- import_module list.
:- import_module random.
:- import_module require.
:- import_module string.
main(!IO) :-
    MaxLength = 1024,
    random.init(1, RS0),
    test(MaxLength, yes, Succeeded, RS0, _, !IO),
    (
        Succeeded = yes,
        io.write_string("all tests succeeded\n", !IO)
    ;
        Succeeded = no,
        io.write_string("some tests failed\n", !IO)
    ).

% Compare the library and runtime versions of all three hash functions
% on one random string of each length from Length down to 0.
:- pred test(int::in, bool::in, bool::out,
    random.supply::mdi, random.supply::muo, io::di, io::uo) is det.

test(Length, !Succeeded, !RS, !IO) :-
    ( Length < 0 ->
        true
    ;
        make_char_list(Length, [], List, !RS),
        string.from_char_list(List, String),

        LibHash1 = string.hash(String),
        RuntimeHash1 = runtime_string_hash(String),
        ( LibHash1 = RuntimeHash1 ->
            true
        ;
            !:Succeeded = no,
            io.write_string("failed hash1: runtime ", !IO),
            io.write_int(RuntimeHash1, !IO),
            io.write_string(", library ", !IO),
            io.write_int(LibHash1, !IO),
            io.write_string(": """, !IO),
            io.write_string(String, !IO),
            io.write_string("""\n", !IO)
        ),

        LibHash2 = string.hash2(String),
        RuntimeHash2 = runtime_string_hash2(String),
        ( LibHash2 = RuntimeHash2 ->
            true
        ;
            !:Succeeded = no,
            io.write_string("failed hash2: runtime ", !IO),
            io.write_int(RuntimeHash2, !IO),
            io.write_string(", library ", !IO),
            io.write_int(LibHash2, !IO),
            io.write_string(": """, !IO),
            io.write_string(String, !IO),
            io.write_string("""\n", !IO)
        ),

        LibHash3 = string.hash3(String),
        RuntimeHash3 = runtime_string_hash3(String),
        ( LibHash3 = RuntimeHash3 ->
            true
        ;
            !:Succeeded = no,
            io.write_string("failed hash3: runtime ", !IO),
            io.write_int(RuntimeHash3, !IO),
            io.write_string(", library ", !IO),
            io.write_int(LibHash3, !IO),
            io.write_string(": """, !IO),
            io.write_string(String, !IO),
            io.write_string("""\n", !IO)
        ),

        test(Length - 1, !Succeeded, !RS, !IO)
    ).

% Prepend Length random characters to the list.
:- pred make_char_list(int::in, list(char)::in, list(char)::out,
    random.supply::mdi, random.supply::muo) is det.

make_char_list(Length, !List, !RS) :-
    ( Length = 0 ->
        true
    ;
        rand_char(Char, !RS),
        !:List = [Char | !.List],
        make_char_list(Length - 1, !List, !RS)
    ).

:- pred rand_char(char::out, random.supply::mdi, random.supply::muo) is det.

rand_char(Char, !RS) :-
    random.random(Rand, !RS),
    % U+0001..U+10ffff (avoid null character).
    Int = 1 + (Rand `mod` char.max_char_value),
    char.det_from_int(Int, Char).

:- pragma foreign_decl("C", "#include ""mercury_string.h""").

:- func runtime_string_hash(string) = int.

:- pragma foreign_proc("C",
    runtime_string_hash(StringArg::in) = (Hash::out),
    [promise_pure, will_not_call_mercury],
"
    Hash = MR_hash_string(StringArg);
").

:- func runtime_string_hash2(string) = int.

:- pragma foreign_proc("C",
    runtime_string_hash2(StringArg::in) = (Hash::out),
    [promise_pure, will_not_call_mercury],
"
    Hash = MR_hash_string2(StringArg);
").

:- func runtime_string_hash3(string) = int.

:- pragma foreign_proc("C",
    runtime_string_hash3(StringArg::in) = (Hash::out),
    [promise_pure, will_not_call_mercury],
"
    Hash = MR_hash_string3(StringArg);
").