Implement a more cache-friendly translation of lookup switches.

Estimated hours taken: 8
Branches: main

Previously, for a switch such as the one in

	:- pred p(foo::in, string::out, bar::out, float::out) is semidet.

	p(d, "four", f1, 4.4).
	p(e, "five", f2, 5.5).
	p(f, "six", f4("hex"), 6.6).
	p(g, "seven", f5(77.7), 7.7).

we generated three static cells, one for each output argument, and then
indexed into each one in turn to get the values of HeadVar__2, HeadVar__3
and HeadVar__4. Each static cell thus represents one column of the lookup
table. The loads from the different columns will generally touch different
cache blocks, so with this technique we expect to get as many cache misses
as there are output variables.

This diff changes the code we generate to use a vector of static cells
where each cell represents a row. The assignments to the output variables
will now access the different fields of a row, which will be next to each
other. We thus expect only one cache miss irrespective of the number of output
variables, at least up to the number of variables that actually fit into one
cache block.
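
The difference between the two layouts can be sketched in C roughly as
follows. This is an illustration only, not the code the compiler emits: the
names (col_name, rows, lookup_columns, lookup_rows, key_to_slot) are
hypothetical, the foo key is modelled as a char, and the bar values are
reduced to small integers instead of tagged words.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the switch on p/4: keys 'd'..'g' map to slots 0..3. */
enum { NUM_ROWS = 4 };

/* Old scheme: one static array (a column) per output argument. */
static const char *const col_name[NUM_ROWS] = { "four", "five", "six", "seven" };
static const int col_bar[NUM_ROWS] = { 1, 2, 4, 5 };   /* simplified f1, f2, f4, f5 */
static const double col_val[NUM_ROWS] = { 4.4, 5.5, 6.6, 7.7 };

/* New scheme: one static array of rows; a row's fields are adjacent in memory. */
struct row {
    const char *name;
    int bar;
    double val;
};

static const struct row rows[NUM_ROWS] = {
    { "four", 1, 4.4 },
    { "five", 2, 5.5 },
    { "six", 4, 6.6 },
    { "seven", 5, 7.7 },
};

/* Range check standing in for the semidet failure of p/4 on other keys. */
static bool key_to_slot(char key, int *slot)
{
    int s = key - 'd';
    if (s < 0 || s >= NUM_ROWS) {
        return false;
    }
    *slot = s;
    return true;
}

/* Column layout: three loads from three separate arrays. */
static bool lookup_columns(char key, const char **name, int *bar, double *val)
{
    int slot;
    if (!key_to_slot(key, &slot)) {
        return false;
    }
    *name = col_name[slot];
    *bar = col_bar[slot];
    *val = col_val[slot];
    return true;
}

/* Row layout: compute the row address once, then load neighbouring fields. */
static bool lookup_rows(char key, const char **name, int *bar, double *val)
{
    int slot;
    if (!key_to_slot(key, &slot)) {
        return false;
    }
    const struct row *r = &rows[slot];
    *name = r->name;
    *bar = r->bar;
    *val = r->val;
    return true;
}

/* Sanity check: both layouts succeed on the same keys with the same outputs. */
static void check_layouts_agree(void)
{
    for (char key = 'a'; key <= 'z'; key++) {
        const char *n1 = NULL, *n2 = NULL;
        int b1 = 0, b2 = 0;
        double v1 = 0.0, v2 = 0.0;
        bool ok1 = lookup_columns(key, &n1, &b1, &v1);
        bool ok2 = lookup_rows(key, &n2, &b2, &v2);
        assert(ok1 == ok2);
        if (ok1) {
            assert(strcmp(n1, n2) == 0 && b1 == b2 && v1 == v2);
        }
    }
}
```

Both functions return the same answers; only the memory layout differs. In
the row version the three loads go through one base address, so they are
likely to hit the same cache block as long as a row fits in one.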

compiler/global_data.m:
	Provide a mechanism for creating not just single (scalar) static cells,
	but arrays (vectors) of them.

compiler/lookup_switch.m:
	Use the new mechanism to generate code along the lines described above.

	Put the information passed between the two halves of the lookup switch
	implementation (detection and code generation) into an opaque data
	structure.

compiler/switch_gen.m:
	Conform to the new interface of lookup_switch.m.

compiler/ll_pseudo_type_info.m:
compiler/stack_layout.m:
compiler/string_switch.m:
compiler/unify_gen.m:
compiler/var_locn.m:
	Conform to the change to global_data.m.

compiler/llds.m:
	Define the data structures for holding vectors of static cells. Rename
	the function symbols we used to use to refer to static cells to make
	clear that they apply to scalar cells only. Provide similar mechanisms
	for representing static cell vectors and references to them.

	Generalize heap_ref references to allow the index to be computed
	at runtime rather than at compile time. For symmetry's sake, do
	likewise for stack references.

compiler/llds_out.m:
	Add the code required to write out static cell vectors.

	Rename decl_ids to increase clarity and avoid ambiguity.

compiler/code_util.m:
compiler/exprn_aux.m:
	Modify code that traverses rvals to now also traverse the new rvals
	inside memory references.

compiler/name_mangle.m:
	Provide the prefix for static cell vectors.

compiler/layout_out.m:
compiler/rtti_out.m:
compiler/opt_debug.m:
	Conform to the change to data_addrs and decl_ids.

compiler/code_info.m:
	Provide access to the new functionality in global_data.m, and conform
	to the change to llds.m.

	Provide a utility predicate needed by lookup_switch.m.

compiler/hlds_llds.m:
	Fix the formatting of some comments.

tools/binary:
tools/binary_step:
	Fix the bit rot that has set in since they were last used (the rest
	of the system has changed quite a lot since then). I had to do so
	to debug one part of this change.

tests/hard_coded/dense_lookup_switch2.{m,exp}:
tests/hard_coded/dense_lookup_switch3.{m,exp}:
	New test cases to exercise the new algorithm.

tests/hard_coded/Mmakefile:
	Enable the new test cases, as well as an old one (from 1997!)
	that seems never to have been enabled.
Zoltan Somogyi
2006-03-30 02:46:08 +00:00
parent b9c0ec0a21
commit 4fe703c7b9
25 changed files with 2000 additions and 1066 deletions

@@ -303,12 +303,12 @@ dump_rvals([Rval | Rvals]) =
     dump_rval(Rval) ++ ", " ++ dump_rvals(Rvals).
 dump_mem_ref(stackvar_ref(N)) =
-    "stackvar_ref(" ++ int_to_string(N) ++ ")".
+    "stackvar_ref(" ++ dump_rval(N) ++ ")".
 dump_mem_ref(framevar_ref(N)) =
-    "framevar_ref(" ++ int_to_string(N) ++ ")".
+    "framevar_ref(" ++ dump_rval(N) ++ ")".
 dump_mem_ref(heap_ref(R, T, N)) =
     "heap_ref(" ++ dump_rval(R) ++ ", " ++ int_to_string(T) ++ ", "
-        ++ int_to_string(N) ++ ")".
+        ++ dump_rval(N) ++ ")".
 dump_const(true) = "true".
 dump_const(false) = "false".
@@ -345,8 +345,11 @@ dump_data_addr(rtti_addr(tc_rtti_id(TCName, TCDataName))) =
 dump_data_addr(layout_addr(LayoutName)) =
     "layout_addr(" ++ dump_layout_name(LayoutName) ++ ")".
-dump_data_name(common_ref(TypeNum, Offset)) =
-    "common_ref(" ++ int_to_string(TypeNum) ++ ", "
+dump_data_name(scalar_common_ref(TypeNum, Offset)) =
+    "scalar_common_ref(" ++ int_to_string(TypeNum) ++ ", "
         ++ int_to_string(Offset) ++ ")".
+dump_data_name(vector_common_ref(TypeNum, Offset)) =
+    "vector_common_ref(" ++ int_to_string(TypeNum) ++ ", "
+        ++ int_to_string(Offset) ++ ")".
 dump_data_name(tabling_pointer(ProcLabel)) =
     "tabling_pointer(" ++ dump_proclabel(ProcLabel) ++ ")".