mirror of
https://github.com/Mercury-Language/mercury.git
synced 2026-04-15 17:33:38 +00:00
Given a switch arm that matches several cons_ids, and which contains
one or more switches on the *same* variable, such as the arm for
f1/f2/f3/f4 below,
(
(X = f1 ; X = f2 ; X = f3 ; X = f4),
...
(
X = f1,
...
;
(X = f2 ; X = f3),
...
;
X = f4,
...
),
...
;
...
)
this new optimization
- partitions the set of cons_id in that arm (in this case, {f1,f2,f3,f4})
as finely as needed by any other the switches on X in that arm
(in this case, that is three partitions containing {f1}, {f2,f3} and {f4}),
- splits that original switch arm into N arms, one arm for each partition,
making a copy of the switch arm's goal for each partition,
- restricts any switches on the original switch variable (in this case, X)
inside the copy of the arm goal inside each new case to only the cons_ids
in that case's partition, and then replacing any resulting one-arm switches
with the just goal inside that one arm.
The code resulting from these three steps will include some code duplication
(some of the pieces of code denoted by ... in the example above would be
duplicated), but it will need to execute fewer transfers of control.
This is worthwhile because (a) the branch instructions used to implement
switches are hard to predict unless most paths through the nested switches
are rarely if ever taken, and (b) the pipeline breaks caused by branches
that are not correctly predicted are one of the two major contributors
to the runtime of Mercury programs. (The other major contributors are
data cache misses.)
The implementation of this option has two major parts.
- Part 1 consists of discovering whether a procedure body contains
any code in which a switch on a variable is nested inside an arm
of another switch on that same variable. For any instance of such
a pair of switches, it records the identity of the variable and
the set of cons_ids of the outermost arm.
- Part 2 consists of actually transforming the procedure body
by splitting each outermost switch arm thus recorded. (This part
contains all three of the steps above.)
We integrate part 1 with the usual procedure body traversal of the
simplification pass, which makes it quite cheap. In most procedure bodies,
it won't find any nested switches meeting its criteria. We execute part 2,
which is relatively expensive, only if it does.
compiler/simplify_info.m:
Add a new type, switch_arm, which represents one arm of a switch.
Add a new field to the simplify_nested_context type. Its type
is list(switch_arm), and it represents the stack of switch arms (if any)
that the goal currently being simplified is inside. simplify_goal_switch.m
uses this field to detect switches that occur inside an arm of an
ancestor goal that is also a switch on the same variable.
Add a new field to the simplify_info, a set of switch_arms,
that denotes the set of switch arms from which that detection
has actually happened, and which should therefore be
partitioned and split.
compiler/simplify_goal_switch.m:
Add the code for doing the detection and recording mentioned just above.
This is the Part 1 mentioned above.
compiler/split_switch_arms.m:
This new module implements the splitting up process.
This is the Part 2 mentioned above.
compiler/simplify.m:
Include the new module in the simplify subpackage of the check_hlds
package.
compiler/notes/compiler_design.html:
Document the new module.
compiler/simplify_proc.m:
Invoke split_switch_arms.m (part 2) if the part of simplify_goal_switch.m
implementing part 1 has found any work for it to do.
compiler/simplify_tasks.m:
Add split_switch_arms as one of the tasks that simplification
may be asked to do. Set its default value from the value of
a new optimization option that, when specified, calls for it to be done.
compiler/options.m:
Add this option, --split-switch-arms.
Change the internal name of an existing option,
everything_in_one_c_function, to the one expected by
tools/make_optimization_options_middle.
Replace the part of this file that is constructed by
tools/make_optimization_options_middle.
doc/user_guide.texi:
Document the new option.
tools/make_optimization_options_db:
Add --split-switch-arms to the optimization tuple.
Fix software rot by
- renaming optimize_tailcalls to optimize_mlds_tailcalls inside the
optimization tuple, as it has been in options.m, and
- deleting erlang_switch_on_strings_as_atoms from the opt_tuple,
as it has been from options.m.
tools/make_optimization_options_end:
Enable the new option by default at optimization level 2.
tools/make_optimization_options_middle:
Fix bugs in this script, which generates compiler/optimization_options.m.
One bug was that it referred to the opt_level and opt_space options
by the wrong name (optopt_level and optopt_space respectively).
The other bug was that it expected opt_level to be a string_special option,
when it is an int_special option.
Make the handler_file this script generates easier to put into options.m
by not generating a line that (a) already exists in options.m, and
(b) would need to have a comma put after it, if one wanted this line
to replace the copy already in options.m.
compiler/optimization_options.m:
Rebuild this file after the changes above.
compiler/simplify_goal_unify.m:
Conform to the changes above.
compiler/inst_merge.m:
Mark two predicates, inst_merge_[34], to be inlined. The intention
is that inst_merge_4 should be inlined into inst_merge_3, and then
inst_merge_3 should be inlined inside inst_merge_2; that would then
generate six levels of switches, three on one of the insts to be
merged and three on the other, which the new optimization could
then flatten. Unfortunately, inlining does not do this yet.
tests/hard_coded/test_split_switch_arms.{m,exp}:
A new test case for the the new transformation.
tests/hard_coded/Mmakefile:
tests/hard_coded/Mercury.options:
Enable the new test case, and specify the new option for it.
390 lines
16 KiB
Awk
Executable File
390 lines
16 KiB
Awk
Executable File
#!/usr/bin/awk -f
|
|
# vim: ts=4 sw=4 et ft=awk
|
|
#---------------------------------------------------------------------------#
|
|
# Copyright (C) 2020 The Mercury team.
|
|
# This file may only be copied under the terms of the GNU General
|
|
# Public License - see the file COPYING in the Mercury distribution.
|
|
#---------------------------------------------------------------------------#
|
|
#
|
|
# The input to this file, make_optimization_options_db, has one line
|
|
# for every optimization option. Each line must have either three
|
|
# or four fields.
|
|
#
|
|
# - The first field indicates what kind of option this is. It must be
|
|
# one of "bool", "int" and "string".
|
|
#
|
|
# - The second field gives the initial value of the option.
|
|
#
|
|
# - If the option is boolean, the initial value must be either "n" or "y".
|
|
#
|
|
# - If the option is an integer, the initial value must be an integer.
|
|
#
|
|
# - If the option is a string, the initial value will be set
|
|
# to the empty string regardless of the contents of the second field.
|
|
#
|
|
# - The third field is the name of the option in options.m.
|
|
#
|
|
# - The fourth field, if it exists, gives the name of the option
|
|
# in the file this script helps generate, optimization_options.m.
|
|
# If there is no fourth field, the name is taken to be the same
|
|
# as the name in the third field.
|
|
#
|
|
# For boolean options, the name in make_optimization_options_db should be
|
|
# a verb, because when we generate a bespoke type for it, its two function
|
|
# symbols will be named "verb" and "do_not_verb".
|
|
#
|
|
|
|
BEGIN {
|
|
next_bool_opt = 0;
|
|
next_int_opt = 0;
|
|
next_string_opt = 0;
|
|
}
|
|
{
|
|
if (NF == 3 || NF == 4) {
|
|
kind = $1;
|
|
initial_value = $2;
|
|
option_name = $3;
|
|
if (NF == 4) {
|
|
option_field_name = $4;
|
|
} else {
|
|
option_field_name = $3;
|
|
}
|
|
|
|
if (kind == "bool") {
|
|
bool_opts[next_bool_opt] = option_name;
|
|
bool_field_names[next_bool_opt] = option_field_name;
|
|
if (initial_value == "y") {
|
|
bool_defaults[next_bool_opt] = option_field_name;
|
|
} else if (initial_value == "n") {
|
|
bool_defaults[next_bool_opt] = "do_not_" option_field_name;
|
|
} else {
|
|
printf("ERROR: bad bool default<%s>\n", $0);
|
|
}
|
|
next_bool_opt++;
|
|
} else if (kind == "int") {
|
|
int_opts[next_int_opt] = $3;
|
|
int_field_names[next_int_opt] = option_field_name;
|
|
int_defaults[next_int_opt] = initial_value + 0;
|
|
next_int_opt++;
|
|
} else if (kind == "string") {
|
|
string_opts[next_string_opt] = $3;
|
|
string_field_names[next_string_opt] = option_field_name;
|
|
# String options are always initialized to the empty string.
|
|
# To avoid this messing up awk's field counting, we require
|
|
# a dash to fill in the black.
|
|
if (initial_value == "-") {
|
|
string_defaults[next_int_opt] = "";
|
|
} else {
|
|
printf("ERROR: bad string default<%s>\n", $0);
|
|
}
|
|
next_string_opt++;
|
|
} else {
|
|
printf("ERROR: line with unknown type: <%s>\n", $0);
|
|
}
|
|
} else {
|
|
printf("ERROR: line with %d fields: <%s>\n", NF, $0);
|
|
}
|
|
}
|
|
END {
|
|
switch_type_name = "optimization_option";
|
|
switch_var_name = "OptOption";
|
|
record_type_name = "opt_tuple";
|
|
record_var_name = "OptTuple";
|
|
|
|
for (i = 0; i < next_bool_opt; i++) {
|
|
printf(":- type maybe_%s\n", bool_field_names[i]);
|
|
printf(" ---> %s\n", bool_field_names[i]);
|
|
printf(" ; do_not_%s.\n", bool_field_names[i]);
|
|
}
|
|
|
|
printf("\n%%---------------------%%\n\n");
|
|
|
|
printf(":- type %s\n", switch_type_name);
|
|
for (i = 0; i < next_bool_opt; i++) {
|
|
prefix=";";
|
|
suffix="";
|
|
|
|
if (i == 0)
|
|
prefix = "--->";
|
|
|
|
printf("%-4s%-4s%-4soo_%s(bool)%s\n",
|
|
"", prefix, "", bool_field_names[i], suffix);
|
|
}
|
|
for (i = 0; i < next_int_opt; i++) {
|
|
prefix=";";
|
|
suffix="";
|
|
printf("%-4s%-4s%-4soo_%s(int)%s\n", "",
|
|
prefix, "", int_field_names[i], suffix);
|
|
}
|
|
for (i = 0; i < next_string_opt; i++) {
|
|
prefix=";";
|
|
suffix="";
|
|
|
|
printf("%-4s%-4s%-4soo_%s(string)%s\n",
|
|
"", prefix, "", string_field_names[i], suffix);
|
|
}
|
|
printf("%-4s%-4s%-4soo_opt_level(int)\n", "", prefix, "");
|
|
printf("%-4s%-4s%-4soo_opt_for_space.\n", "", prefix, "");
|
|
|
|
printf("\n%%---------------------%%\n\n");
|
|
|
|
printf(":- type %s\n", record_type_name);
|
|
printf("%-4s--->%-4s%s(\n", "", "", record_type_name);
|
|
for (i = 0; i < next_bool_opt; i++) {
|
|
suffix = ",";
|
|
printf("%-16sot_%-26s :: maybe_%s%s\n",
|
|
"", bool_field_names[i], bool_field_names[i], suffix);
|
|
}
|
|
for (i = 0; i < next_int_opt; i++) {
|
|
suffix = ",";
|
|
printf("%-16sot_%-26s :: int%s\n", "", int_field_names[i], suffix);
|
|
}
|
|
for (i = 0; i < next_string_opt; i++) {
|
|
suffix = ",";
|
|
if (i == next_string_opt-1)
|
|
suffix = "";
|
|
|
|
printf("%-16sot_%-26s :: string%s\n",
|
|
"", string_field_names[i], suffix);
|
|
}
|
|
printf("%-12s).\n", "");
|
|
|
|
printf("\n");
|
|
printf(":- pred process_optimization_options(option_table::in,\n");
|
|
printf("%-4slist(%s)::in, %s::out) is det.\n",
|
|
"", switch_type_name, record_type_name);
|
|
|
|
printf("\n");
|
|
printf("%%---------------------------------------------------------------------------%%\n");
|
|
printf("\n");
|
|
|
|
printf(":- implementation.\n");
|
|
|
|
printf("\n");
|
|
printf(":- import_module getopt.\n");
|
|
printf(":- import_module int.\n");
|
|
printf(":- import_module map.\n");
|
|
printf(":- import_module string.\n");
|
|
|
|
printf("\n%%---------------------%%\n\n");
|
|
|
|
printf("process_optimization_options(OptionTable, OptOptions, !:OptTuple) :-\n");
|
|
printf("%-4s!:OptTuple = init_opt_tuple,\n", "");
|
|
printf("%-4slist.foldl2(\n", "");
|
|
printf("%-8supdate_opt_tuple(not_from_opt_level, OptionTable),\n", "");
|
|
printf("%-8sOptOptions, !OptTuple, not_seen_opt_level, MaybeSeenOptLevel),\n", "");
|
|
printf("%-4s(\n", "");
|
|
printf("%-8sMaybeSeenOptLevel = not_seen_opt_level,\n", "");
|
|
printf("%-8sget_default_opt_level(OptionTable, DefaultOptLevel),\n", "");
|
|
printf("%-8sset_opts_upto_level(OptionTable, 0, DefaultOptLevel,\n",
|
|
"");
|
|
printf("%-12s!OptTuple, MaybeSeenOptLevel, _)\n", "");
|
|
printf("%-4s;\n", "");
|
|
printf("%-8sMaybeSeenOptLevel = seen_opt_level\n", "");
|
|
printf("%-4s).\n", "");
|
|
|
|
printf("\n%%---------------------%%\n");
|
|
|
|
printf("\n");
|
|
printf(":- func init_opt_tuple = %s.\n\n", record_type_name);
|
|
|
|
printf("init_opt_tuple = %s(\n", record_type_name);
|
|
for (i = 0; i < next_bool_opt; i++) {
|
|
printf("%-8s%s,\n", "", bool_defaults[i]);
|
|
}
|
|
for (i = 0; i < next_int_opt; i++) {
|
|
printf("%-8s%s,\n", "", int_defaults[i]);
|
|
}
|
|
for (i = 0; i < next_string_opt; i++) {
|
|
suffix = ",";
|
|
if (i == next_string_opt-1) {
|
|
suffix = "";
|
|
}
|
|
printf("%-8s%s%s\n", "", "\"\"", suffix);
|
|
}
|
|
printf("%-4s).\n", "");
|
|
|
|
printf("\n%%---------------------%%\n");
|
|
|
|
printf("\n");
|
|
printf(":- type maybe_seen_opt_level\n");
|
|
printf("%-4s%-8snot_seen_opt_level\n", "", "--->");
|
|
printf("%-4s%-8sseen_opt_level.\n", "", ";");
|
|
|
|
printf("\n");
|
|
printf(":- type maybe_from_opt_level\n");
|
|
printf("%-4s%-8snot_from_opt_level\n", "", "--->");
|
|
printf("%-4s%-8sfrom_opt_level.\n", "", ";");
|
|
|
|
printf("\n");
|
|
printf(":- pred update_opt_tuple(maybe_from_opt_level::in, ");
|
|
printf("option_table::in,\n");
|
|
printf("%-4s%s::in, %s::in, %s::out,\n",
|
|
"", switch_type_name, record_type_name, record_type_name);
|
|
printf("%-4smaybe_seen_opt_level::in, maybe_seen_opt_level::out) is det.\n", "");
|
|
printf("\n");
|
|
|
|
# Each arm of the switch that handles an ordinary bool or int option
|
|
# forwards its work to a separate tiny predicate. We used to generate
|
|
# inline code for them, but this led to the Java bytecode generated
|
|
# for update_opt_tuple to exceed the 64k limit on the size of a single
|
|
# method. This was Mantis bug #522.
|
|
|
|
printf("update_opt_tuple(FromOptLevel, OptionTable, %s, !%s,\n",
|
|
switch_var_name, record_var_name);
|
|
printf("%-8s!MaybeSeenOptLevel) :-\n",
|
|
"", switch_var_name, record_var_name);
|
|
printf("%-4srequire_complete_switch [%s]\n", "", switch_var_name);
|
|
lparen_or_semi = "(";
|
|
for (i = 0; i < next_bool_opt; i++) {
|
|
printf("%-4s%s\n", "", lparen_or_semi);
|
|
lparen_or_semi = ";";
|
|
printf("%-8s%s = oo_%s(Bool),\n", "", switch_var_name, bool_field_names[i]);
|
|
if (bool_field_names[i] == "opt_delay_slot") {
|
|
printf("%-8supdate_opt_tuple_bool_%s(OptionTable, Bool, !%s)\n",
|
|
"", bool_field_names[i], record_var_name);
|
|
} else {
|
|
printf("%-8supdate_opt_tuple_bool_%s(Bool, !%s)\n",
|
|
"", bool_field_names[i], record_var_name);
|
|
}
|
|
}
|
|
for (i = 0; i < next_int_opt; i++) {
|
|
printf("%-4s%s\n", "", lparen_or_semi);
|
|
lparen_or_semi = ";";
|
|
printf("%-8s%s = oo_%s(N),\n",
|
|
"", switch_var_name, int_field_names[i]);
|
|
printf("%-8supdate_opt_tuple_int_%s(FromOptLevel, N, !%s)\n",
|
|
"", int_field_names[i], record_var_name);
|
|
}
|
|
for (i = 0; i < next_string_opt; i++) {
|
|
printf("%-4s%s\n", "", lparen_or_semi);
|
|
lparen_or_semi = ";";
|
|
printf("%-8s%s = oo_%s(Str),\n",
|
|
"", switch_var_name, string_field_names[i]);
|
|
printf("%-8s!%s ^ ot_%s := Str\n",
|
|
"", record_var_name, string_field_names[i]);
|
|
}
|
|
printf("%-4s%s\n", "", lparen_or_semi);
|
|
lparen_or_semi = ";";
|
|
printf("%-8s%s = oo_opt_level(OptLevel),\n", "", switch_var_name);
|
|
printf("%-8sset_opts_upto_level(OptionTable, 0, OptLevel, !%s, !.MaybeSeenOptLevel, _),\n",
|
|
"", record_var_name);
|
|
printf("%-8s!:MaybeSeenOptLevel = seen_opt_level\n", "");
|
|
|
|
printf("%-4s%s\n", "", lparen_or_semi);
|
|
printf("%-8s%s = oo_opt_for_space,\n", "", switch_var_name);
|
|
printf("%-8sset_opts_for_space(!%s)\n", "", record_var_name);
|
|
printf("%-4s).\n\n", "");
|
|
|
|
for (i = 0; i < next_bool_opt; i++) {
|
|
if (bool_field_names[i] == "opt_delay_slot") {
|
|
printf(":- pred update_opt_tuple_bool_%s(",
|
|
bool_field_names[i]);
|
|
printf("option_table::in, bool::in,\n");
|
|
printf("%-4s%s::in, %s::out) is det.\n\n",
|
|
"", record_type_name, record_type_name);
|
|
printf("update_opt_tuple_bool_%s(OptionTable, Bool, !%s) :-\n",
|
|
bool_field_names[i], record_var_name);
|
|
} else {
|
|
printf(":- pred update_opt_tuple_bool_%s(",
|
|
bool_field_names[i]);
|
|
printf("bool::in,\n");
|
|
printf("%-4s%s::in, %s::out) is det.\n\n",
|
|
"", record_type_name, record_type_name);
|
|
printf("update_opt_tuple_bool_%s(Bool, !%s) :-\n",
|
|
bool_field_names[i], record_var_name);
|
|
}
|
|
|
|
printf("%-4sOldValue = !.%s ^ ot_%s,\n",
|
|
"", record_var_name, bool_field_names[i]);
|
|
printf("%-4s( if\n", "");
|
|
if (bool_field_names[i] == "opt_delay_slot") {
|
|
printf("%-8sgetopt.lookup_bool_option(OptionTable, have_delay_slot, yes),\n",
|
|
"");
|
|
}
|
|
printf("%-8sBool = yes\n", "");
|
|
printf("%-4sthen\n", "");
|
|
printf("%-8s(\n", "");
|
|
printf("%-12sOldValue = do_not_%s,\n", "", bool_field_names[i]);
|
|
printf("%-12s!%s ^ ot_%s := %s\n",
|
|
"", record_var_name, bool_field_names[i], bool_field_names[i]);
|
|
printf("%-8s;\n", "");
|
|
printf("%-12sOldValue = %s\n", "", bool_field_names[i]);
|
|
printf("%-8s)\n", "");
|
|
printf("%-4selse\n", "");
|
|
printf("%-8s(\n", "");
|
|
printf("%-12sOldValue = do_not_%s\n", "", bool_field_names[i]);
|
|
printf("%-8s;\n", "");
|
|
printf("%-12sOldValue = %s,\n", "", bool_field_names[i]);
|
|
printf("%-12s!%s ^ ot_%s := do_not_%s\n",
|
|
"", record_var_name, bool_field_names[i], bool_field_names[i]);
|
|
printf("%-8s)\n", "");
|
|
printf("%-4s).\n\n", "");
|
|
}
|
|
|
|
for (i = 0; i < next_int_opt; i++) {
|
|
printf(":- pred update_opt_tuple_int_%s(\n",
|
|
int_field_names[i]);
|
|
printf("%-4smaybe_from_opt_level::in, int::in,\n", "");
|
|
printf("%-4s%s::in, %s::out) is det.\n\n",
|
|
"", record_type_name, record_type_name);
|
|
printf("update_opt_tuple_int_%s(FromOptLevel, N, !%s) :-\n",
|
|
int_field_names[i], record_var_name);
|
|
|
|
printf("%-4s(\n", "");
|
|
printf("%-8sFromOptLevel = not_from_opt_level,\n", "");
|
|
printf("%-8s!%s ^ ot_%s := N\n",
|
|
"", record_var_name, int_field_names[i]);
|
|
printf("%-4s;\n", "");
|
|
printf("%-8sFromOptLevel = from_opt_level,\n", "");
|
|
printf("%-8sOldN = !.%s ^ ot_%s,\n",
|
|
"", record_var_name, int_field_names[i]);
|
|
printf("%-8s!%s ^ ot_%s := int.max(OldN, N)\n",
|
|
"", record_var_name, int_field_names[i]);
|
|
printf("%-4s).\n\n", "");
|
|
}
|
|
|
|
# The code from here to the end of the script generates code
|
|
# that should be included in the special_handler predicate in options.m.
|
|
|
|
handler_file = "handler_file";
|
|
arm_start = "(";
|
|
for (i = 0; i < next_bool_opt; i++) {
|
|
printf("%-8s%s\n", "", arm_start) > handler_file;
|
|
arm_start = ";";
|
|
printf("%-12sOption = optopt_%s,\n",
|
|
"", bool_opts[i]) > handler_file;
|
|
printf("%-12sSpecialData = bool(Bool),\n", "") > handler_file;
|
|
printf("%-12sOptOption = oo_%s(Bool)\n",
|
|
"", bool_field_names[i]) > handler_file;
|
|
}
|
|
for (i = 0; i < next_int_opt; i++) {
|
|
printf("%-8s;\n", "") > handler_file;
|
|
printf("%-12sOption = optopt_%s,\n",
|
|
"", int_opts[i]) > handler_file;
|
|
printf("%-12sSpecialData = int(N),\n", "") > handler_file;
|
|
printf("%-12sOptOption = oo_%s(N)\n",
|
|
"", int_field_names[i]) > handler_file;
|
|
}
|
|
for (i = 0; i < next_string_opt; i++) {
|
|
printf("%-8s;\n", "") > handler_file;
|
|
printf("%-12sOption = optopt_%s,\n",
|
|
"", string_opts[i]) > handler_file;
|
|
printf("%-12sSpecialData = string(Str),\n", "") > handler_file;
|
|
printf("%-12sOptOption = oo_%s(Str)\n",
|
|
"", string_field_names[i]) > handler_file;
|
|
}
|
|
|
|
printf("%-8s;\n", "") > handler_file;
|
|
printf("%-12sOption = opt_level,\n", "") > handler_file;
|
|
printf("%-12sSpecialData = int(N),\n", "") > handler_file;
|
|
printf("%-12sOptOption = oo_opt_level(N)\n", "") > handler_file;
|
|
|
|
printf("%-8s;\n", "") > handler_file;
|
|
printf("%-12sOption = opt_space,\n", "") > handler_file;
|
|
printf("%-12sSpecialData = none,\n", "") > handler_file;
|
|
printf("%-12sOptOption = oo_opt_for_space\n", "") > handler_file;
|
|
|
|
printf("%-8s),\n", "") > handler_file;
|
|
}
|