mirror of
https://github.com/Mercury-Language/mercury.git
synced 2025-12-18 07:15:19 +00:00
Estimated hours taken: 32
Branches: main
Reduce the overhead of all forms of tabling by eliminating in many cases
the overhead of transferring data across the C/Mercury boundary. These
involve lots of control transfers as well as assignments to and from
Mercury abstract machine registers, which are not real machine registers
on x86 machines. Benchmarking in Uppsala revealed this overhead to be
a real problem.
The way we do that is by changing the tabling transformation so that instead
of generating sequences of calls to predicates from library/table_builtin.m,
we generate sequences of calls to C macros from runtime/mercury_tabling_pred.h,
and emit the resulting code string as the body of a foreign_proc goal.
(The old transformation is still available via a new option,
--no-tabling-via-extra-args.)
Since the number of inputs and outputs of the resulting C code sequences
are not always fixed (they can depend on the number of input or output
arguments of predicate being transformed), implementing this required
adding to foreign_procs a new field that allows the specification of extra
arguments to be passed to and from the given foreign code fragment. For now,
this mechanism is implemented only by the C backends, since it is needed
only by the C backends. (We don't support yet tabling on other backends.)
To simplify the new implementation of the field on foreign_procs, consolidate
three existing fields into one. Each of these fields was a list with one
element per argument, so turning them into a single list with a combined record
per argument should also improve reliability, since it reduces the likelyhood
of updates leaving the data structure inconsistent.
The goal paths of components of a tabled predicate depend on whether
-no-tabling-via-extra-args was specified. To enable the expected outputs
of the debugger test cases testing tabling, we add a new mdb command,
goal_paths, that controls whether goal paths are printed by the debugger
at events, and turn off the printing of events in the relevant test cases.
Also, prepare for a future change to optimize the trie structure for
user-defined types by handling type_infos (and once we support them,
typeclass_infos) specially.
compiler/table_gen.m:
Change the tabling transformation along the lines described above.
To allow us to factor out as much of the new code as possible,
we change the meaning of the call_table_tip variable for minimal
model subgoals: instead of the trie node at the end of the answer
table, it is not now the subgoal reachable from it. This change
has no effect as yet, because we use call_table_tip variables
only to perform resets across retries in the debugger, and we
don't do retries across calls to minimal model tabled predicates.
Put predicates into logical groups.
library/table_builtin.m:
runtime/mercury_tabling_preds.h:
When the new transformations in table_gen.m generate foreign_procs
with variable numbers of arguments, the interfaces of those
foreign_procs often do not match the interfaces of the existing
library predicates at their core: they frequently have one more
or one fewer argument. To prevent any possible confusion, in such
cases we add a new variant of the predicate. These predicates
have the suffix _shortcut in their name. Their implementations
are dummy macros that do nothing; they serve merely as placeholders
before or after which the macros that actually do the work are
inserted.
Move the definitions of the lookup, save and restore predicates
into mercury_tabling_preds.h. Make the naming scheme of their
arguments more regular.
runtime/mercury_minimal_model.c:
runtime/mercury_tabling_preds.h:
Move the definition of a predicate from mercury_minimal_model.c
to mercury_tabling_preds.h, since the compiler now needs to be
able to generate an inlined version of it.
compiler/hlds_goal.m:
Replace the three existing fields describing the arguments of
foreign_procs with one, and add a new field describing the extra
arguments that may be inserted by table_gen.m.
Add utility predicates for processing the arguments of foreign_procs.
Change the order of some existing groups of declarations make it
more logical.
compiler/hlds_pred.m:
runtime/mercury_stack_layout.h:
Extend the data structures recording the structure of tabling tries
to allow the representation of trie steps for type_infos and
typeclass_infos.
runtime/mercury_tabling_macros.c:
Fix a bug regarding the tabling of typeclass_infos, which is now
required for a clean compile.
compiler/pragma_c_gen.m:
compiler/ml_code_gen.m:
Modify the generation of code for foreign_procs to handle extra
arguments, and to conform to the new data structures for foreign_proc
arguments.
compiler/llds.m:
The tabling transformations can now generate significantly sized
foreign_procs bodies, which the LLDS code generator translates to
pragma_c instructions. Duplicating these by jump optimization
may lose more by worsening locality than it gains in avoiding jumps,
so we add an extra field to pragma_c instructions that tells jumpopt
not to duplicate code sequences containing such pragma_cs.
compiler/jumpopt.m:
Respect the new flag on pragma_cs.
compiler/goal_util.m:
Add a predicate to create foreign_procs with specified contents,
modelled on the existing predicate to create calls.
Change the order of the arguments of that existing predicate
to make it more logical.
compiler/polymorphism.m:
Conform to the new definition of foreign_procs. Try to simplify
the mechanism for generating the type_info and typeclass_info
arguments of foreign_proc goals, but it is not clear that this
code is even ever executed.
compiler/aditi_builtin_ops.m:
compiler/assertion.m:
compiler/bytecode_gen.m:
compiler/clause_to_proc.m:
compiler/code_gen.m:
compiler/code_info.m:
compiler/code_util.m:
compiler/constraint.m:
compiler/deep_profiling.m:
compiler/deforest.m:
compiler/delay_construct.m:
compiler/dependency_graph.m:
compiler/det_analysis.m:
compiler/det_report.m:
compiler/dnf.m:
compiler/dupelim.m:
compiler/equiv_type_hlds.m:
compiler/exprn_aux.m:
compiler/follow_code.m:
compiler/follow_vars.m:
compiler/frameopt.m:
compiler/goal_form.m:
compiler/goal_path.m:
compiler/higher_order.m:
compiler/higher_order.m:
compiler/hlds_module.m:
compiler/hlds_out.m:
compiler/inlining.m:
compiler/ite_gen.m:
compiler/layout_out.m:
compiler/livemap.m:
compiler/liveness.m:
compiler/llds_out.m:
compiler/loop_inv.m:
compiler/magic.m:
compiler/make_hlds.m:
compiler/mark_static_terms.m:
compiler/middle_rec.m:
compiler/modes.m:
compiler/modules.m:
compiler/opt_debug.m:
compiler/pd_cost.m:
compiler/prog_rep.m:
compiler/purity.m:
compiler/quantification.m:
compiler/reassign.m:
compiler/rl_exprn.m:
compiler/saved_vars.m:
compiler/simplify.m:
compiler/size_prof.m:
compiler/store_alloc.m:
compiler/stratify.m:
compiler/switch_detection.m:
compiler/term_pass1.m:
compiler/term_traversal.m:
compiler/termination.m:
compiler/trace.m:
compiler/typecheck.m:
compiler/unify_proc.m:
compiler/unique_modes.m:
compiler/unneeed_code.m:
compiler/unused_args.m:
compiler/use_local_vars.m:
Conform to the new definition of foreign_procs, pragma_cs and/or
table trie steps, or to changed argument orders.
compiler/add_heap_ops.m:
compiler/add_trail_ops.m:
compiler/cse_detection.m:
compiler/dead_proc_elim.m:
compiler/equiv_type.m:
compiler/intermod.m:
compiler/lambda.m:
compiler/lco.m:
compiler/module_util.m:
compiler/opt_util.m:
compiler/stack_opt.m:
compiler/trans_opt.m:
Conform to the new definition of foreign_procs.
Bring these modules up to date with our current code style guidelines,
using predmode declarations, state variable syntax and unification
expressions as appropriate.
compiler/mercury_compile.m:
Conform to the changed argument order of a predicate in trans_opt.m.
compiler/options.m:
Add the --no-tabling-via-extra-args option, but leave the
documentation commented out since the option is for developers only.
doc/user_guide.texi:
Document --no-tabling-via-extra-args option, though leave the
documentation commented out since the option is for developers only.
doc/user_guide.texi:
doc/mdb_categories:
Document the new goal_paths mdb command.
trace/mercury_trace_internals.c:
Implement the new goal_paths mdb command.
tests/debugger/completion.exp:
Conform to the presence of the goal_paths mdb command.
tests/debugger/mdb_command_test.inp:
Test the existence of documentation for the goal_paths mdb command.
tests/debugger/print_table.{inp,exp*}:
tests/debugger/retry.{inp,exp*}:
Use the goal_paths command to avoid having the expected output
depend on the presence or absence of --tabling-via-extra-args.
tests/tabling/table_foreign_output.{m,exp}:
Add a new test case to test the save/restore of arguments of foreign
types.
tests/tabling/Mmakefile:
Enable the new test case.
tests/tabling/test_tabling:
Make this script more robust.
Add an option for testing only the standard model forms of tabling.
279 lines
11 KiB
Mathematica
279 lines
11 KiB
Mathematica
%-----------------------------------------------------------------------------%
|
|
% Copyright (C) 2001-2004 The University of Melbourne.
|
|
% This file may only be copied under the terms of the GNU General
|
|
% Public License - see the file COPYING in the Mercury distribution.
|
|
%-----------------------------------------------------------------------------%
|
|
%
|
|
% File: delay_construct.m
|
|
%
|
|
% Author: zs.
|
|
%
|
|
% This module transforms sequences of goals in procedure bodies.
|
|
% It looks for a unification that constructs a ground term followed by
|
|
% primitive goals, at least one of which can fail, and none of which take
|
|
% the variable representing the cell as their input. Such code sequences
|
|
% cause the cell to be constructed even if the following goal would fail,
|
|
% which is wasteful. This module therefore reorders the sequence, moving the
|
|
% construction unification past all the semidet primitives it can.
|
|
%
|
|
% The reason we don't move the construction past calls or composite goals
|
|
% is that this may require storing the input arguments of the construction on
|
|
% the stack, which may cause a slowdown bigger than the speedup available from
|
|
% not having to construct the cell on some execution paths.
|
|
|
|
%-----------------------------------------------------------------------------%
|
|
|
|
:- module transform_hlds__delay_construct.
|
|
|
|
:- interface.
|
|
|
|
:- import_module hlds__hlds_module.
|
|
:- import_module hlds__hlds_pred.
|
|
:- import_module io.
|
|
|
|
:- pred delay_construct_proc(pred_id::in, proc_id::in, module_info::in,
|
|
proc_info::in, proc_info::out, io::di, io::uo) is det.
|
|
|
|
%-----------------------------------------------------------------------------%
|
|
|
|
:- implementation.
|
|
|
|
:- import_module check_hlds__inst_match.
|
|
:- import_module hlds__hlds_data.
|
|
:- import_module hlds__hlds_goal.
|
|
:- import_module hlds__instmap.
|
|
:- import_module hlds__passes_aux.
|
|
:- import_module libs__globals.
|
|
:- import_module parse_tree__prog_data.
|
|
|
|
:- import_module bool, list, set, std_util, require.
|
|
|
|
%-----------------------------------------------------------------------------%
|
|
|
|
delay_construct_proc(PredId, ProcId, ModuleInfo, !ProcInfo, !IO) :-
|
|
write_proc_progress_message("% Delaying construction unifications in ",
|
|
PredId, ProcId, ModuleInfo, !IO),
|
|
globals__io_get_globals(Globals, !IO),
|
|
module_info_pred_info(ModuleInfo, PredId, PredInfo),
|
|
delay_construct_proc_no_io(PredInfo, ModuleInfo, Globals,
|
|
!ProcInfo).
|
|
|
|
:- pred delay_construct_proc_no_io(pred_info::in, module_info::in, globals::in,
|
|
proc_info::in, proc_info::out) is det.
|
|
|
|
delay_construct_proc_no_io(PredInfo, ModuleInfo, Globals, !ProcInfo) :-
|
|
body_should_use_typeinfo_liveness(PredInfo, Globals,
|
|
BodyTypeinfoLiveness),
|
|
proc_info_vartypes(!.ProcInfo, VarTypes),
|
|
proc_info_typeinfo_varmap(!.ProcInfo, TypeInfoVarMap),
|
|
proc_info_get_initial_instmap(!.ProcInfo, ModuleInfo, InstMap0),
|
|
DelayInfo = delay_construct_info(ModuleInfo, BodyTypeinfoLiveness,
|
|
VarTypes, TypeInfoVarMap),
|
|
proc_info_goal(!.ProcInfo, Goal0),
|
|
delay_construct_in_goal(Goal0, InstMap0, DelayInfo, Goal),
|
|
proc_info_set_goal(Goal, !ProcInfo).
|
|
|
|
:- type delay_construct_info
|
|
---> delay_construct_info(
|
|
module_info :: module_info,
|
|
body_typeinfo_liveness :: bool,
|
|
vartypes :: vartypes,
|
|
type_info_varmap :: type_info_varmap
|
|
).
|
|
|
|
%-----------------------------------------------------------------------------%
|
|
|
|
:- pred delay_construct_in_goal(hlds_goal::in, instmap::in,
|
|
delay_construct_info::in, hlds_goal::out) is det.
|
|
|
|
delay_construct_in_goal(GoalExpr0 - GoalInfo0, InstMap0, DelayInfo, Goal) :-
|
|
(
|
|
GoalExpr0 = conj(Goals0),
|
|
goal_info_get_determinism(GoalInfo0, Detism),
|
|
determinism_components(Detism, CanFail, MaxSoln),
|
|
(
|
|
% If the conjunction cannot fail, then its conjuncts
|
|
% cannot fail either, so we have no hope of pushing a
|
|
% construction past a failing goal.
|
|
%
|
|
% If the conjuntion contains goals that can succeed
|
|
% more than once, which is possible if MaxSoln is
|
|
% at_most_many or at_most_many_cc, then moving a
|
|
% construction to the right may increase the number of
|
|
% times the construction is executed. We are therefore
|
|
% careful to make sure delay_construct_in_conj doesn't
|
|
% move constructions across goals that succeed more
|
|
% than once.
|
|
%
|
|
% If the conjunction cannot succeed, i.e. MaxSoln is
|
|
% at_most_zero, there is no point in trying to speed it
|
|
% up.
|
|
|
|
CanFail = can_fail,
|
|
MaxSoln \= at_most_zero
|
|
->
|
|
delay_construct_in_conj(Goals0, InstMap0, DelayInfo,
|
|
set__init, [], Goals1)
|
|
;
|
|
Goals1 = Goals0
|
|
),
|
|
delay_construct_in_goals(Goals1, InstMap0, DelayInfo, Goals),
|
|
Goal = conj(Goals) - GoalInfo0
|
|
;
|
|
GoalExpr0 = par_conj(Goals0),
|
|
delay_construct_in_goals(Goals0, InstMap0, DelayInfo, Goals),
|
|
Goal = par_conj(Goals) - GoalInfo0
|
|
;
|
|
GoalExpr0 = disj(Goals0),
|
|
delay_construct_in_goals(Goals0, InstMap0, DelayInfo, Goals),
|
|
Goal = disj(Goals) - GoalInfo0
|
|
;
|
|
GoalExpr0 = not(NegGoal0),
|
|
delay_construct_in_goal(NegGoal0, InstMap0, DelayInfo, NegGoal),
|
|
Goal = not(NegGoal) - GoalInfo0
|
|
;
|
|
GoalExpr0 = switch(Var, CanFail, Cases0),
|
|
delay_construct_in_cases(Cases0, InstMap0, DelayInfo, Cases),
|
|
Goal = switch(Var, CanFail, Cases) - GoalInfo0
|
|
;
|
|
GoalExpr0 = if_then_else(Vars, Cond0, Then0, Else0),
|
|
Cond0 = _ - CondInfo0,
|
|
goal_info_get_instmap_delta(CondInfo0, CondInstMapDelta),
|
|
instmap__apply_instmap_delta(InstMap0, CondInstMapDelta,
|
|
InstMapThen),
|
|
delay_construct_in_goal(Cond0, InstMap0, DelayInfo, Cond),
|
|
delay_construct_in_goal(Then0, InstMapThen, DelayInfo, Then),
|
|
delay_construct_in_goal(Else0, InstMap0, DelayInfo, Else),
|
|
Goal = if_then_else(Vars, Cond, Then, Else) - GoalInfo0
|
|
;
|
|
GoalExpr0 = some(Var, CanRemove, SubGoal0),
|
|
delay_construct_in_goal(SubGoal0, InstMap0, DelayInfo, SubGoal),
|
|
Goal = some(Var, CanRemove, SubGoal) - GoalInfo0
|
|
;
|
|
GoalExpr0 = generic_call(_, _, _, _),
|
|
Goal = GoalExpr0 - GoalInfo0
|
|
;
|
|
GoalExpr0 = call(_, _, _, _, _, _),
|
|
Goal = GoalExpr0 - GoalInfo0
|
|
;
|
|
GoalExpr0 = unify(_, _, _, _, _),
|
|
Goal = GoalExpr0 - GoalInfo0
|
|
;
|
|
GoalExpr0 = foreign_proc(_, _, _, _, _, _),
|
|
Goal = GoalExpr0 - GoalInfo0
|
|
;
|
|
GoalExpr0 = shorthand(_),
|
|
% these should have been expanded out by now
|
|
error("delay_construct_in_goal: unexpected shorthand")
|
|
).
|
|
|
|
%-----------------------------------------------------------------------------%
|
|
|
|
% We maintain a list of delayed construction unifications that construct ground
|
|
% terms, and the set of variables they define.
|
|
%
|
|
% When we find other construction unifications, we add them to the list.
|
|
% It does not matter if they depend on other delayed construction unifications;
|
|
% when we put them back into the conjunction, we do so in the original order.
|
|
%
|
|
% There are several reasons why we may not be able to delay a construction
|
|
% unification past a conjunct. The conjunct may not be a primitive goal,
|
|
% or it may be impure; in either case, we must insert all the delayed
|
|
% construction unifications before it. The conjunct may also require the value
|
|
% of a variable defined by a construction unification. In such cases, we could
|
|
% drop before that goal only the construction unifications that define the
|
|
% variables needed by the conjunct, either directly or indirectly through
|
|
% the values required by some of those construction unifications. However,
|
|
% separating out this set of delayed constructions from the others would
|
|
% require somewhat complex code, and it is not clear that there would be any
|
|
% significant benefit. We therefore insert *all* the delayed constructions
|
|
% before a goal if the goal requires *any* of the variables bound by the
|
|
% constructions.
|
|
%
|
|
% The instmap we pass around is the one that we construct from the original
|
|
% conjunction order. At each point, it reflects the bindings made by the
|
|
% conjuncts so far *plus* the bindings made by the delayed goals.
|
|
|
|
:- pred delay_construct_in_conj(list(hlds_goal)::in, instmap::in,
|
|
delay_construct_info::in, set(prog_var)::in, list(hlds_goal)::in,
|
|
list(hlds_goal)::out) is det.
|
|
|
|
delay_construct_in_conj([], _, _, _, RevDelayedGoals, DelayedGoals) :-
|
|
list__reverse(RevDelayedGoals, DelayedGoals).
|
|
delay_construct_in_conj([Goal0 | Goals0], InstMap0, DelayInfo,
|
|
ConstructedVars0, RevDelayedGoals0, Goals) :-
|
|
Goal0 = GoalExpr0 - GoalInfo0,
|
|
goal_info_get_instmap_delta(GoalInfo0, InstMapDelta0),
|
|
instmap__apply_instmap_delta(InstMap0, InstMapDelta0, InstMap1),
|
|
(
|
|
GoalExpr0 = unify(_, _, _, Unif, _),
|
|
Unif = construct(Var, _, Args, _, _, _, _),
|
|
Args = [_ | _], % We are constructing a cell, not a constant
|
|
instmap__lookup_var(InstMap0, Var, Inst0),
|
|
inst_is_free(DelayInfo ^ module_info, Inst0),
|
|
instmap__lookup_var(InstMap1, Var, Inst1),
|
|
inst_is_ground(DelayInfo ^ module_info, Inst1)
|
|
->
|
|
set__insert(ConstructedVars0, Var, ConstructedVars1),
|
|
RevDelayedGoals1 = [Goal0 | RevDelayedGoals0],
|
|
delay_construct_in_conj(Goals0, InstMap1, DelayInfo,
|
|
ConstructedVars1, RevDelayedGoals1, Goals)
|
|
;
|
|
Goal0 = GoalExpr0 - GoalInfo0,
|
|
delay_construct_skippable(GoalExpr0, GoalInfo0),
|
|
goal_info_get_nonlocals(GoalInfo0, NonLocals),
|
|
proc_info_maybe_complete_with_typeinfo_vars(NonLocals,
|
|
DelayInfo ^ body_typeinfo_liveness,
|
|
DelayInfo ^ vartypes,
|
|
DelayInfo ^ type_info_varmap, CompletedNonLocals),
|
|
set__intersect(CompletedNonLocals, ConstructedVars0,
|
|
Intersection),
|
|
set__empty(Intersection),
|
|
\+ goal_info_has_feature(GoalInfo0, impure),
|
|
\+ goal_info_has_feature(GoalInfo0, semipure)
|
|
->
|
|
delay_construct_in_conj(Goals0, InstMap1, DelayInfo,
|
|
ConstructedVars0, RevDelayedGoals0, Goals1),
|
|
Goals = [Goal0 | Goals1]
|
|
;
|
|
list__reverse(RevDelayedGoals0, DelayedGoals),
|
|
delay_construct_in_conj(Goals0, InstMap1, DelayInfo,
|
|
set__init, [], Goals1),
|
|
list__append(DelayedGoals, [Goal0 | Goals1], Goals)
|
|
).
|
|
|
|
:- pred delay_construct_skippable(hlds_goal_expr::in, hlds_goal_info::in)
|
|
is semidet.
|
|
|
|
delay_construct_skippable(GoalExpr, GoalInfo) :-
|
|
(
|
|
GoalExpr = unify(_, _, _, _, _)
|
|
;
|
|
GoalExpr = call(_, _, _, inline_builtin, _, _)
|
|
),
|
|
goal_info_get_determinism(GoalInfo, Detism),
|
|
determinism_components(Detism, _CanFail, MaxSoln),
|
|
MaxSoln \= at_most_many.
|
|
|
|
%-----------------------------------------------------------------------------%
|
|
|
|
:- pred delay_construct_in_goals(list(hlds_goal)::in, instmap::in,
|
|
delay_construct_info::in, list(hlds_goal)::out) is det.
|
|
|
|
delay_construct_in_goals([], _, _, []).
|
|
delay_construct_in_goals([Goal0 | Goals0], InstMap0, DelayInfo,
|
|
[Goal | Goals]) :-
|
|
delay_construct_in_goal(Goal0, InstMap0, DelayInfo, Goal),
|
|
delay_construct_in_goals(Goals0, InstMap0, DelayInfo, Goals).
|
|
|
|
:- pred delay_construct_in_cases(list(case)::in, instmap::in,
|
|
delay_construct_info::in, list(case)::out) is det.
|
|
|
|
delay_construct_in_cases([], _, _, []).
|
|
delay_construct_in_cases([case(Cons, Goal0) | Cases0], InstMap0, DelayInfo,
|
|
[case(Cons, Goal) | Cases]) :-
|
|
delay_construct_in_goal(Goal0, InstMap0, DelayInfo, Goal),
|
|
delay_construct_in_cases(Cases0, InstMap0, DelayInfo, Cases).
|
|
|
|
%-----------------------------------------------------------------------------%
|