mirror of
https://github.com/Mercury-Language/mercury.git
synced 2025-12-14 21:35:49 +00:00
Estimated hours taken: 24

WARNING: this change affects binary compatibility for debuggable code; the debuggable modules of the program and the runtime linked into the executable must either all come from before this change, or they must all come from after this change. However, this change does *not* affect binary compatibility for non-debuggable executables.

Reduce the number of arguments of MR_trace() to one. Two of the arguments, the port and the goal path, move into the label layout structure, as 16-bit numbers; the port as a simple enumeration type, and the goal path as an index into the module-wide string table. (The latter will eventually allow the debugger to support the placement of breakpoints on labels with specific goal paths.) The third argument, the number of the highest-numbered rN register in use at the label, has been moved into the proc layout structure. In theory, this will require more register saves and restores, since the number in the proc layout is conservative (it is the max of the numbers that would be required at the individual labels). However, this is not important, for two reasons. First, we always save and restore all the rM registers that appear in the mrM array before the last special-purpose register, and in most cases this dictates how many registers we save/restore. Second, we save/restore registers only when the debugger starts interaction, so save/restore is a time-critical activity only for the external debugger.

This change reduces the execution time of debuggable executables by about 4-5% when executing outside mdb and 3-4% when executing under mdb. It also reduces executable sizes, but only by about 0.7% on x86.

This change eliminates the --trace-just-in-case compiler option, since we now have the best of both --trace-just-in-case and --no-trace-just-in-case. The drawback of this scheme is slightly increased executable size with the accurate garbage collector, but that seems a small enough price to pay.

compiler/code_gen.m:
compiler/code_info.m:
	Record the number of the highest numbered rN register live at a trace label.

compiler/continuation_info.m:
	Record the number of the highest numbered rN register live at a trace label, and the port and goal path associated with the labels of trace events.

compiler/stack_layout.m:
	Put the number of the highest numbered rN register live at a trace label into proc layouts, and the port and goal path into label layouts.

	Since we are breaking binary compatibility with old debuggable modules anyway, compress the procedure id parts of proc layouts by using only 16 bits to store the procedure's arity and mode number, instead of 32 or 64.

compiler/trace.m:
	Update the handling of ports, goal paths and max live register numbers, so that instead of being passed as MR_trace arguments, they are recorded in data structures.

	Generate separate labels and layouts for the fail and redo events. Although they still have the same layout information, they now record different ports.

compiler/llds.m:
	Since trace.m now generates a label layout structure for the redo event, we must include redo events in the llds goal path type.

compiler/hlds_goal.m:
	Since the code for handling the port type for nondet pragma events has moved from the nondet-pragma-specific to the generic part of trace.m, we must now include their event types in the hlds goal path type.

compiler/llds_out.m:
	Add a predicate for converting ports into numbers, now that we must store ports in static data. Using their symbolic names would be better, but that would require complications in the llds type system, which would be inadvisable just before the release.

compiler/options.m:
compiler/handle_options.m:
doc/user_guide.texi:
	Eliminate --trace-just-in-case.

compiler/llds.m:
compiler/llds_common.m:
compiler/llds_out.m:
	Eliminate the data structure needed by --trace-just-in-case.

compiler/optimize.m:
	Trivial update to conform to data structure changes.

library/exception.m:
	Update the call to MR_trace.

runtime/mercury_stack_layout.h:
	Update the C structure declarations for the layout structures as discussed above.

runtime/mercury_init.h:
	Update the declarations of MR_trace_real and MR_trace_fake to use only one argument.

runtime/mercury_wrapper.[ch]:
	Update the declaration of MR_trace_func to use only one argument.

runtime/mercury_trace_base.[ch]:
	Update the declarations of MR_trace, MR_trace_real and MR_trace_fake to use only one argument.

	Delete MR_trace_struct(); since we deleted --trace-just-in-case, there will not be calls to it anymore.

	Since we are breaking binary compatibility anyway, move the exception port to be with the other interface ports. This should speed up a frequently executed test in the debugger.

	Update the handling of redo events.

trace/mercury_trace.h:
	Simplify and speed up the macro that tests a port for being an interface port, now that exceptions are grouped with other interface events.

trace/mercury_trace.c:
	Update the definition of MR_trace_real to use only one argument. The port is pulled out of the label layout structure only when needed to perform the termination tests for the current debugger command, and the goal path and the max live register number are looked up only when the termination test succeeds.
370 lines
10 KiB
Mathematica
%-----------------------------------------------------------------------------%
% Copyright (C) 1996-1999 The University of Melbourne.
% This file may only be copied under the terms of the GNU General
% Public License - see the file COPYING in the Mercury distribution.
%-----------------------------------------------------------------------------%

% optimize.m - LLDS to LLDS optimizations.

% Main author: zs.

%-----------------------------------------------------------------------------%

:- module optimize.

:- interface.

:- import_module llds.
:- import_module io, list.

:- pred optimize_main(list(c_procedure)::in, global_data::in,
	list(c_procedure)::out, io__state::di, io__state::uo) is det.

:- pred optimize__proc(c_procedure::in, global_data::in,
	c_procedure::out, io__state::di, io__state::uo) is det.

%-----------------------------------------------------------------------------%

:- implementation.

:- import_module jumpopt, labelopt, dupelim, peephole.
:- import_module frameopt, delay_slot, value_number, options.
:- import_module globals, passes_aux, opt_util, opt_debug, vn_debug.
:- import_module continuation_info.

:- import_module bool, int, map, bimap, set, std_util.

optimize_main([], _, []) --> [].
optimize_main([Proc0 | Procs0], GlobalData, [Proc | Procs]) -->
	optimize__proc(Proc0, GlobalData, Proc), !,
	optimize_main(Procs0, GlobalData, Procs).

optimize__proc(CProc0, GlobalData, CProc) -->
	{ CProc0 = c_procedure(Name, Arity, PredProcId, Instrs0) },
	globals__io_lookup_bool_option(debug_opt, DebugOpt),
	opt_debug__msg(DebugOpt, "before optimization"),
	opt_debug__dump_instrs(DebugOpt, Instrs0),
	globals__io_lookup_int_option(optimize_repeat, AllRepeat),
	globals__io_lookup_int_option(optimize_vnrepeat, VnRepeat),
	globals__io_lookup_bool_option(optimize_value_number, ValueNumber),
	{
		global_data_maybe_get_proc_layout(GlobalData, PredProcId,
			ProcLayout)
	->
		ProcLayout = proc_layout_info(_, _, _, _, _, _, _, _,
			LabelMap),
		map__sorted_keys(LabelMap, LayoutLabels),
		set__sorted_list_to_set(LayoutLabels, LayoutLabelSet)
	;
		set__init(LayoutLabelSet)
	},
	( { ValueNumber = yes } ->
		{ NovnRepeat is AllRepeat - VnRepeat },
		optimize__repeat(NovnRepeat, no, LayoutLabelSet,
			Instrs0, Instrs1),
		optimize__middle(Instrs1, no, LayoutLabelSet, Instrs2),
		optimize__repeat(VnRepeat, yes, LayoutLabelSet,
			Instrs2, Instrs3)
	;
		optimize__repeat(AllRepeat, no, LayoutLabelSet,
			Instrs0, Instrs1),
		optimize__middle(Instrs1, yes, LayoutLabelSet, Instrs3)
	),
	optimize__last(Instrs3, LayoutLabelSet, Instrs),
	{ CProc = c_procedure(Name, Arity, PredProcId, Instrs) }.

%-----------------------------------------------------------------------------%

:- pred optimize__repeat(int::in, bool::in, set(label)::in,
	list(instruction)::in, list(instruction)::out,
	io__state::di, io__state::uo) is det.

optimize__repeat(Iter0, DoVn, LayoutLabelSet, Instrs0, Instrs) -->
	(
		{ Iter0 > 0 }
	->
		{ Iter1 is Iter0 - 1 },
		( { Iter1 = 0 } ->
			{ Final = yes }
		;
			{ Final = no }
		),
		optimize__repeated(Instrs0, DoVn, Final, LayoutLabelSet,
			Instrs1, Mod),
		( { Mod = yes } ->
			optimize__repeat(Iter1, DoVn, LayoutLabelSet,
				Instrs1, Instrs)
		;
			{ Instrs = Instrs1 }
		)
	;
		{ Instrs = Instrs0 }
	).

	% We short-circuit jump sequences before normal peepholing
	% to create more opportunities for use of the tailcall macro.

:- pred optimize__repeated(list(instruction)::in, bool::in, bool::in,
	set(label)::in, list(instruction)::out, bool::out,
	io__state::di, io__state::uo) is det.

optimize__repeated(Instrs0, DoVn, Final, LayoutLabelSet, Instrs, Mod) -->
	globals__io_lookup_bool_option(very_verbose, VeryVerbose),
	globals__io_lookup_bool_option(debug_opt, DebugOpt),
	{ opt_util__find_first_label(Instrs0, Label) },
	{ opt_util__format_label(Label, LabelStr) },

	globals__io_lookup_bool_option(optimize_value_number, ValueNumber),
	( { ValueNumber = yes, DoVn = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing value number for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		value_number_main(Instrs0, LayoutLabelSet, Instrs1),
		( { Instrs1 = Instrs0 } ->
			[]
		;
			opt_debug__msg(DebugOpt, "after value numbering"),
			opt_debug__dump_instrs(DebugOpt, Instrs1)
		)
	;
		{ Instrs1 = Instrs0 }
	),
	globals__io_lookup_bool_option(optimize_jumps, Jumpopt),
	globals__io_lookup_bool_option(optimize_fulljumps, FullJumpopt),
	globals__io_lookup_bool_option(checked_nondet_tailcalls,
		CheckedNondetTailCalls),
	globals__io_get_trace_level(TraceLevel),
	( { Jumpopt = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing jumps for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ jumpopt_main(Instrs1, LayoutLabelSet, TraceLevel,
			FullJumpopt, Final, CheckedNondetTailCalls,
			Instrs2, Mod1) },
		( { Mod1 = yes } ->
			opt_debug__msg(DebugOpt, "after jump optimization"),
			opt_debug__dump_instrs(DebugOpt, Instrs2)
		;
			[]
		)
	;
		{ Instrs2 = Instrs1 },
		{ Mod1 = no }
	),
	globals__io_lookup_bool_option(optimize_peep, Peephole),
	( { Peephole = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing locally for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		globals__io_get_gc_method(GC_Method),
		{ peephole__optimize(GC_Method, Instrs2, Instrs3, Mod2) },
		( { Mod2 = yes } ->
			opt_debug__msg(DebugOpt, "after peepholing"),
			opt_debug__dump_instrs(DebugOpt, Instrs3)
		;
			[]
		)
	;
		{ Instrs3 = Instrs2 },
		{ Mod2 = no }
	),
	globals__io_lookup_bool_option(optimize_labels, LabelElim),
	( { LabelElim = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing labels for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ labelopt_main(Instrs3, Final, LayoutLabelSet,
			Instrs4, Mod3) },
		( { Mod3 = yes } ->
			opt_debug__msg(DebugOpt, "after label optimization"),
			opt_debug__dump_instrs(DebugOpt, Instrs4)
		;
			[]
		)
	;
		{ Instrs4 = Instrs3 },
		{ Mod3 = no }
	),
	globals__io_lookup_bool_option(optimize_dups, DupElim),
	( { DupElim = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing duplicates for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ dupelim_main(Instrs4, Instrs) },
		( { Instrs = Instrs4 } ->
			[]
		;
			opt_debug__msg(DebugOpt, "after duplicate elimination"),
			opt_debug__dump_instrs(DebugOpt, Instrs)
		)
	;
		{ Instrs = Instrs4 }
	),
	{ Mod1 = no, Mod2 = no, Mod3 = no, Instrs = Instrs0 ->
		Mod = no
	;
		Mod = yes
	},
	globals__io_lookup_bool_option(statistics, Statistics),
	maybe_report_stats(Statistics).

:- pred optimize__middle(list(instruction)::in, bool::in, set(label)::in,
	list(instruction)::out, io__state::di, io__state::uo) is det.

optimize__middle(Instrs0, Final, LayoutLabelSet, Instrs) -->
	globals__io_lookup_bool_option(very_verbose, VeryVerbose),
	globals__io_lookup_bool_option(debug_opt, DebugOpt),
	{ opt_util__find_first_label(Instrs0, Label) },
	{ opt_util__format_label(Label, LabelStr) },

	globals__io_lookup_bool_option(optimize_frames, FrameOpt),
	( { FrameOpt = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing frames for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ frameopt_main(Instrs0, Instrs1, Mod1, Jumps) },
		( { Mod1 = yes } ->
			opt_debug__msg(DebugOpt, "after frame optimization"),
			opt_debug__dump_instrs(DebugOpt, Instrs1)
		;
			[]
		),
		globals__io_lookup_bool_option(optimize_fulljumps, FullJumpopt),
		globals__io_lookup_bool_option(checked_nondet_tailcalls,
			CheckedNondetTailCalls),
		globals__io_get_trace_level(TraceLevel),
		( { Jumps = yes, FullJumpopt = yes } ->
			( { VeryVerbose = yes } ->
				io__write_string("% Optimizing jumps for "),
				io__write_string(LabelStr),
				io__write_string("\n")
			;
				[]
			),
			{ jumpopt_main(Instrs1, LayoutLabelSet, TraceLevel,
				FullJumpopt, Final, CheckedNondetTailCalls,
				Instrs2, Mod2) },
			( { Mod2 = yes } ->
				opt_debug__msg(DebugOpt, "after jump optimization"),
				opt_debug__dump_instrs(DebugOpt, Instrs2)
			;
				[]
			)
		;
			{ Instrs2 = Instrs1 }
		),
		( { Mod1 = yes } ->
			( { VeryVerbose = yes } ->
				io__write_string("% Optimizing labels for "),
				io__write_string(LabelStr),
				io__write_string("\n")
			;
				[]
			),
			{ labelopt_main(Instrs2, Final, LayoutLabelSet,
				Instrs, Mod3) },
			( { Mod3 = yes } ->
				opt_debug__msg(DebugOpt, "after label optimization"),
				opt_debug__dump_instrs(DebugOpt, Instrs)
			;
				[]
			)
		;
			{ Instrs = Instrs2 }
		)
	;
		{ Instrs = Instrs0 }
	).

:- pred optimize__last(list(instruction)::in, set(label)::in,
	list(instruction)::out, io__state::di, io__state::uo) is det.

optimize__last(Instrs0, LayoutLabelSet, Instrs) -->
	globals__io_lookup_bool_option(very_verbose, VeryVerbose),
	globals__io_lookup_bool_option(debug_opt, DebugOpt),
	{ opt_util__find_first_label(Instrs0, Label) },
	{ opt_util__format_label(Label, LabelStr) },

	globals__io_lookup_bool_option(optimize_delay_slot, DelaySlot),
	globals__io_lookup_bool_option(optimize_value_number, ValueNumber),
	( { DelaySlot = yes ; ValueNumber = yes } ->
		% We must get rid of any extra labels added by other passes,
		% since they can confuse both post_value_number and delay_slot.
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing labels for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ labelopt_main(Instrs0, no, LayoutLabelSet, Instrs1, Mod1) },
		( { Mod1 = yes } ->
			opt_debug__msg(DebugOpt, "after label optimization"),
			opt_debug__dump_instrs(DebugOpt, Instrs1)
		;
			[]
		)
	;
		{ Instrs1 = Instrs0 }
	),
	( { DelaySlot = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing delay slot for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ fill_branch_delay_slot(Instrs1, Instrs2) },
		% Dump the instructions only if this pass changed them.
		( { Instrs2 = Instrs1 } ->
			[]
		;
			opt_debug__msg(DebugOpt, "after delay slot filling"),
			opt_debug__dump_instrs(DebugOpt, Instrs2)
		)
	;
		{ Instrs2 = Instrs1 }
	),
	( { ValueNumber = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing post value number for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ value_number__post_main(Instrs2, Instrs) },
		( { Instrs = Instrs2 } ->
			[]
		;
			opt_debug__msg(DebugOpt, "after post value number"),
			opt_debug__dump_instrs(DebugOpt, Instrs)
		)
	;
		% Keep the result of delay slot filling, if any.
		{ Instrs = Instrs2 }
	).