mercury/compiler/optimize.m
Zoltan Somogyi 0642a10ff8 Reduce the number of arguments of MR_trace() to one.
Estimated hours taken: 24

WARNING: this change affects binary compatibility for debuggable code;
the debuggable modules of the program and the runtime linked into the
executable must either all come from before this change, or they must all
come from after this change. However, this change does *not* affect binary
compatibility for non-debuggable executables.

Reduce the number of arguments of MR_trace() to one. Two of the arguments,
the port and the goal path, move into the label layout structure, as 16-bit
numbers; the port as a simple enumeration type, and the goal path as an
index into the module-wide string table. (The latter will eventually allow the
debugger to support the placement of breakpoints on labels with specific goal
paths.) The third argument, the number of the highest-numbered rN register in
use at the label, has been moved into the proc layout structure. In theory,
this will require more register saves and restores, since the number in the
proc layout is conservative (it is the max of the numbers that would be
required at the individual labels). However, this is not important, for two
reasons. First, we always save and restore all the rM registers that
appear in the mrM array before the last special-purpose register, and in
most cases this dictates how many registers we save/restore. Second, we
save/restore registers only when the debugger starts interaction, so
save/restore is a time critical activity only for the external debugger.
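
As a rough illustration of the layout change described above, the sketch below shows how a trace call that receives only a label layout pointer could recover the port, the goal path and the register count from static data. Every type, field and function name in it (proc_layout, label_layout, label_port and so on) is a hypothetical stand-in, as is the string table representation; none of them are the actual declarations in runtime/mercury_stack_layout.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port numbering; the real runtime defines its own enum. */
typedef enum { PORT_CALL, PORT_EXIT, PORT_REDO, PORT_FAIL } trace_port;

/* Per-procedure data: the conservative register count is stored once. */
typedef struct {
    const char *name;
    int16_t     max_r_num;   /* highest rN in use at any label of the proc */
} proc_layout;

/* Per-label data: port and goal path are kept as 16-bit numbers. */
typedef struct {
    const proc_layout *proc;
    int16_t            port;       /* holds a trace_port value */
    int16_t            goal_path;  /* index into the module string table */
} label_layout;

/* Assumed representation of the module-wide string table. */
static const char *const string_table[] = { "", "c2;", "c3;d1;" };
#define STRING_TABLE_SIZE (sizeof string_table / sizeof string_table[0])

/* The port is read directly from the label layout. */
static trace_port label_port(const label_layout *label)
{
    return (trace_port) label->port;
}

/* The goal path is an index that must be resolved via the string table. */
static const char *label_goal_path(const label_layout *label)
{
    assert(label->goal_path >= 0
        && (size_t) label->goal_path < STRING_TABLE_SIZE);
    return string_table[label->goal_path];
}

/* The register count comes from the enclosing procedure's layout. */
static int label_max_reg(const label_layout *label)
{
    return label->proc->max_r_num;
}
```

With this arrangement a trace function needs only a `const label_layout *` argument; the three helpers stand in for the values that used to be passed explicitly.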

This change reduces the execution time of debuggable executables by about
4-5% when executing outside mdb and 3-4% when executing under mdb. It also
reduces executable sizes, but only by about 0.7% on x86.

This change eliminates the --trace-just-in-case compiler option, since
we now have the best of both --trace-just-in-case and --no-trace-just-in-case.

The drawback of this scheme is slightly increased executable size with the
accurate garbage collector, but that seems a small enough price to pay.

compiler/code_gen.m:
compiler/code_info.m:
	Record the number of the highest numbered rN register live at a trace
	label.

compiler/continuation_info.m:
	Record the number of the highest numbered rN register live at a trace
	label, and the port and goal path associated with the labels of trace
	events.

compiler/stack_layout.m:
	Put the number of the highest numbered rN register live at a trace
	label into proc layouts, and the port and goal path into label layouts.

	Since we are breaking binary compatibility with old debuggable modules
	anyway, compress the procedure id parts of proc layouts by using
	only 16 bits to store the procedure's arity and mode number, instead
	of 32 or 64.

compiler/trace.m:
	Update the handling of ports, goal paths and max live register numbers,
	so that instead of being passed as MR_trace arguments, they are
	recorded in data structures.

	Generate separate labels and layouts for the fail and redo events.
	Although they still have the same layout information, they now record
	different ports.

compiler/llds.m:
	Since trace.m now generates a label layout structure for the redo
	event, we must include redo events in the llds goal path type.

compiler/hlds_goal.m:
	Since the code for handling the port type for nondet pragma events
	has moved from the nondet-pragma-specific to the generic part of
	trace.m, we must now include their event types in the hlds goal path
	type.

compiler/llds_out.m:
	Add a predicate for converting ports into numbers, now that we
	must store ports in static data. Using their symbolic names would
	be better, but that would require complications in the llds type
	system, which would be inadvisable just before the release.

compiler/options.m:
compiler/handle_options.m:
doc/user_guide.texi:
	Eliminate --trace-just-in-case.

compiler/llds.m:
compiler/llds_common.m:
compiler/llds_out.m:
	Eliminate the data structure needed by --trace-just-in-case.

compiler/optimize.m:
	Trivial update to conform to data structure changes.

library/exception.m:
	Update the call to MR_trace.

runtime/mercury_stack_layout.h:
	Update the C structure declarations for the layout structures
	as discussed above.

runtime/mercury_init.h:
	Update the declarations of MR_trace_real and MR_trace_fake
	to use only one argument.

runtime/mercury_wrapper.[ch]:
	Update the declaration of MR_trace_func to use only one argument.

runtime/mercury_trace_base.[ch]:
	Update the declarations of MR_trace, MR_trace_real and MR_trace_fake
	to use only one argument.

	Delete MR_trace_struct(); since we deleted --trace-just-in-case, there
	will not be calls to it anymore.

	Since we are breaking binary compatibility anyway, move the exception
	port to be with the other interface ports. This should speed up a
	frequently executed test in the debugger.

	Update the handling of redo events.

trace/mercury_trace.h:
	Simplify and speed up the macro that tests a port for being an
	interface port, now that exceptions are grouped with other interface
	events.
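
A minimal sketch of why grouping the ports helps, assuming a hypothetical enumeration in which all interface ports come first: the interface test then collapses to a single comparison. The enumerator names, their order and the macro name below are illustrative only and are not taken from trace/mercury_trace.h.

```c
#include <stdbool.h>

/* Hypothetical ordering: interface ports first, internal ports after. */
typedef enum {
    PORT_CALL, PORT_EXIT, PORT_REDO, PORT_FAIL, PORT_EXCEPTION,
    PORT_COND, PORT_THEN, PORT_ELSE, PORT_DISJ, PORT_SWITCH
} trace_port;

/* One comparison suffices because the interface ports are contiguous. */
#define PORT_IS_INTERFACE(port) ((port) <= PORT_EXCEPTION)

/* Function form of the same test, for callers that prefer type checking. */
static bool port_is_interface(trace_port port)
{
    return PORT_IS_INTERFACE(port);
}
```

If the exception port were numbered after the internal ports, the same test would need a second comparison for that one value.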

trace/mercury_trace.c:
	Update the definition of MR_trace_real to use only one argument.
	The port is pulled out of the label layout structure only when
	needed to perform the termination tests for the current debugger
	command, and the goal path and the max live register number are
	looked up only when the termination test succeeds.
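
The lazy lookup described for MR_trace_real can be sketched roughly as follows. This is a simplified model, not the real debugger logic: the command set, the stopping rules and all names (event_label, handle_event, detail_lookups) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

enum { EV_CALL, EV_EXIT, EV_REDO, EV_FAIL };

/* Hypothetical flattened view of the data reachable from a label layout. */
typedef struct {
    int16_t port;
    int16_t goal_path;   /* string table index */
    int16_t max_r_num;   /* kept per procedure in the scheme described */
} event_label;

typedef enum { CMD_CONTINUE, CMD_FINISH, CMD_STEP } debugger_cmd;

typedef struct { int goal_path; int max_r_num; } stop_details;

static stop_details last_stop;       /* filled in only when we stop */
static int detail_lookups = 0;       /* counts how often the slow path runs */

/* Returns true if the debugger should stop at this event. */
static bool handle_event(debugger_cmd cmd, const event_label *label)
{
    bool stop;

    switch (cmd) {
    case CMD_CONTINUE:
        stop = false;                /* the port is never consulted */
        break;
    case CMD_FINISH:
        /* the port is read only because this command's test needs it */
        stop = (label->port == EV_EXIT || label->port == EV_FAIL);
        break;
    default:
        stop = true;
        break;
    }

    if (stop) {
        /* only now fetch the goal path and the register count */
        detail_lookups++;
        last_stop.goal_path = label->goal_path;
        last_stop.max_r_num = label->max_r_num;
    }
    return stop;
}
```

In this model, events that do not end the current command cost at most one field read, which is the effect the entry above describes.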
1999-12-14 04:54:38 +00:00


%-----------------------------------------------------------------------------%
% Copyright (C) 1996-1999 The University of Melbourne.
% This file may only be copied under the terms of the GNU General
% Public License - see the file COPYING in the Mercury distribution.
%-----------------------------------------------------------------------------%
% optimize.m - LLDS to LLDS optimizations.
% Main author: zs.
%-----------------------------------------------------------------------------%

:- module optimize.

:- interface.

:- import_module llds.
:- import_module io, list.

:- pred optimize_main(list(c_procedure)::in, global_data::in,
	list(c_procedure)::out, io__state::di, io__state::uo) is det.

:- pred optimize__proc(c_procedure::in, global_data::in,
	c_procedure::out, io__state::di, io__state::uo) is det.

%-----------------------------------------------------------------------------%

:- implementation.

:- import_module jumpopt, labelopt, dupelim, peephole.
:- import_module frameopt, delay_slot, value_number, options.
:- import_module globals, passes_aux, opt_util, opt_debug, vn_debug.
:- import_module continuation_info.

:- import_module bool, int, map, bimap, set, std_util.

optimize_main([], _, []) --> [].
optimize_main([Proc0 | Procs0], GlobalData, [Proc | Procs]) -->
	optimize__proc(Proc0, GlobalData, Proc), !,
	optimize_main(Procs0, GlobalData, Procs).

optimize__proc(CProc0, GlobalData, CProc) -->
	{ CProc0 = c_procedure(Name, Arity, PredProcId, Instrs0) },
	globals__io_lookup_bool_option(debug_opt, DebugOpt),
	opt_debug__msg(DebugOpt, "before optimization"),
	opt_debug__dump_instrs(DebugOpt, Instrs0),
	globals__io_lookup_int_option(optimize_repeat, AllRepeat),
	globals__io_lookup_int_option(optimize_vnrepeat, VnRepeat),
	globals__io_lookup_bool_option(optimize_value_number, ValueNumber),
	{
		global_data_maybe_get_proc_layout(GlobalData, PredProcId,
			ProcLayout)
	->
		ProcLayout = proc_layout_info(_, _, _, _, _, _, _, _,
			LabelMap),
		map__sorted_keys(LabelMap, LayoutLabels),
		set__sorted_list_to_set(LayoutLabels, LayoutLabelSet)
	;
		set__init(LayoutLabelSet)
	},
	( { ValueNumber = yes } ->
		{ NovnRepeat is AllRepeat - VnRepeat },
		optimize__repeat(NovnRepeat, no, LayoutLabelSet,
			Instrs0, Instrs1),
		optimize__middle(Instrs1, no, LayoutLabelSet, Instrs2),
		optimize__repeat(VnRepeat, yes, LayoutLabelSet,
			Instrs2, Instrs3)
	;
		optimize__repeat(AllRepeat, no, LayoutLabelSet,
			Instrs0, Instrs1),
		optimize__middle(Instrs1, yes, LayoutLabelSet, Instrs3)
	),
	optimize__last(Instrs3, LayoutLabelSet, Instrs),
	{ CProc = c_procedure(Name, Arity, PredProcId, Instrs) }.

%-----------------------------------------------------------------------------%

:- pred optimize__repeat(int::in, bool::in, set(label)::in,
	list(instruction)::in, list(instruction)::out,
	io__state::di, io__state::uo) is det.

optimize__repeat(Iter0, DoVn, LayoutLabelSet, Instrs0, Instrs) -->
	(
		{ Iter0 > 0 }
	->
		{ Iter1 is Iter0 - 1 },
		( { Iter1 = 0 } ->
			{ Final = yes }
		;
			{ Final = no }
		),
		optimize__repeated(Instrs0, DoVn, Final, LayoutLabelSet,
			Instrs1, Mod),
		( { Mod = yes } ->
			optimize__repeat(Iter1, DoVn, LayoutLabelSet,
				Instrs1, Instrs)
		;
			{ Instrs = Instrs1 }
		)
	;
		{ Instrs = Instrs0 }
	).

	% We short-circuit jump sequences before normal peepholing
	% to create more opportunities for use of the tailcall macro.

:- pred optimize__repeated(list(instruction)::in, bool::in, bool::in,
	set(label)::in, list(instruction)::out, bool::out,
	io__state::di, io__state::uo) is det.

optimize__repeated(Instrs0, DoVn, Final, LayoutLabelSet, Instrs, Mod) -->
	globals__io_lookup_bool_option(very_verbose, VeryVerbose),
	globals__io_lookup_bool_option(debug_opt, DebugOpt),
	{ opt_util__find_first_label(Instrs0, Label) },
	{ opt_util__format_label(Label, LabelStr) },

	globals__io_lookup_bool_option(optimize_value_number, ValueNumber),
	( { ValueNumber = yes, DoVn = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing value number for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		value_number_main(Instrs0, LayoutLabelSet, Instrs1),
		( { Instrs1 = Instrs0 } ->
			[]
		;
			opt_debug__msg(DebugOpt, "after value numbering"),
			opt_debug__dump_instrs(DebugOpt, Instrs1)
		)
	;
		{ Instrs1 = Instrs0 }
	),
	globals__io_lookup_bool_option(optimize_jumps, Jumpopt),
	globals__io_lookup_bool_option(optimize_fulljumps, FullJumpopt),
	globals__io_lookup_bool_option(checked_nondet_tailcalls,
		CheckedNondetTailCalls),
	globals__io_get_trace_level(TraceLevel),
	( { Jumpopt = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing jumps for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ jumpopt_main(Instrs1, LayoutLabelSet, TraceLevel,
			FullJumpopt, Final, CheckedNondetTailCalls,
			Instrs2, Mod1) },
		( { Mod1 = yes } ->
			opt_debug__msg(DebugOpt, "after jump optimization"),
			opt_debug__dump_instrs(DebugOpt, Instrs2)
		;
			[]
		)
	;
		{ Instrs2 = Instrs1 },
		{ Mod1 = no }
	),
	globals__io_lookup_bool_option(optimize_peep, Peephole),
	( { Peephole = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing locally for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		globals__io_get_gc_method(GC_Method),
		{ peephole__optimize(GC_Method, Instrs2, Instrs3, Mod2) },
		( { Mod2 = yes } ->
			opt_debug__msg(DebugOpt, "after peepholing"),
			opt_debug__dump_instrs(DebugOpt, Instrs3)
		;
			[]
		)
	;
		{ Instrs3 = Instrs2 },
		{ Mod2 = no }
	),
	globals__io_lookup_bool_option(optimize_labels, LabelElim),
	( { LabelElim = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing labels for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ labelopt_main(Instrs3, Final, LayoutLabelSet,
			Instrs4, Mod3) },
		( { Mod3 = yes } ->
			opt_debug__msg(DebugOpt, "after label optimization"),
			opt_debug__dump_instrs(DebugOpt, Instrs4)
		;
			[]
		)
	;
		{ Instrs4 = Instrs3 },
		{ Mod3 = no }
	),
	globals__io_lookup_bool_option(optimize_dups, DupElim),
	( { DupElim = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing duplicates for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ dupelim_main(Instrs4, Instrs) },
		( { Instrs = Instrs4 } ->
			[]
		;
			opt_debug__msg(DebugOpt, "after duplicate elimination"),
			opt_debug__dump_instrs(DebugOpt, Instrs)
		)
	;
		{ Instrs = Instrs4 }
	),
	{ Mod1 = no, Mod2 = no, Mod3 = no, Instrs = Instrs0 ->
		Mod = no
	;
		Mod = yes
	},
	globals__io_lookup_bool_option(statistics, Statistics),
	maybe_report_stats(Statistics).

:- pred optimize__middle(list(instruction)::in, bool::in, set(label)::in,
	list(instruction)::out, io__state::di, io__state::uo) is det.

optimize__middle(Instrs0, Final, LayoutLabelSet, Instrs) -->
	globals__io_lookup_bool_option(very_verbose, VeryVerbose),
	globals__io_lookup_bool_option(debug_opt, DebugOpt),
	{ opt_util__find_first_label(Instrs0, Label) },
	{ opt_util__format_label(Label, LabelStr) },

	globals__io_lookup_bool_option(optimize_frames, FrameOpt),
	( { FrameOpt = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing frames for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ frameopt_main(Instrs0, Instrs1, Mod1, Jumps) },
		( { Mod1 = yes } ->
			opt_debug__msg(DebugOpt, "after frame optimization"),
			opt_debug__dump_instrs(DebugOpt, Instrs1)
		;
			[]
		),
		globals__io_lookup_bool_option(optimize_fulljumps, FullJumpopt),
		globals__io_lookup_bool_option(checked_nondet_tailcalls,
			CheckedNondetTailCalls),
		globals__io_get_trace_level(TraceLevel),
		( { Jumps = yes, FullJumpopt = yes } ->
			( { VeryVerbose = yes } ->
				io__write_string("% Optimizing jumps for "),
				io__write_string(LabelStr),
				io__write_string("\n")
			;
				[]
			),
			{ jumpopt_main(Instrs1, LayoutLabelSet, TraceLevel,
				FullJumpopt, Final, CheckedNondetTailCalls,
				Instrs2, Mod2) },
			( { Mod2 = yes } ->
				opt_debug__msg(DebugOpt, "after jump optimization"),
				opt_debug__dump_instrs(DebugOpt, Instrs2)
			;
				[]
			)
		;
			{ Instrs2 = Instrs1 }
		),
		( { Mod1 = yes } ->
			( { VeryVerbose = yes } ->
				io__write_string("% Optimizing labels for "),
				io__write_string(LabelStr),
				io__write_string("\n")
			;
				[]
			),
			{ labelopt_main(Instrs2, Final, LayoutLabelSet,
				Instrs, Mod3) },
			( { Mod3 = yes } ->
				opt_debug__msg(DebugOpt, "after label optimization"),
				opt_debug__dump_instrs(DebugOpt, Instrs)
			;
				[]
			)
		;
			{ Instrs = Instrs2 }
		)
	;
		{ Instrs = Instrs0 }
	).

:- pred optimize__last(list(instruction)::in, set(label)::in,
	list(instruction)::out, io__state::di, io__state::uo) is det.

optimize__last(Instrs0, LayoutLabelSet, Instrs) -->
	globals__io_lookup_bool_option(very_verbose, VeryVerbose),
	globals__io_lookup_bool_option(debug_opt, DebugOpt),
	{ opt_util__find_first_label(Instrs0, Label) },
	{ opt_util__format_label(Label, LabelStr) },

	globals__io_lookup_bool_option(optimize_delay_slot, DelaySlot),
	globals__io_lookup_bool_option(optimize_value_number, ValueNumber),
	( { DelaySlot = yes ; ValueNumber = yes } ->
		% We must get rid of any extra labels added by other passes,
		% since they can confuse both post_value_number and delay_slot.
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing labels for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ labelopt_main(Instrs0, no, LayoutLabelSet, Instrs1, Mod1) },
		( { Mod1 = yes } ->
			opt_debug__msg(DebugOpt, "after label optimization"),
			opt_debug__dump_instrs(DebugOpt, Instrs1)
		;
			[]
		)
	;
		{ Instrs1 = Instrs0 }
	),
	( { DelaySlot = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing delay slot for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ fill_branch_delay_slot(Instrs1, Instrs2) },
		% Dump the code only if delay slot filling changed it.
		( { Instrs2 = Instrs1 } ->
			[]
		;
			opt_debug__msg(DebugOpt, "after delay slot filling"),
			opt_debug__dump_instrs(DebugOpt, Instrs2)
		)
	;
		{ Instrs2 = Instrs1 }
	),
	( { ValueNumber = yes } ->
		( { VeryVerbose = yes } ->
			io__write_string("% Optimizing post value number for "),
			io__write_string(LabelStr),
			io__write_string("\n")
		;
			[]
		),
		{ value_number__post_main(Instrs2, Instrs) },
		( { Instrs = Instrs2 } ->
			[]
		;
			opt_debug__msg(DebugOpt, "after post value number"),
			opt_debug__dump_instrs(DebugOpt, Instrs)
		)
	;
		% Keep the result of delay slot filling (Instrs2), not its input.
		{ Instrs = Instrs2 }
	).