Commit Graph

7 Commits

Author SHA1 Message Date
Zoltan Somogyi
25b8b1abc3 Fix several performance bugs that showed up when the compiler was invoked on
Estimated hours taken: 20
Branches: main

Fix several performance bugs that showed up when the compiler was invoked on
Douglas Auclair's training_cars example. Also fix some minor problems that
made it harder to find the information needed to localize those problems.

training_cars.m is hard to compile quickly because it is big in two dimensions:
it has lots of clauses, and each clause has big terms.

My laptop still tries to swap itself to death on the full version of
training_cars.m (it has only 512 Mb), but the compiler now works fine
on a version containing about 20% of its clauses, whereas previously
it couldn't compile it at all.

In most cases, the changes convert N^2 algorithms to NlogN algorithms.
They probably have higher constant factors and may yield small slowdowns
for small N, but this is probably not noticeable. Avoiding bad worst case
behavior is more important.

compiler/superhomogeneous.m:
	Record the number of goals inserted in each goal being converted
	to superhomogeneous form. If this exceeds a threshold, wrap a
	from_ground_term scope around it.
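
	A rough sketch of the counting-and-wrapping idea, in Python rather than
	Mercury. The term representation, the names and the threshold value are
	all hypothetical, and the sketch does not check that the term is ground:

```python
import itertools

# Hypothetical threshold; the commit message does not give the real value.
FROM_GROUND_TERM_THRESHOLD = 15

def flatten(term, var, fresh, goals):
    """Append one unification per function symbol equating `var` with `term`.

    A term is a tuple (functor, arg_term, ...); constants are 1-tuples.
    """
    functor, *args = term
    arg_vars = [f"V_{next(fresh)}" for _ in args]
    goals.append((var, functor, arg_vars))
    for arg, arg_var in zip(args, arg_vars):
        flatten(arg, arg_var, fresh, goals)

def convert_argument(term, var, threshold=FROM_GROUND_TERM_THRESHOLD):
    """Expand one argument; wrap the result in a scope if it is large."""
    goals = []
    flatten(term, var, itertools.count(1), goals)
    if len(goals) > threshold:
        # Too many inserted goals: mark them as one unit for later passes.
        return [("scope", "from_ground_term", var, goals)]
    return goals
```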

	Put the predicates into a more cohesive sequence.

compiler/field_access.m:
	Work with the code in superhomogeneous to record the number of inserted
	goals. Reorder the arguments of some predicates to be consistent
	with the predicates in superhomogeneous.m.

compiler/modes.m:
	Use the from_ground_term scope to reverse the list of inserted
	unifications if necessary. It is much more efficient to do this here
	than to let it happen by sequences of delays and wakeups. That would
	have quadratic complexity; this is linear.

	This is what I originally introduced from_ground_term scopes for.
	Then, the overhead was too high, because I added one scope per function
	symbol. This version should be fine, since there is at most one scope
	added per argument of an atom (clause head or call).

compiler/modes.m:
compiler/unique_modes.m:
	When we are processing goals inside a from_ground_term scope, record
	this fact.

compiler/mode_info.m:
	Make it possible to record this fact.

compiler/modecheck_unify.m:
	When we are inside a from_ground_term scope, don't try to update the
	insts of vars on the right hand sides of construction unifications.
	Since these variables came from expansion to superhomogeneous form,
	those variables won't occur in any following code, so updating their
	state is useless, and the algorithm we used to do so is linear in the
	size of the inst. Since the size of an inst of a variable that results
	from superhomogeneous expansion is itself on average proportional to
	the size of the original term, this change turns a quadratic algorithm
	into a linear one.

compiler/inst_match.m:
	Use balanced trees instead of ordered lists to represent sets of
	expansions, since these sets can be large.

	Note an opportunity for further improvement.

compiler/inst_util.m:
	Note another opportunity for further improvement.

compiler/instmap.m:
	Rename several predicates to avoid ambiguities.

compiler/cse_detection.m:
	We used to print statistics for the processing of each procedure
	without saying which procedure it is for; fix this.

compiler/switch_detection.m:
	Don't print progress messages for predicates with no procedures,
	since they would be misleading.

compiler/higher_order.m:
	Change an algorithm that was quadratic in the number of arms
	for merging the information from the different arms of disjunctions
	and switches to an NlogN algorithm.

	Change the algorithm for merging the info from two branches
	that was quadratic in the number of variables per arm to an NlogN
	algorithm.

	Change some type equivalences to notag types to aid robustness.

compiler/quantification.m:
	Rename several predicates to avoid ambiguities.

	The sets of variables in different arms of disjunctions and switches
	tend to have relatively small intersections. Yet the algorithms we
	used to compute the set of variables free in the disjunction or switch
	included the variables from the already processed arms in the sets
	being accumulated when processing later arms, leading to the quadratic
	behavior. This diff changes the algorithm to process each arm
	independently, and then use a more balanced algorithm to summarize
	the result.
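
	The difference between the two strategies can be sketched as follows,
	in Python rather than Mercury. The function names are hypothetical, and
	the cost remarks assume a set representation whose union cost grows
	with the sizes of its operands:

```python
def accumulate_left_to_right(arm_var_sets):
    """Old style: every step carries along everything seen so far."""
    acc = frozenset()
    for arm_vars in arm_var_sets:
        # The accumulator keeps growing, so with mostly disjoint arms
        # the total work is quadratic in the number of arms.
        acc = acc | arm_vars
    return acc

def balanced_union(arm_var_sets):
    """New style: union the per-arm sets pairwise, level by level."""
    sets = list(arm_var_sets)
    if not sets:
        return frozenset()
    while len(sets) > 1:
        merged = [sets[i] | sets[i + 1] for i in range(0, len(sets) - 1, 2)]
        if len(sets) % 2 == 1:
            merged.append(sets[-1])  # odd one out moves up unchanged
        sets = merged
    return sets[0]
```

	Both functions return the same set; only the shape of the computation
	differs, in the same way that merge sort differs from insertion sort.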

	Specialize the predicates that compute sets of free vars in various
	HLDS fragments to work either with ordinary_nonlocals or
	code_gen_nonlocals without making the same decision repeatedly.

	Move some code out of large predicates into predicates of their own.

compiler/Mercury.options:
	Specify the compiler option that can exploit this specialization
	to make the code run faster.

compiler/simplify.m:
	Use a more efficient data structure for recording the parameters
	of an invocation of simplification.

	Change some predicate names and function symbol names to avoid
	ambiguity.

compiler/common.m:
compiler/deforest.m:
compiler/make_hlds_warn.m:
compiler/mercury_compile.m:
compiler/pd_util.m:
compiler/stack_opt.m:
compiler/term_constr_build.m:
	Conform to the changes in simplify.m and/or instmap.m.

compiler/mercury_compile.m:
	Fix a bug in progress messages for polymorphism.m.

compiler/equiv_type_hlds.m:
	Most of the time, substitutions inside insts have no effect, because
	very few insts include any reference to types. Instead of the old
	approach of building new insts and then throwing them away if they
	are the same as the old ones, don't build new insts at all if the
	old inst contains no types.
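
	A minimal sketch of the check-before-rebuild idea, in Python rather
	than Mercury. The tuple encoding of insts and the "typed" tag are
	hypothetical stand-ins for the compiler's real data structures:

```python
def contains_type(inst):
    """True if the inst, or any inst nested inside it, mentions a type."""
    if inst[0] == "typed":
        return True
    return any(contains_type(child) for child in inst[1:])

def rebuild(inst, type_map):
    """Build a new inst with every type replaced according to type_map."""
    if inst[0] == "typed":
        return ("typed", type_map.get(inst[1], inst[1]))
    return (inst[0],) + tuple(rebuild(child, type_map) for child in inst[1:])

def substitute(inst, type_map):
    """Apply type_map to inst, allocating nothing if no type occurs in it."""
    if not contains_type(inst):
        return inst  # the common case: hand back the original object
    return rebuild(inst, type_map)
```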

compiler/common.m:
	Change some predicate names to make them clearer.

compiler/hlds_clauses.m:
	Record the number of clauses so far, to allow a more informative
	progress message to be printed.

compiler/add_clause.m:
	Print this more informative progress message.

	Conform to the changes in superhomogeneous.m.

compiler/code_gen.m:
	Use the context of the predicate's first clause (which will be the
	context of the first clause head) as the context of the predicate's
	interface events. Unlike the context of the body goal, this won't
	be affected by program transformations such as wrapping a
	from_ground_term scope around some goals. It is better for users
	anyway, since the old policy led to contexts in the middle of
	procedure bodies if the top level goal was a disjunction, switch or
	if-then-else.

tests/debugger/*.exp:
	Update the expected outputs to conform to the change to code_gen.m.
2006-03-29 00:57:46 +00:00
Ralph Becket
a8ffd3680c Change the compiler and tools so that `.' and not `:' is now used as the
Estimated hours taken: 14
Branches: main

Change the compiler and tools so that `.' and not `:' is now used as the
module separator in all output.

Infix `.' now has associativity yfx and priority 10.
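
An associativity of yfx makes the operator left-associative, so a name such
as a.b.c groups as (a.b).c. A small Python sketch of that grouping (the
function name and the tuple encoding of terms are hypothetical):

```python
from functools import reduce

def parse_qualified(name):
    """Group a dotted name the way a left-associative (yfx) infix `.' would."""
    parts = name.split(".")
    # reduce folds from the left, so earlier components nest deepest.
    return reduce(lambda left, right: (".", left, right), parts)
```

Under this encoding, parse_qualified("a.b.c") yields (".", (".", "a", "b"), "c").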

NEWS:
	Report the change.

configure.in:
	Amend the test for an up-to-date Mercury compiler to check whether
	it recognises `.' as a module qualifier.

compiler/code_gen.m:
compiler/error_util.m:
compiler/hlds_out.m:
compiler/prog_out.m:
compiler/prog_util.m:
compiler/rl_exprn.m:
compiler/rl_gen.m:
compiler/source_file_map.m:
compiler/unused_args.m:
library/io.m:
library/rtti_implementation.m:
library/type_desc.m:
runtime/mercury_debug.c:
runtime/mercury_deconstruct.c:
runtime/mercury_stack_trace.c:
	Change `:' to `.' as module separator for output.

compiler/mercury_to_mercury.m:
compiler/prog_io_typeclass.m:
	As above.
	Fixed a bug where `.' was not being recognised as a module separator.

doc/reference_manual.texi:
	Report the change.

library/term_io.m:
	Ensure that infix `.' is written without surrounding spaces.

tests/hard_coded/dot_separator.m:
tests/hard_coded/dot_separator.exp:
tests/hard_coded/Mmakefile:
	Test case added.
2003-01-17 05:57:20 +00:00
Zoltan Somogyi
ed83fe4623 Optimize shallow traced modules by not adding calls to MR_trace to shallow
Estimated hours taken: 12
Branches: main

Optimize shallow traced modules by not adding calls to MR_trace to shallow
traced procedures which cannot be called from a deep traced environment.
A shallow traced procedure can be optimized in this way if it is neither
exported from its defining module nor has its address taken.

The main purpose of this optimization is not the avoidance of the cost of the
MR_trace calls as much as it is the restoration of tail recursion optimization.
Previously, compiling a program in a debug grade would disable all tail
recursion in the program (since debug grades require at least shallow tracing
of every module). This was a problem because it limited the sizes of the inputs
the debugged program could process before running out of memory. As long as
the procedures that recurse on the input are in the implementation section
of a shallow traced module, this should no longer happen.

compiler/trace_params.m:
	Introduce the concept of a procedure's effective trace level. This is
	identical to the global trace level, except if the procedure is not
	exported and doesn't have its address taken, and the global trace level
	is shallow. In that case, we say that the procedure's effective trace
	level is none.

	Computing a procedure's effective trace level requires its proc_info
	and its parent pred_info, so require callers to supply these as
	parameters.
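
	The rule for the effective trace level can be sketched as below, in
	Python rather than Mercury. The enum is a simplification with only
	three levels, and the two boolean parameters are hypothetical
	stand-ins for what the real code reads from the pred_info and
	proc_info:

```python
from enum import Enum

class TraceLevel(Enum):
    NONE = "none"
    SHALLOW = "shallow"
    DEEP = "deep"

def effective_trace_level(global_level, is_exported, address_taken):
    """Return the trace level that applies to one procedure."""
    if (global_level is TraceLevel.SHALLOW
            and not is_exported
            and not address_taken):
        # Cannot be reached from deep traced code, so no events are needed.
        return TraceLevel.NONE
    return global_level
```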

compiler/code_info.m:
	Store the current pred_info as well as the current proc_info, for
	trace parameter lookups.

compiler/continuation_info.m:
compiler/code_gen.m:
	Record the required trace parameters of a procedure in its layout
	structure, since it can no longer be computed from the global trace
	level.

compiler/stack_layout.m:
	Use the trace parameters in procedures' layout structures, instead of
	trying to compute them from the global trace level.

compiler/inlining.m:
compiler/liveness.m:
compiler/stack_alloc.m:
compiler/store_alloc.m:
compiler/trace.m:
	Use procedures' effective trace level instead of the global trace level
	where relevant.

compiler/llds.m:
	Record the required trace parameter of a procedure in its c_procedure
	representation, since it can no longer be computed from the global
	trace level.

	Delete an obsolete field.

compiler/optimize.m:
compiler/jumpopt.m:
	Use a required trace parameter of a procedure in its c_procedure
	representation, since it can no longer be computed from the global
	trace level.

compiler/compile_target_code.m:
compiler/handle_options.m:
compiler/llds_out.m:
compiler/mercury_compile.m:
	Trivial changes to conform to updated interfaces.

compiler/stack_opt.m:
	Use the option opt_no_return_calls, instead of approximating it
	with the trace level. (The old code was a holdover from before the
	creation of the option.)

tests/debugger/shallow.m:
tests/debugger/shallow2.m:
tests/debugger/shallow.{inp,exp*}:
	Divide the old test case in shallow.m in two. The top level predicates
	stay in shallow.m and continue to be shallow traced. The two bottom
	predicates move to shallow2.m and are now deep traced.

	The new test input checks whether the debugger can walk across the
	stack frames of procedures in shallow traced modules whose effective
	trace level is "none" (such as queen/2).
2002-07-30 08:25:20 +00:00
Fergus Henderson
9dcdf072b2 Update the line numbers in the expected debugger output to
Estimated hours taken: 0.5

tests/debugger/shallow.exp:
tests/debugger/shallow.exp2:
tests/debugger/browser_test.exp:
tests/debugger/browser_test.exp2:
tests/debugger/multi_parameter.exp:
	Update the line numbers in the expected debugger output to
	reflect changes caused by zs's recent bug fix to jumpopt.m.
1999-12-13 07:43:41 +00:00
Zoltan Somogyi
f0964815a3 Support line numbers in the debugger. You now get contexts (filename:lineno
Estimated hours taken: 40

Support line numbers in the debugger. You now get contexts (filename:lineno
pairs) printed in several circumstances, and you can put breakpoints on
contexts, when they correspond to trace events or to calls. The latter are
implemented as breakpoints on the label layouts of the return sites.

This required extending the debugging RTTI, so that associated with each
module there is now a new data structure listing the source file names that
contribute labels with layout structures to the code of the module. For each
such source file, this table gives a list of all such labels arising from
that file. The table entry for a label gives the line number within the file,
and the pointer to the label layout structure.
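
The shape of the new table can be sketched as follows, in Python rather than
C. All class, field and function names here are hypothetical; the real
definitions live in runtime/mercury_stack_layout.h:

```python
from dataclasses import dataclass, field

@dataclass
class LabelLayout:
    label: str  # stands in for the real label layout structure

@dataclass
class FileLayout:
    filename: str
    # One (line number, label layout) entry per label arising from this file.
    labels: list = field(default_factory=list)

@dataclass
class ModuleLayout:
    name: str
    files: list = field(default_factory=list)  # one FileLayout per source file

def labels_at(module, filename, lineno):
    """Find the label layouts for a context, e.g. to set a breakpoint there."""
    return [
        layout
        for file_layout in module.files
        if file_layout.filename == filename
        for line, layout in file_layout.labels
        if line == lineno
    ]
```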

compiler/llds.m:
	Add a context field to the call instruction.

compiler/continuation_info.m:
	Instead of the old division of continuation info about labels into
	trace ports and everything else, divide them into trace ports, resume
	points and return sites. Record contexts with trace ports, and record
	contexts and called procedure information with return sites.

compiler/code_info.m:
	Conform to the changes in continuation_info.m.

compiler/options.m:
	Add a new option that allows us to disable the generation of line
	number information for size benchmarking (it has no other use).

compiler/stack_layout.m:
	Generate the new components of the RTTI, unless the option says not to.

compiler/code_gen.m:
compiler/pragma_c_gen.m:
compiler/trace.m:
	Include contexts in the information we gather for the layouts
	associated with the events we generate.

compiler/call_gen.m:
	Include contexts in the call LLDS instructions, for association
	with the return site's label layout structure (which is done after
	code generation is finished).

compiler/handle_options.m:
	Delete the code that tests or sets the deleted options.

compiler/mercury_compile.m:
	Delete the code that tests the deleted options.

compiler/basic_block.m:
compiler/dupelim.m:
compiler/frameopt.m:
compiler/livemap.m:
compiler/llds_common.m:
compiler/llds_out.m:
compiler/middle_rec.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/value_number.m:
compiler/vn_*.m:
	Trivial changes to conform to the changes to llds.m.

compiler/jumpopt.m:
	Do not optimize away jumps to labels with layout structures.
	The jumps we are particularly concerned about now are the jumps
	that return from procedure calls. Previously, it was okay to redirect
	returns from several calls so that all go to the same label, since
	the live variable information associated with the labels could be
	merged. However, we now also associate line numbers with calls, and
	these cannot be usefully merged.

compiler/optimize.m:
	Pass the information required by jumpopt to it.

doc/user_guide.texi:
	Document that you can now break at line numbers.

	Document the new "context" command, and the -d or --detailed option
	of the stack command and the commands that set ancestor levels.

runtime/mercury_stack_layout.h:
	Extend the module layout structure definition with the new tables.

	Remove the conditional facility for including label numbers in label
	layout structures. It hasn't been used in a long time, and neither
	Tyson nor I expect to use it to debug either gc or the debugger itself,
	so it has no uses left; the line numbers have superseded it.

runtime/mercury_stack_trace.[ch]:
	Extend the code to print stack traces to also optionally print
	contexts.

	Add some utility predicates currently used by the debugger that could
	also be used for debugging gc or for more detailed stack traces.

trace/mercury_trace_internal.c:
	Implement the "break <context>" command, the "context" command, and
	the -d or --detailed option of the stack command and the commands
	that set ancestor levels.

	Conditionally define a conditionally used variable.

trace/mercury_trace_external.c:
	Minor changes to keep up with the changes to stack traces.

	Delete an unused variable.

trace/mercury_trace_spy.[ch]:
	Check for breakpoints on contexts.

trace/mercury_trace_tables.[ch]:
	Add functions to search the RTTI data structures for labels
	corresponding to a given context.

trace/mercury_trace_vars.[ch]:
	Remember the context of the current environment.

tests/debugger/queen.{inp,exp}:
	Test the new capabilities of the debugger.

tests/debugger/*.{inp,exp}:
	Update the expected output of the debugger to account for contexts.
	In some cases, modify the input script to put contexts where they don't
	overflow lines.
1999-11-15 00:43:59 +00:00
Zoltan Somogyi
8eda38ee58 Explicitly turn on command echoing, as for the other debugger tests,
Estimated hours taken: 0.05

tests/debugger/shallow.{inp,exp}:
	Explicitly turn on command echoing, as for the other debugger tests,
	to eliminate a potential source of variability (e.g. whether readline
	is enabled or not).
1999-10-04 04:20:35 +00:00
Zoltan Somogyi
3c9d350e81 Shallow traced procedures do not fill in most of their stack slots holding
Estimated hours taken: 6

Shallow traced procedures do not fill in most of their stack slots holding
debugging information unless they are called from deep traced code, yet the
code in the runtime that handles redo events for such procedures was using
the values in those slots. This change fixes that problem, by adding another
piece of code in the runtime for handling redo events for shallow traced
procedures.

compiler/llds.m:
	Rename do_trace_redo_fail as do_trace_redo_fail_deep and add a new
	label, do_trace_redo_fail_shallow.

compiler/trace.m:
	Select which label gets included as the redoip in the temporary frame
	that generates redo events, and document the assumptions about stack
	slots used by the new label.

compiler/dupelim.m:
compiler/exprn_aux.m:
compiler/livemap.m:
compiler/llds_out.m:
compiler/opt_debug.m:
compiler/opt_util.m:
	Conform to the changes in llds.m.

runtime/mercury_trace_base.c:
	Add the label and code for handling redo events in shallow traced
	procedures.

runtime/mercury_stack_layout.h:
	Add a macro used by mercury_trace_base.c.

tests/debugger/shallow.{m,inp,exp}:
	New test case. It is the same code as queens.m, but compiled with
	shallow tracing. It segfaults before but not after this change.

tests/debugger/Mmakefile:
	Enable the new test case.

	Comment out a rule for a disabled test case, to avoid make warnings.
1999-08-24 09:40:47 +00:00