mirror of
https://github.com/Mercury-Language/mercury.git
synced 2026-04-23 05:13:48 +00:00
dadf30718d6084962be37dae8b94de41aaee90e2
7 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
25b8b1abc3 |
Fix several performance bugs that showed up when the compiler was invoked on
Estimated hours taken: 20
Branches: main
Fix several performance bugs that showed up when the compiler was invoked on
Douglas Auclair's training_cars example. Also fix some minor problems that made
it harder to find the information needed to localize those problems.
training_cars.m is hard to compile quickly because it is big in two dimensions:
it has lots of clauses, and each clause has big terms. My laptop still tries
to swap itself to death on the full version of training_cars.m (it has only
512 Mb), but the compiler now works fine on a version containing about 20% of
its clauses, whereas previously it couldn't compile it at all.
In most cases, the changes convert N^2 algorithms to NlogN algorithms. They
probably have higher constant factors and may yield small slowdowns for small
N, but this is probably not noticeable. Avoiding bad worst case behavior is
more important.
compiler/superhomogeneous.m:
Record the number of goals inserted in each goal being converted to
superhomogeneous form. If this exceeds a threshold, wrap a
from_ground_term scope around it.
Put the predicates into a more cohesive sequence.
compiler/field_access.m:
Work with the code in superhomogeneous.m to record the number of inserted
goals.
Reorder the arguments of some predicates to be consistent with the
predicates in superhomogeneous.m.
compiler/modes.m:
Use the from_ground_term scope to reverse the list of inserted
unifications if necessary. It is much more efficient to do this here
than to let it happen by sequences of delays and wakeups. That would
have quadratic complexity; this is linear.
This is what I originally introduced from_ground_term scopes for.
Then, the overhead was too high, because I added one scope per function
symbol. This version should be fine, since there is at most one scope
added per argument of an atom (clause head or call).
compiler/modes.m:
compiler/unique_modes.m:
When we are processing goals inside a from_ground_term scope, record
this fact.
compiler/mode_info.m:
Make it possible to record this fact.
compiler/modecheck_unify.m:
When we are inside a from_ground_term scope, don't try to update the
insts of vars on the right hand sides of construction unifications.
Since these variables came from expansion to superhomogeneous form,
those variables won't occur in any following code, so updating their
state is useless, and the algorithm we used to do so is linear in the
size of the inst. Since the size of an inst of a variable that results
from superhomogeneous expansion is itself on average proportional to
the size of the original term, this change turns a quadratic algorithm
into a linear one.
compiler/inst_match.m:
Use balanced trees instead of ordered lists to represent sets of
expansions, since these sets can be large.
Note an opportunity for further improvement.
compiler/inst_util.m:
Note another opportunity for further improvement.
compiler/instmap.m:
Rename several predicates to avoid ambiguities.
compiler/cse_detection.m:
We used to print statistics for the processing of each procedure
without saying which procedure it is for; fix this.
compiler/switch_detection.m:
Don't print progress messages for predicates with no procedures,
since they would be misleading.
compiler/higher_order.m:
Change an algorithm that was quadratic in the number of arms for
merging the information from the different arms of disjunctions and
switches to an NlogN algorithm.
Change the algorithm for merging the info from two branches that was
quadratic in the number of variables per arm to an NlogN algorithm.
Changed some type equivalences to notag types to aid robustness.
compiler/quantification.m:
Rename several predicates to avoid ambiguities.
The sets of variables in different arms of disjunctions and switches
tend to have relatively small intersections. Yet the algorithms we used
to compute the set of variables free in the disjunction or switch
included the variables from the already processed arms in the sets
being accumulated when processing later arms, leading to the quadratic
behavior. This diff changes the algorithm to process each arm
independently, and then use a more balanced algorithm to summarize
the result.
Specialize the predicates that compute sets of free vars in various
HLDS fragments to work either with ordinary_nonlocals or
code_gen_nonlocals without making the same decision repeatedly.
Move some code out of large predicates into predicates of their own.
compiler/Mercury.options:
Specify the compiler option that can exploit this specialization
to make the code run faster.
compiler/simplify.m:
Use a more efficient data structure for recording the parameters
of an invocation of simplification.
Change some predicate names and function symbol names to avoid
ambiguity.
compiler/common.m:
compiler/deforest.m:
compiler/make_hlds_warn.m:
compiler/mercury_compile.m:
compiler/pd_util.m:
compiler/stack_opt.m:
compiler/term_constr_build.m:
Conform to the changes in simplify.m and/or instmap.m.
compiler/mercury_compile.m:
Fix a bug in progress messages for polymorphism.m.
compiler/equiv_type_hlds.m:
Most of the time, substitutions inside insts have no effect, because
very few insts include any reference to types. Instead of the old
approach of building new insts and then throwing them away if they are
the same as the old ones, don't build new insts at all if the old inst
contains no types.
compiler/common.m:
Change some predicate names to make them clearer.
compiler/hlds_clauses.m:
Record the number of clauses so far, to allow a more informative
progress message to be printed.
compiler/add_clause.m:
Print this more informative progress message.
Conform to the changes in superhomogeneous.m.
compiler/code_gen.m:
Use the context of the predicate's first clause (which will be the
context of the first clause head) as the context of the predicate's
interface events. Unlike the context of the body goal, this won't be
affected by program transformations such as wrapping a from_ground_term
scope around some goals. It is better for users anyway, since the old
policy led to contexts in the middle of procedure bodies if the top
level goal was a disjunction, switch or if-then-else.
tests/debugger/*.exp:
Update the expected outputs to conform to the change to code_gen.m. |
||
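The quantification.m change described above (process each arm independently, then combine the results in a balanced way) can be illustrated with a small Python sketch. This is not the compiler's Mercury code: the function names and the use of immutable `frozenset`s to stand in for variable sets are assumptions made purely for illustration.

```python
def union_accumulate(arm_sets):
    """Left-to-right accumulation (the shape of the old approach).

    With immutable sets, each union copies the whole accumulator, so N
    mostly disjoint arms cost O(N^2) element copies in total.
    """
    acc = frozenset()
    for arm in arm_sets:
        acc = acc | frozenset(arm)
    return acc


def union_balanced(arm_sets):
    """Pairwise, tree-shaped merge of independently computed per-arm sets.

    Each element is copied once per level of the merge tree, and the tree
    has about log2(N) levels, giving O(N log N) copies overall.
    """
    sets = [frozenset(arm) for arm in arm_sets]
    if not sets:
        return frozenset()
    while len(sets) > 1:
        # Merge neighbours two at a time.
        merged = [sets[i] | sets[i + 1] for i in range(0, len(sets) - 1, 2)]
        if len(sets) % 2 == 1:
            # An odd set out is carried up to the next level unchanged.
            merged.append(sets[-1])
        sets = merged
    return sets[0]
```

Both functions compute the same union; only the order in which the sets are combined differs.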
|
|
a8ffd3680c |
Change the compiler and tools so that `.' and not `:' is now used as the
Estimated hours taken: 14
Branches: main
Change the compiler and tools so that `.' and not `:' is now used as the
module separator in all output.
Infix `.' now has associativity yfx and priority 10.
NEWS:
Report the change.
configure.in:
Amend the test for an up-to-date Mercury compiler to check whether
it recognises `.' as a module qualifier.
compiler/code_gen.m:
compiler/error_util.m:
compiler/hlds_out.m:
compiler/prog_out.m:
compiler/prog_util.m:
compiler/rl_exprn.m:
compiler/rl_gen.m:
compiler/source_file_map.m:
compiler/unused_args.m:
library/io.m:
library/rtti_implementation.m:
library/type_desc.m:
runtime/mercury_debug.c:
runtime/mercury_deconstruct.c:
runtime/mercury_stack_trace.c:
Change `:' to `.' as module separator for output.
compiler/mercury_to_mercury.m:
compiler/prog_io_typeclass.m:
As above.
Fixed a bug where `.' was not being recognised as a module separator.
doc/reference_manual.texi:
Report the change.
library/term_io.m:
Ensure that infix `.' is written without surrounding spaces.
tests/hard_coded/dot_separator.m:
tests/hard_coded/dot_separator.exp:
tests/hard_coded/Mmakefile:
Test case added. |
||
|
|
ed83fe4623 |
Optimize shallow traced modules by not adding calls to MR_trace to shallow
Estimated hours taken: 12
Branches: main
Optimize shallow traced modules by not adding calls to MR_trace to shallow
traced procedures which cannot be called from a deep traced environment.
A shallow traced procedure can be optimized in this way if it is neither
exported from its defining module nor has its address taken.
The main purpose of this optimization is not the avoidance of the cost of the
MR_trace calls as much as it is the restoration of tail recursion optimization.
Previously, compiling a program in a debug grade would disable all tail
recursion in the program (since debug grades require at least shallow tracing
in every module). This was a problem because it limited the sizes of the inputs
the debugged program could process before running out of memory. As long as
the procedures that recurse on the input are in the implementation section
of a shallow traced module, this should no longer happen.
compiler/trace_params.m:
Introduce the concept of a procedure's effective trace level. This is
identical to the global trace level, except if the procedure is not
exported and doesn't have its address taken, and the global trace level
is shallow. In that case, we say that the procedure's effective trace
level is none.
Computing a procedure's effective trace level requires its proc_info
and its parent pred_info, so require callers to supply these as
parameters.
compiler/code_info.m:
Store the current pred_info as well as the current proc_info, for
trace parameter lookups.
compiler/continuation_info.m:
compiler/code_gen.m:
Record the required trace parameters of a procedure in its layout
structure, since it can no longer be computed from the global trace
level.
compiler/stack_layout.m:
Use the trace parameters in procedures' layout structures, instead of
trying to compute them from the global trace level.
compiler/inlining.m:
compiler/liveness.m:
compiler/stack_alloc.m:
compiler/store_alloc.m:
compiler/trace.m:
Use procedures' effective trace level instead of the global trace level
where relevant.
compiler/llds.m:
Record the required trace parameter of a procedure in its c_procedure
representation, since it can no longer be computed from the global
trace level.
Delete an obsolete field.
compiler/optimize.m:
compiler/jumpopt.m:
Use the required trace parameter of a procedure in its c_procedure
representation, since it can no longer be computed from the global
trace level.
compiler/compile_target_code.m:
compiler/handle_options.m:
compiler/llds_out.m:
compiler/mercury_compile.m:
Trivial changes to conform to updated interfaces.
compiler/stack_opt.m:
Use the option opt_no_return_calls, instead of approximating it
with the trace level. (The old code was a holdover from before the
creation of the option.)
tests/debugger/shallow.m:
tests/debugger/shallow2.m:
tests/debugger/shallow.{inp,exp*}:
Divide the old test case in shallow.m in two. The top level predicates
stay in shallow.m and continue to be shallow traced. The two bottom
predicates move to shallow2.m and are now deep traced.
The new test input checks whether the debugger can walk across the
stack frames of procedures in shallow traced modules whose effective
trace level is "none" (such as queen/2).
|
||
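The effective trace level rule described for compiler/trace_params.m above can be sketched in Python. This is a simplified illustration, not the compiler's code: the `TraceLevel` enum with only three levels and the boolean parameters `is_exported` and `address_taken` are assumptions standing in for what the commit says is derived from the procedure's pred_info and proc_info.

```python
from enum import Enum


class TraceLevel(Enum):
    # Hypothetical, reduced set of trace levels for illustration only.
    NONE = "none"
    SHALLOW = "shallow"
    DEEP = "deep"


def effective_trace_level(global_level, is_exported, address_taken):
    """Return the trace level that applies to one procedure.

    The effective level equals the global level, except that when the
    global level is shallow and the procedure is neither exported nor has
    its address taken, the effective level is none.
    """
    if (
        global_level is TraceLevel.SHALLOW
        and not is_exported
        and not address_taken
    ):
        return TraceLevel.NONE
    return global_level
```

Under this rule, a non-exported recursive helper in a shallow traced module gets effective level none, which is what allows it to keep its tail recursion.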
|
|
9dcdf072b2 |
Update the line numbers in the expected debugger output to
Estimated hours taken: 0.5
tests/debugger/shallow.exp:
tests/debugger/shallow.exp2:
tests/debugger/browser_test.exp:
tests/debugger/browser_test.exp2:
tests/debugger/multi_parameter.exp:
Update the line numbers in the expected debugger output to
reflect changes caused by zs's recent bug fix to jumpopt.m. |
||
|
|
f0964815a3 |
Support line numbers in the debugger. You now get contexts (filename:lineno
Estimated hours taken: 40
Support line numbers in the debugger. You now get contexts (filename:lineno
pairs) printed in several circumstances, and you can put breakpoints on
contexts, when they correspond to trace events or to calls. The latter are
implemented as breakpoints on the label layouts of the return sites.
This required extending the debugging RTTI, so that associated with each
module there is now a new data structure listing the source file names that
contribute labels with layout structures to the code of the module. For each
such source file, this table gives a list of all such labels arising from
that file. The table entry for a label gives the line number within the file,
and the pointer to the label layout structure.
compiler/llds.m:
Add a context field to the call instruction.
compiler/continuation_info.m:
Instead of the old division of continuation info about labels into
trace ports and everything else, divide them into trace ports, resume
points and return sites. Record contexts with trace ports, and record
contexts and called procedure information with return sites.
compiler/code_info.m:
Conform to the changes in continuation_info.m.
compiler/options.m:
Add a new option that allows us to disable the generation of line
number information for size benchmarking (it has no other use).
compiler/stack_layout.m:
Generate the new components of the RTTI, unless the option says not to.
compiler/code_gen.m:
compiler/pragma_c_gen.m:
compiler/trace.m:
Include contexts in the information we gather for the layouts
associated with the events we generate.
compiler/call_gen.m:
Include contexts in the call LLDS instructions, for association
with the return site's label layout structure (which is done after
code generation is finished).
compiler/handle_options.m:
Delete the code that tests or sets the deleted options.
compiler/mercury_compile.m:
Delete the code that tests the deleted options.
compiler/basic_block.m:
compiler/dupelim.m:
compiler/frameopt.m:
compiler/livemap.m:
compiler/llds_common.m:
compiler/llds_out.m:
compiler/middle_rec.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/value_number.m:
compiler/vn_*.m:
Trivial changes to conform to the changes to llds.m.
compiler/jumpopt.m:
Do not optimize away jumps to labels with layout structures.
The jumps we are particularly concerned about now are the jumps
that return from procedure calls. Previously, it was okay to redirect
returns from several calls so that all go to the same label, since
the live variable information associated with the labels could be
merged. However, we now also associate line numbers with calls, and
these cannot be usefully merged.
compiler/optimize.m:
Pass the information required by jumpopt to it.
doc/user_guide.texi:
Document that you can now break at line numbers.
Document the new "context" command, and the -d or --detailed option
of the stack command and the commands that set ancestor levels.
runtime/mercury_stack_layout.h:
Extend the module layout structure definition with the new tables.
Remove the conditional facility for including label numbers in label
layout structures. It hasn't been used in a long time, and neither
Tyson nor I expect to use it to debug either gc or the debugger itself,
so it has no uses left; the line numbers have superseded it.
runtime/mercury_stack_trace.[ch]:
Extend the code to print stack traces to also optionally print
contexts.
Add some utility predicates currently used by the debugger that could
also be used for debugging gc or for more detailed stack traces.
trace/mercury_trace_internal.c:
Implement the "break <context>" command, the "context" command, and
the -d or --detailed option of the stack command and the commands
that set ancestor levels.
Conditionally define a conditionally used variable.
trace/mercury_trace_external.c:
Minor changes to keep up with the changes to stack traces.
Delete an unused variable.
trace/mercury_trace_spy.[ch]:
Check for breakpoints on contexts.
trace/mercury_trace_tables.[ch]:
Add functions to search the RTTI data structures for labels
corresponding to a given context.
trace/mercury_trace_vars.[ch]:
Remember the context of the current environment.
tests/debugger/queen.{inp,exp}:
Test the new capabilities of the debugger.
tests/debugger/*.{inp,exp}:
Update the expected output of the debugger to account for contexts.
In some cases, modify the input script to put contexts where they don't
overflow lines.
|
||
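The per-module table described above (source file names, each with a list of labels giving a line number and a pointer to the label layout) and the search for labels matching a context can be sketched in Python. The real structures are C data generated by the compiler; the dictionary layout, the function names, the string label names, and the choice to keep each file's entries sorted by line number are all assumptions made for illustration.

```python
from bisect import bisect_left


def build_file_table(labels):
    """Build a {filename: [(lineno, label), ...]} table for one module.

    `labels` is an iterable of (filename, lineno, label) triples. Entries
    for each file are sorted by line number (an assumption of this sketch)
    so that lookups can use binary search.
    """
    table = {}
    for filename, lineno, label in labels:
        table.setdefault(filename, []).append((lineno, label))
    for entries in table.values():
        entries.sort()
    return table


def labels_at_context(table, filename, lineno):
    """Return every label recorded for the context filename:lineno."""
    entries = table.get(filename, [])
    # (lineno,) sorts before any (lineno, label) pair, so these two
    # searches bracket exactly the entries on the requested line.
    start = bisect_left(entries, (lineno,))
    end = bisect_left(entries, (lineno + 1,))
    return [label for _, label in entries[start:end]]
```

A breakpoint on a context would then be placed on each label that `labels_at_context` returns; several labels can share one line, which is why the lookup returns a list.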
|
|
8eda38ee58 |
Explicitly turn on command echoing, as for the other debugger tests,
Estimated hours taken: 0.05
tests/debugger/shallow.{inp,exp}:
Explicitly turn on command echoing, as for the other debugger tests,
to eliminate a potential source of variability (e.g. whether readline
is enabled or not.)
|
||
|
|
3c9d350e81 |
Shallow traced procedures do not fill in most of their stack slots holding
Estimated hours taken: 6
Shallow traced procedures do not fill in most of their stack slots holding
debugging information unless they are called from deep traced code, yet the
code in the runtime that handles redo events for such procedures was using
the values in those slots. This change fixes that problem, by adding another
piece of code in the runtime for handling redo events for shallow traced
procedures.
compiler/llds.m:
Rename do_trace_redo_fail as do_trace_redo_fail_deep and add a new
label, do_trace_redo_fail_shallow.
compiler/trace.m:
Select which label gets included as the redoip in the temporary frame
that generates redo events, and document the assumptions about stack
slots used by the new label.
compiler/dupelim.m:
compiler/exprn_aux.m:
compiler/livemap.m:
compiler/llds_out.m:
compiler/opt_debug.m:
compiler/opt_util.m:
Conform to the changes in llds.m.
runtime/mercury_trace_base.c:
Add the label and code for handling redo events in shallow traced
procedures.
runtime/mercury_stack_layout.h:
Add a macro used by mercury_trace_base.c.
tests/debugger/shallow.{m,inp,exp}:
New test case. It is the same code as queens.m, but compiled with
shallow tracing. It segfaults before but not after this change.
tests/debugger/Mmakefile:
Enable the new test case.
Comment out a rule for a disabled test case, to avoid make warnings.
|