Commit Graph

24 Commits

Author SHA1 Message Date
Zoltan Somogyi
a47de48c4d s/input_stream/text_input_stream/ ...
... and the same for output streams.
2023-04-24 14:59:20 +10:00
Julien Fischer
f8d188fda8 Fix minor documentation problems.
deep_profiler/display_report.m:
deep_profiler/message.m:
deep_profiler/recursion_patterns.m:
deep_profiler/var_use_analysis.m:
java/runtime/UnreachableDefault.java:
runtime/mercury_engine.c:
runtime/mercury_minimal_model.c:
runtime/mercury_signal.h:
runtime/mercury_stack_layout.h:
runtime/mercury_wrapper.c:
runtime/mercury_threadscope.c:
trace/mercury_trace_external.c:
HISTORY:
    As above.
2018-10-09 05:27:36 +00:00
Zoltan Somogyi
9095985aa8 Fix more warnings from --warn-inconsistent-pred-order-clauses.
deep_profiler/*.m:
    Fix inconsistencies between (a) the order in which functions and predicates
    are declared, and (b) the order in which they are defined.

    In most modules, either the order of the declarations or the order
    of the definitions made sense, and I changed the other to match.
    In some modules, neither made sense, so I changed *both* to an order
    that *does* make sense (i.e. it has related predicates together).

    In query.m, put the various commands in the same sensible order
    as the code processing them.

    In html_format.m, merge two exported functions together, since
    they can't be used separately.

    In some places, put dividers between groups of related
    functions/predicates, to make the groups themselves more visible.

    In some places, fix comments or programming style.

deep_profiler/DEEP_FLAGS.in:
    Since all the modules in this directory are now free from any warnings
    generated by --warn-inconsistent-pred-order-clauses, specify that option
    by default in this directory to keep it that way.
2017-04-30 15:48:13 +10:00
Zoltan Somogyi
270170416d Bring the style of deep_profiler/* up-to-date.
deep_profiler/*.m:
    Replace ( C -> T ; E ) if-then-elses with ( if C then T else E ).

    Replace calls to error/1 with calls to unexpected/3.

    Add some module qualifications where this makes the code easier to read.
2015-07-15 15:30:22 +02:00
Zoltan Somogyi
cf0037fc7c Track format strings through string appends.
compiler/format_call.m:
    When optimizing calls to format predicates, allow the format string to
    be constructed by nested sequences of calls to string.++, string.append,
    and string.append_list.

    Add conditionally compiled debugging output.

compiler/set_of_var.m:
    Allow this module to compile if tree_bitset is replaced with plain set
    for debugging.

deep_profiler/message.m:
    Fix a bug in a format string automatically detected by the updated
    compiler. The compiler didn't detect it before because the format string
    is constructed using string.++.
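
The format_call.m change above can be illustrated with a small sketch. This is Python rather than Mercury and is not the compiler's code: the `Const`, `Unknown` and `Append` classes, and the simplified rule that every `%` other than `%%` starts one specifier, are assumptions made for illustration. The idea is to fold nested appends into one constant string when every leaf is a literal, and only then compare the specifier count with the number of arguments.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Const:
    """A string literal in the program."""
    value: str


@dataclass
class Unknown:
    """A string whose value is not known at compile time."""
    name: str


@dataclass
class Append:
    """Models string.++, string.append or string.append_list."""
    parts: List["Expr"]


Expr = Union[Const, Unknown, Append]


def resolve(expr: Expr) -> Optional[str]:
    """Return the constant string built by expr, or None if any part is unknown."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Append):
        pieces = [resolve(part) for part in expr.parts]
        if any(piece is None for piece in pieces):
            return None
        return "".join(pieces)
    return None


def count_specifiers(fmt: str) -> int:
    """Count conversion specifiers, treating "%%" as a literal percent sign."""
    count = 0
    i = 0
    while i < len(fmt):
        if fmt[i] == "%":
            if i + 1 < len(fmt) and fmt[i + 1] == "%":
                i += 2
                continue
            count += 1
        i += 1
    return count


def check_format_call(fmt_expr: Expr, num_args: int) -> str:
    """Classify a format call as "ok", "mismatch" or "unknown"."""
    fmt = resolve(fmt_expr)
    if fmt is None:
        return "unknown"
    return "ok" if count_specifiers(fmt) == num_args else "mismatch"
```

A format string assembled from three literals is checked exactly like a single literal, while one containing an `Unknown` part is left alone.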
2015-01-22 23:10:05 +11:00
Zoltan Somogyi
340c5300e6 Fix spelling in the deep profiler.
Fix some other issues as well that I found while fixing the spelling.

mdbcomp/feedback.automatic_parallelism.m:
deep_profiler/autopar_find_best_par.m:
deep_profiler/mdprof_create_feedback.m:
    Rename the best_par_algorithm type to alg_for_finding_best_par,
    since the old name was misleading. Perform the same rename for
    another type based on it, and the option specifying it.

    Remove the functor estimate_speedup_by_num_vars, since it hasn't
    been used by anything in a long time, and won't in the future.

deep_profiler/autopar_calc_overlap.m:
deep_profiler/autopar_costs.m:
deep_profiler/autopar_reports.m:
deep_profiler/autopar_search_callgraph.m:
deep_profiler/autopar_search_goals.m:
deep_profiler/coverage.m:
deep_profiler/create_report.m:
deep_profiler/dump.m:
deep_profiler/mdprof_report_feedback.m:
deep_profiler/measurement_units.m:
deep_profiler/measurements.m:
deep_profiler/message.m:
deep_profiler/query.m:
deep_profiler/recursion_patterns.m:
deep_profiler/report.m:
deep_profiler/startup.m:
deep_profiler/var_use_analysis.m:
mdbcomp/mdbcomp.goal_path.m:
mdbcomp/program_representation.m:
    Conform to the above. Fix spelling errors. In some places, improve
    comments and/or variable names.
2014-12-20 23:05:38 +11:00
Zoltan Somogyi
e7d9649021 Remove all references to mdprof_feedback, and replace it with
Estimated hours taken: 0.2
Branches: main

Mmakefile:
compiler/introduce_parallelism.m:
compiler/options.m:
deep_profiler/autopar_search_callgraph.m:
deep_profiler/message.m:
	Remove all references to mdprof_feedback, and replace it with
	references to mdprof_create_feedback and/or mdprof_report_feedback,
	as appropriate.
2011-09-27 04:41:25 +00:00
Zoltan Somogyi
d013a4cfcf Change the types that represent forward and reverse goal paths from being
Estimated hours taken: 20
Branches: main

Change the types that represent forward and reverse goal paths from being
wrappers around lists of steps, to being full discriminated union types.
This is meant to accomplish two objectives.

First, since taking the wrappers off and putting them back on is inconvenient,
code often dealt with naked lists of steps, with the meaning of those steps
sometimes being unclear.

Second, in a future change I intend to change the way the debugger represents
goal paths from being strings to being statically allocated terms of the
reverse_goal_path type. This should have two benefits. One is reduced memory
consumption, since two different goal path strings cannot share memory
but two different reverse goal paths can share the memory containing their
common tail (the goal path steps near the root). The other is that the
declarative debugger won't need to do any conversion from string to structure,
and should therefore be faster.
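
The memory-sharing argument can be sketched as follows. This is Python, not the Mercury type definition; `RevPath`, `extend` and the step labels are hypothetical. A reverse goal path is modelled as a chain of cells that starts at the innermost step, so two sibling goals point at the same cells for the part of the path nearest the root.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RevPath:
    """A reverse goal path: the last step plus the path of the enclosing goal.

    None stands for the empty path, i.e. the root goal.
    """
    last_step: str
    earlier: Optional["RevPath"]


def extend(path: Optional[RevPath], step: str) -> RevPath:
    """Path of a subgoal, built by adding one step; the parent's cells are reused."""
    return RevPath(step, path)


def to_forward_steps(path: Optional[RevPath]) -> List[str]:
    """List the steps from the root down to the goal."""
    steps = []
    while path is not None:
        steps.append(path.last_step)
        path = path.earlier
    steps.reverse()
    return steps
```

Two paths built with `extend` from the same parent share that parent object, whereas two goal path strings with a common prefix would each hold their own copy of it.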

Having the compiler generate static terms of the reverse_goal_path type into
the .c files it generates for every Mercury program being compiled with
debugging requires it to have access to the definition of that type and all
its components. The best way to do this is to put all those types into a new
builtin module in the library (a debugging equivalent of e.g.
profiling_builtin.m). We cannot put the definition of the list type into
that module without causing considerable backward incompatibilities.

mdbcomp/mdbcomp.goal_path.m:
	Make the change described above.

	Add some more predicates implementing abstract operations on goal
	paths.

browser/declarative_tree.m:
compiler/goal_path.m:
compiler/goal_util.m:
compiler/hlds_goal.m:
compiler/introduce_parallelism.m:
compiler/mode_ordering.m:
compiler/push_goals_together.m:
compiler/rbmm.condition_renaming.m:
compiler/trace_gen.m:
compiler/tupling.m:
compiler/unneeded_code.m:
deep_profiler/autopar_costs.m:
deep_profiler/autopar_reports.m:
deep_profiler/autopar_search_callgraph.m:
deep_profiler/autopar_search_goals.m:
deep_profiler/create_report.m:
deep_profiler/message.m:
deep_profiler/program_representation_utils.m:
deep_profiler/read_profile.m:
deep_profiler/recursion_patterns.m:
deep_profiler/var_use_analysis.m:
	Conform to the change in representation. In some cases, remove
	predicates whose only job was to manipulate wrappers. In others,
	replace concrete operations on lists of steps with abstract operations
	on goal paths.

compiler/mode_constraints.m:
	Comment out some code that I do not understand, which I think never
	worked (not surprising, since the whole module has never been
	operational).

mdbcomp/slice_and_dice.m:
	Since this diff changes the types representing goal paths, it also
	changes their default ordering, as implemented by builtin.compare.
	When ordering slices and dices by goal paths, make the ordering
	explicitly work on the forward goal path, since ordering by the
	reverse goal path (the actual data being used) gives nonintuitive
	results.

library/list.m:
	Speed up some code.

mdbcomp/feedback.automatic_parallelism.m:
	Fix some formatting.
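
The slice_and_dice.m note above is easier to see with a small example. The sketch below is Python with a hypothetical representation (steps as `(kind, number)` pairs, reverse paths listing the innermost step first); it is not the mdbcomp code. Sorting reverse paths directly compares innermost steps first, which interleaves goals from different parts of a procedure body, so the sort key is the forward path instead.

```python
from typing import List, Tuple

# Steps are hypothetical (kind, number) pairs; a reverse path lists the
# innermost step first, a forward path lists the outermost step first.
Step = Tuple[str, int]
ReversePath = List[Step]


def forward(path: ReversePath) -> List[Step]:
    """Convert a reverse goal path to the forward path it denotes."""
    return list(reversed(path))


def sort_by_forward_path(paths: List[ReversePath]) -> List[ReversePath]:
    """Order reverse paths by the forward path they denote."""
    return sorted(paths, key=forward)
```

With this key, all goals inside the first conjunct sort before all goals inside the second, which is the order a reader of the source would expect.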
2011-09-26 07:08:58 +00:00
Zoltan Somogyi
59b0edacbe New module for calculating the overlap between the conjuncts of a
Estimated hours taken: 2

deep_profiler/autopar_calc_overlap.m:
	New module for calculating the overlap between the conjuncts of a
	parallelised conjunction. Its contents are taken from the old
	autopar_search_callgraph.m.

deep_profiler/autopar_costs.m:
	New module for calculating the costs of goals. Its contents
	are taken from the old autopar_search_callgraph.m.

deep_profiler/autopar_reports.m:
	New module for creating reports. Its contents are taken from
	the old autopar_search_callgraph.m.

deep_profiler/autopar_search_goals.m:
	New module for searching goals for parallelizable conjunctions.
	Its contents are taken from the old autopar_search_callgraph.m.

deep_profiler/autopar_search_callgraph.m:
	Remove the code moved to other modules.

deep_profiler/mdprof_fb.automatic_parallelism.m:
	Add the new modules.

deep_profiler/*.m:
	Remove unnecessary imports.
	Fix copyright years on the new modules.

browser/*.m:
compiler/*.m:
mdbcomp/*.m:
	Remove unnecessary imports.

library/Mercury.options:
	Make it possible to compile a whole workspace with
	--warn-unused-imports by turning that option off for type_desc.m
	(which has a necessary import that --warn-unused-imports thinks
	is unused).
2011-01-27 08:03:54 +00:00
Zoltan Somogyi
f3f3a6f0d3 Create candidate parallel conjunctions that require pushing goals.
Estimated hours taken: 6

Create candidate parallel conjunctions that require pushing goals.
2011-01-20 05:37:12 +00:00
Paul Bone
2070f42b24 Refactor goal annotations in the deep profiler.
Goal annotations have previously been attached to goals using type-polymorphism
and in some cases type classes.  This has become clumsy as new annotations are
created.  Using the goal_id code introduced recently, this change associates
annotations with goals by storing them in an array indexed by goal ids.  Many
analyses have been updated to make use of this code.  This code should also be
faster as less allocation is done when annotating a goal as the goal
representation does not have to be reconstructed.
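
A minimal sketch of the idea, in Python rather than Mercury: the class name `GoalAttrArray` and its methods are hypothetical stand-ins for the goal attribute arrays described here. Annotations live in an array indexed by goal id, so adding one never rebuilds the goal representation.

```python
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class GoalAttrArray(Generic[T]):
    """Annotations for the goals of one procedure, indexed by goal id."""

    def __init__(self, num_goals: int) -> None:
        self._attrs: List[Optional[T]] = [None] * num_goals

    def set(self, goal_id: int, attr: T) -> None:
        """Attach an annotation to the goal with the given id."""
        self._attrs[goal_id] = attr

    def get(self, goal_id: int) -> T:
        """Look up the annotation of a goal; fail loudly if none was set."""
        attr = self._attrs[goal_id]
        if attr is None:
            raise KeyError(f"goal {goal_id} has no annotation")
        return attr
```

Several analyses can each keep their own array over the same goal ids (for example one for coverage and one for instmaps) without the goal type needing a type parameter for every combination.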

mdbcomp/mdbcomp.goal_path.m:
    Add predicates for working with goal attribute arrays.  These are
    polymorphic arrays that are indexed by goal id and can be used to associate
    information with goals.

deep_profiler/report.m:
    The procrep coverage info report now stores the coverage annotations in a
    goal_attr_array.

deep_profiler/coverage.m:
    The coverage analysis now returns its result in a goal_attr_array rather
    than by annotating the goal directly.

    The interface for the coverage module has changed: it now allows
    programmers to pass a goal_rep to it directly.  This makes it easier to
    call from other analyses.

    The coverage analysis no longer uses the calls_and_exits structure.
    Instead it uses the cost_and_callees structure like many other analyses.
    This also makes it easier to perform this annotation and others using only
    a single call site map structure.

    Moved add_coverage_point_to_map/5 from create_report.m to coverage.m.

deep_profiler/analysis_utils.m:
    Made cost_and_callees structure polymorphic so that any type can be used to
    represent the callees.  (So that either static or dynamic callees can be
    used).

    Added the number of exit port counts to the cost_and_callees structure.

    Added build_static_call_site_cost_and_callees_map/4.

    Rename build_call_site_cost_and_callees_map/4 to
    build_dynamic_call_site_cost_and_callees_map/4.

deep_profiler/var_use_analysis.m:
    Update the var_use_analysis to use coverage information provided in a
    goal_attr_array.

deep_profiler/recursion_patterns.m:
    Update the recursion analysis to use coverage information provided in a
    goal_attr_array.

deep_profiler/program_representation_utils.m:
    Add label_goals/4 to label goals with goal ids and build a map of goal ids
    to goal paths.

    Update pretty printing functions to work with either annotation on the
    goals themselves or provided by a higher order value.  The higher order
    argument maps nicely to the function goal_get_attribute/3 in goal_path.m.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Modify goal_annotate_with_instmap; it now returns the instmap annotations
    in a goal_attr_array.

    Conform to changes in:
        program_representation_utils.m
        coverage.m
        var_use_analysis.m

deep_profiler/message.m:
    Updated messages to more correctly express the problems that
    mdprof_fb.automatic_parallelism.m may encounter.

deep_profiler/create_report.m:
    Conform to changes in coverage.m.

    Make use of code in analysis_utils.m to prepare call site maps for coverage
    analysis.

deep_profiler/recursion_patterns.m:
deep_profiler/var_use_analysis.m:
    Conform to changes in analysis_utils.m.

deep_profiler/display_report.m:
    Conform to changes in program_representation_utils.m.
2011-01-17 01:47:19 +00:00
Paul Bone
d43239d6a7 Move some of the goal path code from compiler/goal_path.m to the mdbcomp
library where it can be used by the deep profiler.

Also move the goal path code from program_representation.m to the new module,
goal_path.m in mdbcomp/

mdbcomp/goal_path.m:
    New module containing goal path code.

mdbcomp/program_representation.m:
    Original location of goal path code.

compiler/goal_path.m:
    Move some of this goal_path code into mdbcomp/goal_path.m

mdbcomp/feedback.automatic_parallelism.m:
mdbcomp/rtti_access.m:
mdbcomp/slice_and_dice.m:
mdbcomp/trace_counts.m:
browser/debugger_interface.m:
browser/declarative_execution.m:
browser/declarative_tree.m:
compiler/build_mode_constraints.m:
compiler/call_gen.m:
compiler/code_info.m:
compiler/continuation_info.m:
compiler/coverage_profiling.m:
compiler/deep_profiling.m:
compiler/format_call.m:
compiler/goal_path.m:
compiler/goal_util.m:
compiler/hlds_data.m:
compiler/hlds_goal.m:
compiler/hlds_out_goal.m:
compiler/hlds_out_pred.m:
compiler/hlds_pred.m:
compiler/interval.m:
compiler/introduce_parallelism.m:
compiler/layout_out.m:
compiler/llds.m:
compiler/mode_constraint_robdd.m:
compiler/mode_constraints.m:
compiler/mode_ordering.m:
compiler/ordering_mode_constraints.m:
compiler/polymorphism.m:
compiler/post_typecheck.m:
compiler/prog_rep.m:
compiler/prop_mode_constraints.m:
compiler/push_goals_together.m:
compiler/rbmm.condition_renaming.m:
compiler/smm_common.m:
compiler/stack_layout.m:
compiler/stack_opt.m:
compiler/trace_gen.m:
compiler/tupling.m:
compiler/type_constraints.m:
compiler/typecheck.m:
compiler/unify_gen.m:
compiler/unneeded_code.m:
deep_profiler/Mmakefile:
deep_profiler/analysis_utils.m:
deep_profiler/coverage.m:
deep_profiler/create_report.m:
deep_profiler/display_report.m:
deep_profiler/dump.m:
deep_profiler/mdprof_fb.automatic_parallelism.m:
deep_profiler/message.m:
deep_profiler/old_query.m:
deep_profiler/profile.m:
deep_profiler/program_representation_utils.m:
deep_profiler/read_profile.m:
deep_profiler/recursion_patterns.m:
deep_profiler/report.m:
deep_profiler/var_use_analysis.m:
slice/Mmakefile:
slice/mcov.m:
    Conform to the move of the goal path code.
2011-01-13 00:36:56 +00:00
Zoltan Somogyi
a2cd0da5b3 The existing representation of goal_paths is suboptimal for several reasons.
Estimated hours taken: 80
Branches: main

The existing representation of goal_paths is suboptimal for several reasons.

- Sometimes we need forward goal paths (e.g. to look up goals), and sometimes
  we need reverse goal paths (e.g. when computing goal paths in the first
  place). We had two types for them, but

  - their names, goal_path and goal_path_consable, were not expressive, and
  - we could store only one of them in goal_infos.

- Testing whether goal A is a subgoal of goal B is quite error-prone using
  either form of goal paths.

- Using a goal path as a key in a map, which several compiler passes want to
  do, requires lots of expensive comparisons.

This diff replaces most uses of goal paths with goal ids. A goal id is an
integer, so it can be used as a key in faster maps, or even in arrays.
Every goal in the body of a procedure gets its id allocated in a depth first
search. Since we process each goal before we dive into its descendants,
the goal representing the whole body of a procedure always gets goal id 0.
The depth first traversal also builds up a map (the containing goal map)
that tells us the parent goal of every subgoal, with the obvious exception
of the root goal itself. From the containing goal map, one can compute
both reverse and forward goal paths. It can also serve as the basis of an
efficient test of whether the goal identified by goal id A is an ancestor
of another goal identified by goal id B. We don't yet use this test,
but I expect we will in the future.
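
The scheme described above can be sketched as follows. This is Python, not the compiler's code: the `Goal` class, the use of unique goal names as dictionary keys, and the step labels are all assumptions made for illustration. Ids are handed out depth first with each parent numbered before its descendants, and the containing goal map is enough to recover both kinds of goal path and to test ancestry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Goal:
    """A hypothetical goal tree node; each child is reached by a named step."""
    name: str  # assumed unique within one procedure body
    children: List[Tuple[str, "Goal"]] = field(default_factory=list)


def allocate_goal_ids(root: Goal):
    """Number goals depth first, parents before descendants.

    Returns (ids, containing), where ids maps goal name -> goal id and
    containing maps each non-root goal id -> (parent goal id, step from parent).
    """
    ids: Dict[str, int] = {}
    containing: Dict[int, Tuple[int, str]] = {}

    def visit(goal, parent):
        goal_id = len(ids)
        ids[goal.name] = goal_id
        if parent is not None:
            containing[goal_id] = parent
        for step, child in goal.children:
            visit(child, (goal_id, step))

    visit(root, None)
    return ids, containing


def reverse_goal_path(containing, goal_id):
    """Steps from the goal up to the root (innermost step first)."""
    steps = []
    while goal_id in containing:
        goal_id, step = containing[goal_id]
        steps.append(step)
    return steps


def forward_goal_path(containing, goal_id):
    """Steps from the root down to the goal."""
    return list(reversed(reverse_goal_path(containing, goal_id)))


def is_ancestor(containing, a, b):
    """True if goal a is a proper ancestor of goal b."""
    while b in containing:
        b = containing[b][0]
        if b == a:
            return True
    return False
```

Because the root is visited first it always receives id 0, and it is the only goal with no entry in the containing goal map.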

mdbcomp/program_representation.m:
	Add the goal_id type.

	Replace the existing goal_path and goal_path_consable types
	with two new types, forward_goal_path and reverse_goal_path.
	Since these now have wrappers around the list of goal path steps
	that identify each kind of goal path, it is now ok to expose their
	representations. This makes several compiler passes easier to code.

	Update the set of operations on goal paths to work on the new data
	structures.

	Add a couple of step types to represent lambdas and try goals.
	Their omission prior to this would have been a bug for constraint-based
	mode analysis, or any other compiler pass prior to the expansion out
	of lambda and try goals that wanted to use goal paths to identify
	subgoals.

browser/declarative_tree.m:
mdbcomp/rtti_access.m:
mdbcomp/slice_and_dice.m:
mdbcomp/trace_counts.m:
slice/mcov.m:
deep_profiler/*.m:
	Conform to the changes in goal path representation.

compiler/hlds_goal.m:
	Replace the goal_path field with a goal_id field in the goal_info,
	indicating that from now on, this should be used to identify goals.

	Keep a reverse_goal_path field in the goal_info for use by RBMM and
	CTGC. Those analyses were too hard to convert to using goal_ids,
	especially since RBMM uses goal_paths to identify goals in multi-pass
	algorithms that should be one-pass and should not NEED to identify
	any goals for later processing.

compiler/goal_path.m:
	Add predicates to fill in goal_ids, and update the predicates
	filling in the now deprecated reverse goal path fields.

	Add the operations needed by the rest of the compiler
	on goal ids and containing goal maps.

	Remove the option to set goal paths using "mode equivalent steps".
	Constraint based mode analysis now uses goal ids, and can now
	do its own equivalent optimization quite simply.

	Move the goal_path module from the check_hlds package to the hlds
	package.

compiler/*.m:
	Conform to the changes in goal path representation.

	Most modules now use goal_ids to identify goals, and use a containing
	goal map to convert the goal ids to goal paths when needed.
	However, the ctgc and rbmm modules still use (reverse) goal paths.

library/digraph.m:
library/group.m:
library/injection.m:
library/pprint.m:
library/pretty_printer.m:
library/term_to_xml.m:
	Minor style improvements.
2010-12-20 07:47:49 +00:00
Zoltan Somogyi
8a28e40c9b Add the predicates sorry, unexpected and expect to library/error.m.
Estimated hours taken: 2
Branches: main

Add the predicates sorry, unexpected and expect to library/error.m.

compiler/compiler_util.m:
library/error.m:
	Move the predicates sorry, unexpected and expect from compiler_util
	to error.

	Put the predicates in error.m into the same order as their
	declarations.

compiler/*.m:
	Change imports as needed.

compiler/lp.m:
compiler/lp_rational.m:
	Change imports as needed, and some minor cleanups.

deep_profiler/*.m:
	Switch to using the new library predicates, instead of calling error
	directly. Some other minor cleanups.

NEWS:
	Mention the new predicates in the standard library.
2010-12-15 06:30:36 +00:00
Paul Bone
c6d041cbc5 Improve the efficiency of the algorithms that select the best parallelisation of
a conjunction.  Now (by default) the search will stop creating choice points if
it has already created too many choice points.
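
The following sketch shows one way a search can stop creating choice points once a budget is spent. It is Python, not the profiler's code, and it uses a stand-in problem (splitting costs between two workers to minimise the larger total) rather than the real parallelisation cost model; `best_split` and `max_choice_points` are hypothetical names. While budget remains, both alternatives at a decision are explored; afterwards only the greedy alternative is followed.

```python
from typing import List, Tuple


def best_split(costs: List[float], max_choice_points: int) -> Tuple[float, int]:
    """Assign each cost to one of two workers, minimising the larger total.

    Returns (best larger total found, number of choice points created).
    """
    best = float("inf")
    created = 0

    def search(i: int, left: float, right: float) -> None:
        nonlocal best, created
        if max(left, right) >= best:
            return  # bound: this branch cannot beat the best solution so far
        if i == len(costs):
            best = max(left, right)
            return
        cost = costs[i]
        # Try the greedy alternative (the less loaded worker) first.
        if left <= right:
            first, second = (left + cost, right), (left, right + cost)
        else:
            first, second = (left, right + cost), (left + cost, right)
        if created < max_choice_points:
            created += 1           # keep both alternatives open
            search(i + 1, *first)
            search(i + 1, *second)
        else:
            search(i + 1, *first)  # budget spent: commit to the greedy choice

    search(0, 0.0, 0.0)
    return best, created
```

Since the greedy alternative is always tried first, the first complete solution found is the pure greedy one, so a larger budget can only improve on it. A budget of zero turns the whole search into the greedy algorithm.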

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Fix a large number of whitespace problems, such as trailing whitespace at
    the end of lines.

    Never attempt to parallelise goals that aren't det or cc_multi.

    Remove the original greedy search; it's now an option in the branch and
    bound search code.  Note that the greedy search algorithm has changed and
    sacrifices more solutions for runtime than before.

    Note that there are bugs remaining in a few cases causing incorrect
    parallel execution times to be calculated for dependent parallelisations.

deep_profiler/mdprof_feedback.m:
    Conform to changes in mdbcomp/feedback.automatic_parallelism.m.

    Update parsing of options for the choice of best parallelisation algorithm.

deep_profiler/branch_and_bound.m:
    Allow branch and bound code to track how many 'alternatives' have been
    created and alter the search in response to this.

    Branch and bound code must now be impure as it may call these impure
    predicates.

    Flush the output stream in debugging trace goals for branch and bound.

deep_profiler/measurements.m:
    Adjust the interface to the parallelisation metrics structure, so that it is
    easier to use with the new parallelisation search code.

    Changes to the goal costs code:
        Rename zero_goal_cost to dead_goal_cost, it is the cost of goals that are
        never executed.

        Modify atomic_goal_cost to take as a parameter the number of calls made to
        this goal.

        add_goal_costs has been renamed to add_goal_costs_seq since it computes
        the cost of a sequential conjunction of goals.

        The goal_cost_csq type has changed to track the number of calls made to
        trivial goals.

deep_profiler/message.m:
    Added a notice message to be used when the candidate parallel conjunction
    is not det or cc_multi.

mdbcomp/feedback.automatic_parallelism.m:
    Modify the alternatives for 'best parallelisation algorithm'.
    This type now represents the new ways of selecting complete vs greedy
    algorithms.

mdbcomp/program_representation.m:
    Add a multi-moded detism_components/3 predicate and refactor
    detism_get_solutions/1 and detism_get_can_fail/1 to call it.

    Add a multi-moded detism_committed_choice/2 predicate and a
    committed_choice type.

    Fix whitespace errors in this file.

library/array.m:
    Modify fetch_items/4 to do bounds checking.  This change helped me track
    down a bug.
2010-12-13 04:31:46 +00:00
Paul Bone
91e60619b0 Remove the concept of 'partitions' from the candidate parallel conjunction
mdbcomp/feedback.automatic_parallelism.m:
    Remove the concept of 'partitions' from the candidate parallel conjunction
    type.  We no longer divide conjunctions into partitions before
    parallelising them.

mdbcomp/feedback.m:
    Increment the feedback format version number.

compiler/implicit_parallelism.m:
    Conform to changes in mdbcomp/feedback.automatic_parallelism.m.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Allow the non-atomic goals to be parallelised against one-another.

    Modify the goal annotations used internally, many annotations used only for
    calls are now used for any goal type.

    Variable use information is now stored in a map from variable name to lazy
    use data for every goal, not just for the arguments of calls.

    Do not partition conjunctions before attempting to parallelise them.

    Make the adjust_time_for_waits tolerate floating point errors more easily.

    Format costs with commas and, in most cases, two decimal places.

deep_profiler/var_use_analysis.m:
    Export a new predicate var_first_use that computes the first use of a
    variable within a goal.  This predicate uses a new typeclass to retrieve
    coverage data from any goal that can implement the typeclass.

deep_profiler/measurements.m:
    Added a new abstract type for measuring the cost of a goal, goal_cost_csq.
    This is like cs_cost_csq except that it can represent trivial goals (which
    don't have a call count).

deep_profiler/coverage.m:
    Added deterministic versions of the get_coverage_* predicates.

deep_profiler/program_representation_utils.m:
    Made initial_inst_map more generic in its type signature.

    Add a new predicate, atomic_goal_is_call/2 which can be used instead of a
    large switch on an atomic_goal_rep value.

deep_profiler/message.m:
    Rename a message type to make it more general, this is required now that we
    compute variable use information for arbitrary goals, not just calls.

library/list.m:
    Add map3_foldl.

NEWS:
    Announced change to list.m.
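
The "map from variable name to lazy use data" mentioned for mdprof_fb.automatic_parallelism.m above can be sketched like this. It is Python, not Mercury, and the `Lazy` class and `build_var_use_map` function are hypothetical. Each variable's use information is wrapped in a suspended computation that runs at most once, and only if something asks for it.

```python
from typing import Callable, Dict, Generic, Iterable, TypeVar

T = TypeVar("T")


class Lazy(Generic[T]):
    """A value computed on first use and cached afterwards."""

    def __init__(self, compute: Callable[[], T]) -> None:
        self._compute = compute
        self._done = False
        self._value = None

    def force(self) -> T:
        """Run the computation if it has not run yet, then return its result."""
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value


def build_var_use_map(variables: Iterable[str],
                      analyse: Callable[[str], T]) -> Dict[str, Lazy[T]]:
    """Map each variable name to a suspended call of analyse on that variable."""
    # The default argument binds the current variable for each closure.
    return {var: Lazy(lambda var=var: analyse(var)) for var in variables}
```

Building the map costs almost nothing; the analysis for a variable runs only when `force` is first called on its entry.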
2010-10-14 04:02:22 +00:00
Paul Bone
881039cfed Correct problems in the automatic parallelism analysis.
This patch fixes various problems, the most significant is the calculation of
variable use information.  The parallelisation analysis uses deep profiling
data.  In other words, profiling data that is attached to context information
referring to not just the procedure but the chain of calls leading to that
invocation of that procedure (modulo recursion).  The variable use analysis did
not use deep profiling data, therefore comparing the time that a variable is
produced with a call to the time in total of that call was not sound, and
sometimes resulted in information that is not possible, such as a variable
being produced or consumed after the call that produces or consumes it has
exited.

This change-set updates the variable use analysis to use deep profiling data to
avoid these problems.  At the same time it provides more accurate information
to the automatic parallelisation pass.  This is possible because of an earlier
change that allowed the coverage data to use deep profiling data.

In its current state, the parallelisation analysis now finishes without errors
and computes meaningful results when analysing a profile of the mercury
compiler's execution.

deep_profiler/report.m:
    The proc var use report is now a call site dynamic var use report.
       1) It now uses deep profiling data.
       2) It makes more sense from the caller's perspective so it's now based
          around a call site rather than a proc.

    Add inst subtypes to the recursion_type type.

deep_profiler/query.m:
    The proc var use query is now a call site dynamic var use query, see
    report.m.

deep_profiler/var_use_analysis.m:
    Fix a bug here and in mdprof_fb.automatic_parallelism.m: If a
    variable is consumed by a call and appears in its argument list more than
    once, take the earliest consumption time rather than the one for the
    earliest argument.

    Variable use analysis now uses recursion_patterns.m to correctly compute
    the cost of recursive calls.  It also uses 'deep' profiler data.

    Only measure variable use relative to the entry into a procedure, rather
    than either relative to the entry or exit.  This allows us to simplify a
    lot of code.

deep_profiler/create_report.m:
    The proc var use info report is now a call site dynamic var use info
    report.

    Move some utility code from here to the new analysis_utils.m module.

deep_profiler/display_report.m:
    Conform to changes in report.m.

    Improve the information displayed for variable first-use time
    reports.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Conform to changes in report.m

    Refactored the walk down the clique tree.  This no longer uses the
    clique reports from the deep profiling tool.

    We now explore the same static procedure more than once.  It may be best to
    parallelise it in some contexts rather than others but for now we assume
    that the benefits in some context are worth the costs without benefit in
    the other contexts.  This is better than reaching a context where it is
    undesirable first and never visiting a case where parallelisation is
    desirable.

    Fix a bug in the calculation of how much parallelisation is used by
    parallelisations in a clique's parents.  This used to trigger an
    assertion.

    Don't try to parallelize anything in the "exception" module.
    There's probably other builtin code we should skip over here.

    Removed an overzealous assertion that was too easily triggered by the
    inaccuracies of IEEE-754 arithmetic.

    Compute variable use information lazily for each variable in each call.  I
    believe that this has made our implementation much faster as it no longer
    computes information that is never used.

    Refactor and move build_recursive_call_site_cost_map to the new
    module analysis_utils.m where it can be used by other analyses.

    Call site cost maps now use the cs_cost_csq type to store costs,
    code in this module now conforms to this change.

    Conform to changes in message.m.

deep_profiler/recursion_patterns.m:
    Export a new predicate, recursion_type_get_maybe_avg_max_depth/2.  This
    retrieves the average maximum recursion depth from recursion types that know
    this information.

    Move code that builds a call site cost map for a procedure to
    analysis_utils.m where it can be used by other analyses.

deep_profiler/analysis_utils.m:
    Added a new module containing various utility predicates for profile
    analysis.

deep_profiler/coverage.m:
    Added an extra utility predicate get_coverage_after/2.

deep_profiler/message.m:
    Each message has a location that it refers to, a new location type has
    been added: call_site_dynamic.

    Added a new warning that can be used to describe when a call site's
    argument's use time cannot be computed.

    Added new predicates for printing out messages whose level is below a
    certain threshold.  These predicates can be called from io trace goals.
    Message levels start at 0 and currently go to 4, more critical messages
    have lower levels.  The desired verbosity level is stored in a module local
    mutable.

deep_profiler/mdprof_feedback.m:
    Move the message printing code from here to message.m.

deep_profiler/old_html_format.m:
deep_profiler/old_query.m:
    Conform to changes in query.m.

mdbcomp/feedback.automatic_parallelism.m:
    Added a new function for computing the 'cpu time' of a parallel
    computation.

library/lazy.m:
    Moved lazy.m from extras to the standard library.

library/list.m:
    Add a new predicate member_index0/3.  Like member/2 except it also gives
    the zero-based index of the current element within the list.

library/maybe.m:
    Add two new insts.
        maybe_yes(I) for the maybe type's yes/1 constructor.
        maybe_error_ok(I) for the maybe_error type's ok/1 constructor.

library/Mercury.options:
    Add a work around for compiling lazy.m with intermodule optimisations.

NEWS:
    Update news file for the addition of lazy.m and the member_index0 predicate
    in list.m

deep_profiler/.cvsignore:
    Ignore feedback.automatic_parallelism.m which is copied by Mmakefile from
    the mdbcomp/ directory.
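
The message-level scheme described under deep_profiler/message.m above can be sketched as follows. This is Python, not the Mercury module: the function names, the default verbosity, and the choice to print a message when its level is less than or equal to the stored verbosity are assumptions made for illustration.

```python
import sys
from typing import TextIO

# Module-local verbosity setting; the default here is an assumption.
_verbosity = 2


def set_verbosity(level: int) -> None:
    """Set the desired verbosity level for later messages."""
    global _verbosity
    _verbosity = level


def maybe_print_message(level: int, text: str, out: TextIO = sys.stderr) -> bool:
    """Print text if its level is within the current verbosity.

    Levels run from 0 (most critical) to 4 (least critical).
    Returns True if the message was printed.
    """
    if level <= _verbosity:
        print(text, file=out)
        return True
    return False
```

Lowering the verbosity silences the less critical messages while level 0 messages continue to appear.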
2010-10-07 02:38:10 +00:00
Paul Bone
1793e3898b Updated the automatic parallelism analysis to use the new recursive call costs
analysis.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Use new clique recursion costs report to give the costs of recursive calls.
    This is more accurate than the current method which is only accurate in some
    less-common situations.

    Refactored the walk through the program's call graph so that it fits more
    neatly with the calculation of recursive calls.  For instance it is
    no longer necessary to know the cost of the call into the current clique.

    Delete a number of predicates that are never called.

deep_profiler/message.m:
    Added a new message type, warning_cannot_compute_cost_of_recursive_calls
    since the new recursive call cost algorithm is incomplete.

deep_profiler/recursion_patterns.m:
    Avoid a thrown exception when trying to retrieve the parent call site of the
    initial clique.

    Fix the calculation of recursion depth.  Name some variables more clearly
    to avoid similar issues.

deep_profiler/report.m:
    Add a clarifying comment to the recursion_type data type to indicate that
    costs are per-call.

mdbcomp/program_representation.m:
    Added a new exported predicate goal_path_inside/3 like goal_path_inside/2
    except that it also returns the goal path of the inner goal relative to the
    outer goal.

    Made goal_path_inside/2 more efficient by using list.remove_suffix rather
    than list.append, which creates a choice point whose second solution always
    fails.  (See the comment on list.append/3 in mode out, in, in.)
2010-09-27 05:51:02 +00:00
Paul Bone
f16e8118bd Implement a linear alternative to the exponential algorithm that determines how
best to parallelise a conjunction.

Made other performance improvements.

mdbcomp/feedback.m:
    Add a field to the candidate_parallel_conjunction_params structure giving
    the preference of algorithm.

    Simplify the parallel exec metrics type here.  It is now used only to
    summarise information that has already been calculated.  The original code
    has been moved into deep_profiler/measurements.m.

    Add a field to the candidate_par_conjunction structure giving the index
    within the conjunction of the first goal in the partition.  This is used
    for pretty-printing parallelisation reports.

    Incremented the feedback format version number.

deep_profiler/measurements.m:
    Move the original parallel exec metrics type and code here from
    mdbcomp/feedback.m.

deep_profiler/create_report.m:
    Avoid a performance issue by memoizing create_proc_var_use_dump_report,
    which is called by the analysis for the same procedure (at different
    dynamic call sites) many times.  In simple cases this more than halved the
    execution time; in more complicated cases it should perform even better.

    Conform to changes in coverage.m.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Implement the linear algorithm for parallelising a conjunction.

    Since we don't do parallelism specialisation, don't try to parallelise the
    same procedure more than once.  This should avoid some performance problems
    but I haven't tested it.

    If it is impossible to generate an independent parallelisation, generate a
    dependent one and then report it as something we cannot parallelise.  This
    can help programmers write more independent code.

    Use directed graphs rather than lookup maps to track dependencies.  This
    simplifies some code as the digraph standard library module already has
    code to compute reverse graphs and transitive closures of the graphs.

    Since there are now two parallelisation algorithms, code common to both of
    them has been factored out.

    The objective function used by the branch and bound search has been
    modified to take into account the overheads of parallel execution.  It is:
        minimise(ParTime + ParOverheads * 2.0)
    This way we allow the overheads to increase by 1 csc provided that it
    reduces ParTime by more than 2 csc.  (csc = call sequence counts)

    When pretty-printing parallelisation reports, print each goal in the
    parallelised conjunction with its new goal path.  This makes debugging
    easier for large procedures.

    Fix a bug where the goal path of scope goals was calculated incorrectly;
    this led to a thrown exception in the coverage analysis code when it used
    the goal path to look up the call site of a call.

deep_profiler/mdprof_feedback.m:
    Support a new command line option for choosing which algorithm to use.
    Additionally, if the exponential algorithm was chosen but the problem is
    above a certain size, the linear algorithm will be used instead.  Both the
    choice of algorithm and the fallback threshold are configurable.

    Print the user's choice of algorithm as part of the candidate parallel
    conjunctions report.

deep_profiler/message.m:
    Add an extra log message type for exceptions thrown during auto
    parallelisation.

deep_profiler/program_representation_utils.m:
    The goal_rep pretty printer now prints the goal path for each goal.

deep_profiler/coverage.m:
    procrep_annotate_with_coverage now catches and returns exceptions in a
    maybe_error result.

deep_profiler/cliques.m:
    Copy predicates from the standard library into cliques.m to prevent the
    lack of tail recursion from blowing the stack in some cases.  (cliques.m is
    compiled with --trace minimum).

deep_profiler/callgraph.m:
    Copy list.foldl from the standard library into callgraph.m and transform it
    so that it is less likely to smash the stack in non tail-recursive grades.

deep_profiler/read_profile.m:
    Transform read_nodes so that it is less likely to smash the stack in non
    tail-recursive grades.

deep_profiler/Mercury.options:
    Removed old options that were used to work around a bug.  The bug still
    exists but the workaround moved into the compiler long ago.
2010-08-04 02:25:02 +00:00
Paul Bone
b7f0270f36 Implement the new algorithm for calculating how dependent parallel conjuncts'
executions overlap.  This algorithm has also been generalised to handle cases
where there are more than two conjuncts in a parallel conjunction.  A number of
other improvements have also been made.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Wrote the dependent parallel conjunction overlap analysis algorithm (as
    above).

    This algorithm introduced a new structure, parallel_execution_overlap.
    This structure describes how dependent parallel executions overlap.

    Use both sparking cost and sparking delay as costs of parallelisation.
    Sparking cost is the cost from the perspective of the sparker, whereas
    delay is the delay between creating the spark and actually beginning the
    execution of the spark.

    Handle pretty-printing of the candidate parallel conjunction structure.

    Include variable identifiers as well as canonical names in the
    pardgoal_type structure.

    The inst_map_info structure has been modified to contain the sets of
    consumed and produced variables separately, rather than simply containing a
    set of all consumed and produced variables.

    Improve the readability of messages printed by trace goals.

    The search code no longer attempts to look up procedure bodies for code
    whose module is "Mercury runtime".

    Conjunctions that did not have a speedup due to parallelisation are now
    printed out by a new trace goal.

deep_profiler/mdprof_feedback.m:
    Include support for pretty printing the feedback information after
    creating it.  This is handled by the new --report command line option.

    Include a new --implicit-parallelism-sparking-delay command line option.
    This may be used to specify how long it takes an engine to steal a spark.

mdbcomp/feedback.m:
    Export the sparking delay as part of the feedback information.

    Create a new structure parallel_exec_metrics which contains many metrics
    about parallel execution performance.  This is exported for each candidate
    parallel conjunction rather than only exporting the Speedup.

    Create predicates for creating and querying the parallel_exec_metrics
    structure.

    Create a new predicate, get_all_feedback_data/2, which is used to retrieve
    all the data for building the report in the mdprof_feedback tool.

    Increment the feedback file format version number.

deep_profiler/message.m:
    Improve the readability of the messages printed due to verbosity settings.

    Export some predicates that can be used for managing indentation while
    pretty-printing structures.

compiler/implicit_parallelism.m:
    Conform to changes in feedback_data_candidate_parallel_conjunctions.

    Add a pi_sparking_delay field to parallelism information.

deep_profiler/program_representation_utils.m:
    Fix a bug in calc_inst_map_delta/3.

    Correct a comment for inst_map_ground_vars/5.

deep_profiler/cliques.m:
    Fixed a minor indentation issue.

deep_profiler/Mercury.options:
    Document the new trace goal that enables printing of candidate parallel
    conjunctions that do not result in a speedup.
2010-04-30 13:09:54 +00:00
Paul Bone
79c3f39a68 Implicit parallelism work.
The implicit parallelism algorithm, feedback file format and therefore compiler
have been updated.  They now support parallelisation across other goals and, in
theory, parallelising three or more calls against one another.  The algorithm
is far from complete and very much experimental; it has been tested on a
modified version of icfp_2000 where it improves the runtime.  Note that
automatic parallelisation of dependent conjunctions is disabled for now.

mdbcomp/feedback.m:
    Modify deep profiling feedback data: a candidate parallel conjunct now
    contains a list of sequential conjunctions that contain other goals.
    Previously only two calls to be parallelised against one another were
    listed.

    Document restrictions on the new candidate parallel conjunct structure that
    can't be expressed by the type system.

    Incremented the feedback file format number.

mdbcomp/program_representation.m:
    Made a semidet predicate version of empty_goal_path.

    Created maybe_search_var_name, which returns its result wrapped in a maybe
    structure; this is a deterministic alternative to search_var_name.  It is
    useful as an argument to list.map.

deep_profiler/mdprof_feedback.m:
    When printing messages out to stderr also print the newlines between the
    messages to stderr.

deep_profiler/measurements.m:
    Rearranged the order of arguments and added a four argument version for
    sub_computation_parallelism.

    Added a new function, some_parallelism/1, that initialises a parallelism
    amount as.

deep_profiler/message.m:
    Added extra messages.

    Pretty-print program locations using the conventional string representation
    for procedures and goal paths.  Export the predicate that does this.

deep_profiler/program_representation_utils.m:
    Export a predicate to format a procedure identifier nicely.

    Add code for calculating and manipulating inst_map_delta objects similar to
    those in the compiler.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Various code cleanups/simplifications.

    Re-worked the parallelisation algorithm, it can now parallelise across
    cheaper calls and (theoretically) handle parallel conjunctions with any
    number of conjuncts.

    Conform to new candidate parallel conjunction representation.

    Internally use a structure similar to the candidate parallel conjunct
    structure in feedback.m.  This makes the maybe_call_conjunct structure
    obsolete; the old structure has been removed.

compiler/implicit_parallelism.m:
    Updated implicit parallelism transformation to conform to the new feedback
    file format.

compiler/goal_util.m:
    Added goal_is_atomic/2.
    Modified create_conj_from_list to simply return the only goal in the list
    when the list contains exactly one goal.

library/maybe.m:
    Add a simple predicate (maybe_is_yes/2) that 'opens' a maybe and returns
    the result or fails.

NEWS:
    Announce maybe_is_yes/2
2010-01-09 05:49:41 +00:00
Paul Bone
10a612bd68 Improve the performance of the automatic parallelism analysis.
Estimated hours taken: 1.
Branches: main

Improve the performance of the automatic parallelism analysis.

Profiling this code showed that it spent 90% of its time generating some
error messages used by some sanity checks in the coverage annotation code.
By generating these error messages only when an error occurs the performance
has been increased significantly.

deep_profiler/coverage.m:
	As above.

deep_profiler/message.m:
	Fix an incorrect capitalisation, probably caused by a typo.
2009-04-16 06:15:56 +00:00
Paul Bone
e70295415d Various changes for automatic parallelism, the two major changes are:
Estimated hours taken: 20.
Branches: main

Various changes for automatic parallelism, the two major changes are:

Refactored some of the search for parallel conjunctions to use types that
describe the cost of a call site and the cost of a clique-procedure.  These
new types make it harder for programmers to mistakenly compare a value of one
type with a value of the other.

Where possible, use the body of a clique to determine the cost of recursive
calls at the top level of recursion.  This improves the accuracy of this
calculation significantly.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    As above.

deep_profiler/measurements.m:
    New cost data types as above.

deep_profiler/coverage.m:
    When coverage information completeness tests fail, print out the procedure
    where the coverage information is incomplete.

deep_profiler/message.m:
    Introduce a new warning used in the automatic parallelism analysis.

deep_profiler/profile.m:
    Introduce a semidet version of deep_get_progrep_det.

mdbcomp/program_representation.m:
    Introduce a predicate to return the goal_rep from inside a case_rep
    structure.  This can be used as higher order code to turn a case list into
    a goal list for example.

deep_profiler/Mercury.options:
    Keep a commented out MCFLAGS definition that can be used to enable
    debugging output for the automatic parallelism analysis.
2009-04-02 09:49:27 +00:00
Paul Bone
453a08caab Split up mdprof_feedback.m into three modules: the original, utility code for
Estimated hours taken: 1.5.
Branches: main

Split up mdprof_feedback.m into three modules: the original, utility code for
raising messages, and the code related to automatic parallelism.

deep_profiler/mdprof_fb.m:
deep_profiler/mdprof_fb.automatic_parallelism.m:
deep_profiler/mdprof_feedback.m:
deep_profiler/message.m:
	As above
2009-03-17 06:27:07 +00:00