Commit Graph

30 Commits

Author SHA1 Message Date
Zoltan Somogyi
13b6f03f46 Module qualify end_module declarations.
compiler/*.m:
    Module qualify the end_module declarations. In some cases, add them.

compiler/table_gen.m:
    Remove an unused predicate, and inline another in the only place
    where it is used.

compiler/add_pragma.m:
    Give some predicates more meaningful names.
2014-09-04 00:24:52 +02:00
Zoltan Somogyi
3c16f614df Remove Jerome Tannier's old implicit parallelization transformation, since it
Estimated hours taken: 0.3
Branches: main, release

Remove Jerome Tannier's old implicit parallelization transformation, since it
is obsolete, not useful even as a baseline for comparisons (due to the compiler
aborts it generates), and a maintenance burden.

Divide the remainder of implicit_parallelism.m into two submodules.

compiler/implicit_parallelism.m:
	Make this file a package containing no code.

	Add a comment about where to find Jerome's code.

compiler/introduce_parallelism.m:
	The rest of implicit_parallelism.m from a week ago.

compiler/push_goals_together.m:
	The transformation I recently added to implicit_parallelism.m.

compiler/options.m:
	Remove the option calling for Jerome's transformation.

compiler/mercury_compile_middle_passes.m:
	Conform to the changes above.

compiler/follow_code.m:
	Remove some obsolete imports.

compiler/notes/compiler_design.html:
	Document the new modules, as well as implicit_parallelism.m
	(which should have already been listed, but wasn't.)
2011-01-04 05:01:37 +00:00
Zoltan Somogyi
91314c58d8 Add support for pushing expensive goals in different conjunctions
Estimated hours taken: 20
Branches: main

compiler/implicit_parallelism.m:
	Add support for pushing expensive goals in different conjunctions
	into the same conjunction, so we can parallelize that conjunction.
	This support is not yet tested, but Paul should now be able to test it.

compiler/follow_code.m:
compiler/goal_util.m:
	Minor style improvements.
2011-01-03 06:25:40 +00:00
Zoltan Somogyi
1c3bc03415 Make the system compile with --warn-unused-imports.
Estimated hours taken: 2
Branches: main, release

Make the system compile with --warn-unused-imports.

browser/*.m:
library/*.m:
compiler/*.m:
	Remove unnecessary imports as flagged by --warn-unused-imports.

	In some files, do some minor cleanup along the way.
2010-12-30 11:18:04 +00:00
Paul Bone
0c42f810c2 Start working on the 'goal push' feedback.
This feedback information is part of automatic parallelisation feedback.  It
describes cases where goals after a branch goal but in the same conjunction
should be pushed into the branches of the branching goal.  This can allow the
pushed goal to be parallelised against goals that already exist in one or more
arms of the branch goal without parallelising the whole branch goal.

This change simply creates the data-structures within the feedback framework on
which this feature will be based.

mdbcomp/feedback.automatic_parallelism.m:
    Introduce new push_goal structure that describes the transformation.

mdbcomp/feedback.m:
    Incremented feedback format version number.

deep_profiler/mdprof_fb.automatic_parallelism.m:
compiler/implicit_parallelism.m:
    Conform to changes in feedback.automatic_parallelism.m.

    The code to generate or use this feedback has not been implemented; that
    will come later.
2010-12-21 12:01:34 +00:00
Zoltan Somogyi
a2cd0da5b3 The existing representation of goal_paths is suboptimal for several reasons.
Estimated hours taken: 80
Branches: main

The existing representation of goal_paths is suboptimal for several reasons.

- Sometimes we need forward goal paths (e.g. to look up goals), and sometimes
  we need reverse goal paths (e.g. when computing goal paths in the first
  place). We had two types for them, but

  - their names, goal_path and goal_path_consable, were not expressive, and
  - we could store only one of them in goal_infos.

- Testing whether goal A is a subgoal of goal B is quite error-prone using
  either form of goal paths.

- Using a goal path as a key in a map, which several compiler passes want to
  do, requires lots of expensive comparisons.

This diff replaces most uses of goal paths with goal ids. A goal id is an
integer, so it can be used as a key in faster maps, or even in arrays.
Every goal in the body of a procedure gets its id allocated in a depth first
search. Since we process each goal before we dive into its descendants,
the goal representing the whole body of a procedure always gets goal id 0.
The depth first traversal also builds up a map (the containing goal map)
that tells us the parent goal of every subgoal, with the obvious exception
of the root goal itself. From the containing goal map, one can compute
both reverse and forward goal paths. It can also serve as the basis of an
efficient test of whether the goal identified by goal id A is an ancestor
of another goal identified by goal id B. We don't yet use this test,
but I expect we will in the future.
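
The scheme described above can be sketched roughly as follows. This is an
illustrative Python model, not the compiler's Mercury code: the Goal class,
the function names, and the use of 1-based child positions as goal path steps
are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """Hypothetical goal tree node; `children` are its direct subgoals."""
    name: str
    children: list = field(default_factory=list)


def allocate_goal_ids(root):
    """Number goals in depth-first pre-order, so the root gets id 0.

    Returns (names, containing), where names[goal_id] is the goal's name and
    containing maps each non-root goal id to (parent_id, step).
    """
    names = []
    containing = {}

    def visit(goal, parent):
        goal_id = len(names)          # allocate the id before descending
        names.append(goal.name)
        if parent is not None:
            containing[goal_id] = parent
        for step, child in enumerate(goal.children, start=1):
            visit(child, (goal_id, step))

    visit(root, None)
    return names, containing


def reverse_goal_path(containing, goal_id):
    """Steps from the goal up to the root, innermost step first."""
    steps = []
    while goal_id in containing:
        goal_id, step = containing[goal_id]
        steps.append(step)
    return steps


def forward_goal_path(containing, goal_id):
    """Steps from the root down to the goal."""
    return list(reversed(reverse_goal_path(containing, goal_id)))


def is_ancestor(containing, a, b):
    """True if goal id `a` is a proper ancestor of goal id `b`."""
    while b in containing:
        b, _ = containing[b]
        if b == a:
            return True
    return False
```

Because ids are small integers, `names` can be a plain list indexed by goal
id, which is the "faster maps, or even arrays" point made above.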

mdbcomp/program_representation.m:
	Add the goal_id type.

	Replace the existing goal_path and goal_path_consable types
	with two new types, forward_goal_path and reverse_goal_path.
	Since these now have wrappers around the list of goal path steps
	that identify each kind of goal path, it is now ok to expose their
	representations. This makes several compiler passes easier to code.

	Update the set of operations on goal paths to work on the new data
	structures.

	Add a couple of step types to represent lambdas and try goals.
	Their omission prior to this would have been a bug for constraint-based
	mode analysis, or any other compiler pass prior to the expansion out
	of lambda and try goals that wanted to use goal paths to identify
	subgoals.

browser/declarative_tree.m:
mdbcomp/rtti_access.m:
mdbcomp/slice_and_dice.m:
mdbcomp/trace_counts.m:
slice/mcov.m:
deep_profiler/*.m:
	Conform to the changes in goal path representation.

compiler/hlds_goal.m:
	Replace the goal_path field with a goal_id field in the goal_info,
	indicating that from now on, this should be used to identify goals.

	Keep a reverse_goal_path field in the goal_info for use by RBMM and
	CTGC. Those analyses were too hard to convert to using goal_ids,
	especially since RBMM uses goal_paths to identify goals in multi-pass
	algorithms that should be one-pass and should not NEED to identify
	any goals for later processing.

compiler/goal_path.m:
	Add predicates to fill in goal_ids, and update the predicates
	filling in the now deprecated reverse goal path fields.

	Add the operations needed by the rest of the compiler
	on goal ids and containing goal maps.

	Remove the option to set goal paths using "mode equivalent steps".
	Constraint based mode analysis now uses goal ids, and can now
	do its own equivalent optimization quite simply.

	Move the goal_path module from the check_hlds package to the hlds
	package.

compiler/*.m:
	Conform to the changes in goal path representation.

	Most modules now use goal_ids to identify goals, and use a containing
	goal map to convert the goal ids to goal paths when needed.
	However, the ctgc and rbmm modules still use (reverse) goal paths.

library/digraph.m:
library/group.m:
library/injection.m:
library/pprint.m:
library/pretty_printer.m:
library/term_to_xml.m:
	Minor style improvements.
2010-12-20 07:47:49 +00:00
Paul Bone
91e60619b0 Remove the concept of 'partitions' from the candidate parallel conjunction
mdbcomp/feedback.automatic_parallelism.m:
    Remove the concept of 'partitions' from the candidate parallel conjunction
    type.  We no longer divide conjunctions into partitions before
    parallelising them.

mdbcomp/feedback.m:
    Increment the feedback format version number.

compiler/implicit_parallelism.m:
    Conform to changes in mdbcomp/feedback.automatic_parallelism.m.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Allow the non-atomic goals to be parallelised against one-another.

    Modify the goal annotations used internally: many annotations previously
    used only for calls are now used for any goal type.

    Variable use information is now stored in a map from variable name to lazy
    use data for every goal, not just for the arguments of calls.

    Do not partition conjunctions before attempting to parallelise them.

    Make the adjust_time_for_waits tolerate floating point errors more easily.

    Format costs with commas and, in most cases, two decimal places.

deep_profiler/var_use_analysis.m:
    Export a new predicate var_first_use that computes the first use of a
    variable within a goal.  This predicate uses a new typeclass to retrieve
    coverage data from any goal that can implement the typeclass.

deep_profiler/measurements.m:
    Added a new abstract type for measuring the cost of a goal, goal_cost_csq.
    This is like cs_cost_csq except that it can represent trivial goals (which
    don't have a call count).

deep_profiler/coverage.m:
    Added deterministic versions of the get_coverage_* predicates.

deep_profiler/program_representation_utils.m:
    Made initial_inst_map more generic in its type signature.

    Add a new predicate, atomic_goal_is_call/2 which can be used instead of a
    large switch on an atomic_goal_rep value.

deep_profiler/message.m:
    Rename a message type to make it more general, this is required now that we
    compute variable use information for arbitrary goals, not just calls.

library/list.m:
    Add map3_foldl.

NEWS:
    Announced change to list.m.
2010-10-14 04:02:22 +00:00
Zoltan Somogyi
58211e2f2e Allow more than 2^15 vars in a procedure representation.
Estimated hours taken: 12
Branches: main

Allow more than 2^15 vars in a procedure representation.

mdbcomp/program_representation.m:
	Allow a variable number to be represented by four bytes as well as
	two and one. This means that we also have to represent the number
	of variables in a procedure using a four-byte number, not a two-byte
	number.

	Use four bytes to represent line numbers. Programs that overflow
	16-bit var numbers may also overflow 16-bit line numbers.

	These changes require a change in the deep profiler data's binary
	compatibility version number.
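
A rough illustration of width selection for variable numbers, in Python
rather than Mercury. The function names, the big-endian byte order, and the
exact thresholds are assumptions for the sketch; the real encoding lives in
compiler/prog_rep.m and mdbcomp/program_representation.m and may differ.

```python
import struct


def var_num_format(max_var_num):
    """Pick the narrowest signed width that can hold every variable number.

    Assumed thresholds: signed 8-bit, signed 16-bit, then signed 32-bit.
    """
    if max_var_num <= 0x7F:
        return ">b"   # one byte
    if max_var_num <= 0x7FFF:
        return ">h"   # two bytes
    if max_var_num <= 0x7FFFFFFF:
        return ">i"   # four bytes
    raise ValueError("variable number too large for this sketch")


def encode_var_nums(var_nums):
    """Encode all variable numbers of one procedure with a single width."""
    fmt = var_num_format(max(var_nums, default=0))
    return fmt, b"".join(struct.pack(fmt, v) for v in var_nums)
```

A procedure with more than 2^15 variables falls into the four-byte case, which
is also why the variable count itself needs a four-byte field.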

compiler/prog_rep.m:
	Encode vars using four bytes if necessary. Be consistent in using
	only signed 8-bit as well as signed 16-bit numbers.

compiler/implicit_parallelism.m:
	Conform to the change in program_representation.m.

deep_profiler/profile.m:
deep_profiler/read_profile.m:
	Add a compression flag to the set of flags read from the data file.
	Put the flags into the profile_stats as a group, not one-by-one.

deep_profiler/canonical.m:
deep_profiler/create_report.m:
deep_profiler/dump.m:
deep_profiler/mdprof_feedback.m:
deep_profiler/old_html_format.m:
deep_profiler/old_query.m:
deep_profiler/query.m:
	Conform to the change in profile.m.

runtime/mercury_deep_profiling.c:
	Prepare for compression of profiling data files being implemented.

runtime/mercury_stack_layout.h:
	Fix some documentation rot.

runtime/mercury_conf_param.h:
	Add an implication between debug flags to make debugging easier.
2010-10-11 00:49:27 +00:00
Paul Bone
d3011e03b0 Changes that make implicit parallelism easier to test.
compiler/implicit_parallelism.m:
    The implicit parallelism transformation emits a warning if it cannot match
    feedback data to the program being compiled.  With the default
    --halt-at-warn this aborts compilation which is impractical since the user
    cannot easily control the compiler's ability to honour the feedback data.
    For example, the internal representation of the program may be different
    for profiling builds compared to release builds, even with similar
    compilation options.

    Therefore this warning is now informational and it does not cause
    compilation to abort.

tools/speedtest:
    Add a new command line option -1.  This causes the speedtest script to run
    the compiler against a single module only (typecheck.m).  This is useful
    for generating representative Deep.data files for automatic
    parallelisation.
2010-10-09 01:26:56 +00:00
Paul Bone
7425922921 Refactor mdbcomp/feedback.m
Move automatic parallelisation specific code to a new module
mdbcomp/feedback.automatic_parallelism.m.

mdbcomp/feedback.m:
mdbcomp/feedback.automatic_parallelism.m:
	As above.

slice/Mmakefile:
deep_profiler/Mmakefile:
	Copy the new file into the current working directory along with the other
	mdbcomp files.

compiler/implicit_parallelism.m:
deep_profiler/mdprof_fb.automatic_parallelism.m:
deep_profiler/mdprof_feedback.m:
deep_profiler/measurements.m:
	Import the new module to access code that used to be in feedback.m

	Remove unused module imports.
2010-08-24 00:01:47 +00:00
Zoltan Somogyi
543fc6e342 Change the way the typechecker iterates over the predicates of the program.
Estimated hours taken: 12
Branches: main

Change the way the typechecker iterates over the predicates of the program.
We used to do it by looking up each predicate in the module_info,
typechecking it, and putting it back into the module_info. We now do it
by converting the predicate table into a list, iterating over the list
transforming each pred_info in it, and converting the updated list back to
a predicate table.

The original intention of this change was to allow different predicates
to be typechecked in parallel by removing a synchronization bottleneck:
the typechecking of a predicate now doesn't have to wait for the typechecking
of the previous predicate to generate the updated version of the module_info.
However, it turned out that the change is good for sequential execution
as well, improving the time on tools/speedtest from 11.33 seconds to 11.08
seconds, a speedup of 2.2%. On tools/speedtest -l, which tests the compilation
of more modules, the speedup is even better: 3.1% (from 32.63 to 31.60s).
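
The difference between the two iteration strategies can be modelled as
follows. This is a Python sketch with hypothetical names (module_info as a
dict, typecheck_pred as a stand-in); it is not the code in
compiler/typecheck.m.

```python
def typecheck_pred(pred_info):
    """Hypothetical stand-in for typechecking one predicate."""
    return {**pred_info, "typechecked": True}


def typecheck_old(module_info):
    """Old style: look up, typecheck, and put back, one predicate at a time.

    Each iteration needs the module_info produced by the previous one.
    """
    for pred_id in list(module_info["preds"]):
        pred_info = module_info["preds"][pred_id]
        new_pred_info = typecheck_pred(pred_info)
        new_preds = {**module_info["preds"], pred_id: new_pred_info}
        module_info = {**module_info, "preds": new_preds}
    return module_info


def typecheck_new(module_info):
    """New style: table -> list, transform each entry, list -> table.

    No element's transformation depends on another's result, so the list
    could in principle be processed in parallel.
    """
    pairs = list(module_info["preds"].items())
    checked = [(pred_id, typecheck_pred(info)) for pred_id, info in pairs]
    return {**module_info, "preds": dict(checked)}
```

Both functions produce the same table; only the dependency structure between
iterations differs.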

compiler/typecheck.m:
	Implement the above change.

compiler/hlds_module.m:
compiler/pred_table.m:
	Add a new operation, setting the list of valid pred_ids, now needed by
	typecheck.m, to both modules.

	Make the names of the predicates for accessing the predicate table
	more expressive, and make them conform to our naming conventions.

compiler/*.m:
	Trivial changes to conform to the change in hlds_module.m.

library/assoc_list.m:
	Add new predicates used by the new version of typecheck.m
	(at some time in its development).

NEWS:
	Mention the new predicates.

library/list.m:
	Improve documentation that is now copied to assoc_list.m.

tools/speedtest:
	Make the test command more easily configurable.
2010-07-30 05:16:26 +00:00
Paul Bone
531c2d94ea Automatic Parallelisation Improvements.
Factor in all the costs of parallelisation into the parallel overlap estimation
algorithm.  Previously only some costs were being taken into consideration.
Independent parallelisations are now generally preferred as they have fewer
overheads for similar parallelisations.

Generalised the branch and bound search algorithm into a new Mercury module.

mdbcomp/feedback.m:
    Grouped candidate parallel conjunction parameters into a single type.

    Added extra parameters:
        future_signal_cost
        future_wait_cost
        context_wakeup_delay.
    The first two replace locking cost, they are the costs of the signal and
    wait calls for futures respectively.  The third represents the length of
    time for a context to begin executing after it has been placed on the run
    queue.  It is used to estimate the cost of blocking.

    Refactored the parallel_exec_metrics type to make representing overheads easier.

    Modify parallel_exec_metrics so that it can represent the cost of calling
    signal in the left conjunct of any conjunct pair.

    Modify parallel_exec_metrics so that it stores the parallel execution time
    of the initial (leftmost) conjunct.  This is necessary as the parallel
    execution time includes the cost of the 'fork' call of the next conjunct.

    Modify parallel_exec_metrics to record the cost of blocking for the
    leftmost conjunct if it completes before the parallel conjunction completes
    as a whole.

    Increment the feedback file format version number.

compiler/implicit_parallelism.m:
    Conform to changes in mdbcomp/feedback.m.

deep_profiler/branch_and_bound.m:
    A generic branch and bound solver loop and utilities.

    The modified branch and bound code includes a profiling facility.
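
    For readers unfamiliar with the technique, a generic branch and bound loop
    looks roughly like the Python sketch below.  The interface (branch,
    is_complete, cost, lower_bound) and the toy scheduling problem are
    assumptions made for illustration; they are not the API of
    branch_and_bound.m.

```python
def branch_and_bound(root, branch, is_complete, cost, lower_bound):
    """Depth-first branch and bound that minimises `cost`.

    `lower_bound(node)` must never exceed the cost of any complete solution
    reachable from `node`; nodes that cannot beat the incumbent are pruned.
    """
    best, best_cost = None, float("inf")
    stack = [root]
    while stack:
        node = stack.pop()
        if lower_bound(node) >= best_cost:
            continue                      # prune: cannot improve on best
        if is_complete(node):
            node_cost = cost(node)
            if node_cost < best_cost:
                best, best_cost = node, node_cost
        else:
            stack.extend(branch(node))
    return best, best_cost


# Toy problem: assign each task to one of two workers, minimising the
# larger of the two total loads.
TASKS = [3, 5, 2, 4]


def makespan(assignment):
    totals = [0, 0]
    for task_cost, worker in zip(TASKS, assignment):
        totals[worker] += task_cost
    return max(totals)


def solve():
    return branch_and_bound(
        root=(),
        branch=lambda a: [a + (0,), a + (1,)],
        is_complete=lambda a: len(a) == len(TASKS),
        cost=makespan,
        lower_bound=makespan,   # loads only grow, so this is a valid bound
    )
```

    Keeping the loop generic means the search-specific parts are confined to
    the functions passed in.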

deep_profiler/Mercury.options:
    The new branch_and_bound module supports the debug_branch_and_bound trace
    flag.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Generalise and move branch and bound code to branch_and_bound.m

    Removed the candidate_parallel_conjunctions_opts type, we now use the
    candidate_par_conjunctions_params type in its place.

    Modify the code for parallelising conjunctions so that it works with lists
    of goals rather than cords of goals.

    Factor out the code that looks for the next costly call; this is now handled
    by a preprocessing pass so that it has linear time rather than increasing
    the complexity of the search code.

    Documented some predicates in more detail.

deep_profiler/mdprof_feedback.m:
    Conform to changes in deep_profiler/mdprof_fb.automatic_parallelism.m and
    mdbcomp/feedback.m

    Add command line support for the new candidate parallel conjunctions
    feedback parameters.
2010-07-14 00:40:22 +00:00
Paul Bone
c877dceb2b Refactor profiler feedback code for implicit parallelism.
This change mostly re-factors the goal representation used to feed back
implicit parallelism information to the compiler.  The goal_rep datatype is now
used rather than the much simpler datatype used previously.  (goal_rep is the
same type that is used by the declarative debugger.)

This makes it easier for the compiler to match HLDS goals against goals from
the implicit parallelism analysis and will probably help in the future if the
analysis wants the compiler to re-order goals.

It also makes it easier to pretty-print the feedback sent to the compiler in
more detail.

mdbcomp/feedback.m:
    As above, redefine pard_goal as a type alias to
    goal_rep(pard_goal_annotation).

    Added a new type, candidate_par_conjunctions_proc, it represents candidate
    parallelisations within a procedure along with shared information for the
    procedure.

    Add a new predicate, convert_candidate_par_conjunctions_proc.

    Increment the feedback file format version number.

mdbcomp/program_representation.m:
    XXX: See about refactoring bytecode input/output into one place.

    Add a new predicate transform_goal_rep for transforming a goal_rep
    structure from one arbitrary annotation type to another.

    Add extra predicates to aid in converting a prog_rep structure to and from
    bytecode.  This includes cut_byte/2 and can_fail_byte/2.

deep_profiler/program_representation_utils.m:
    Export print_goal_to_strings/4 so that it can be used when printing the
    feedback file reports.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Conform to changes in mdbcomp/feedback.m

    Wrap some lines at 76 characters.

    Improve explanations in comments.

    Use the goal_rep pretty-printer to print the candidate parallel
    conjunctions feedback report.

deep_profiler/mdprof_feedback.m:
    Conform to changes in deep_profiler/mdprof_fb.automatic_parallelism.m

deep_profiler/program_representation_utils.m:
    Modify print_goal_to_strings to print determinisms and annotations on
    separate lines before each goal.

deep_profiler/display_report.m:
    Modify pretty printing of coverage annotations so that they make sense
    after modifying print_goal_to_strings/4.

compiler/implicit_parallelism.m:
    Refactor goal matching code that compares HLDS goals to feedback goals.
    Goal matching is now more accurate and can more easily support goal
    re-ordering when parallelising code (this is not implemented yet).

    The code that builds parallel conjunctions has also been refactored.

    This pass now generates warnings if it is not able to parallelise
    a candidate parallel conjunction in the feedback data.

    Insert deeper and later parallelizations before shallower or earlier ones;
    this makes it easier to continue to parallelise a procedure as its goal
    tree changes due to parallelisation.

    Silently ignore duplicate candidate parallel conjunctions.

    Refuse to parallelise a procedure that has been parallelized explicitly.

compiler/prog_rep.m:
    Re-factor the hlds_goal to bytecode transformation, this transformation now
    goes via goal_rep.  We use the hlds_goal to goal_rep portion of this
    transformation in compiler/implicit_parallelism.m.

    Add variable names prefixed with DCG_ to the list of those introduced by
    the compiler.

compiler/goal_util.m:
    Modify maybe_transform_goal_at_goal_path so that it returns a value that
    can describe the different kinds of error that may be encountered.

    Add a new predicate, maybe_transform_goal_at_goal_path_with_instmap.  Given
    a goal, goal path and initial inst map, this predicate recurses down the
    goal structure following the goal path and maintaining the inst map.  It
    then uses a higher order value to transform the goal at its destination
    before re-constructing the goal.  It is different to
    maybe_transform_goal_at_goal_path in that it passes the instmap to its
    higher order argument; the instmap is correct for the state immediately
    before executing the goal in question.
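
    The traversal described above can be modelled by the Python sketch below.
    It is heavily simplified and hypothetical: goals are tuples, only plain
    conjunctions are handled, an "instmap" is just the set of bound variables,
    and the function names are not those used in goal_util.m.

```python
def instmap_delta(goal):
    """Variables bound by a goal (a simplified stand-in for an instmap delta)."""
    if goal[0] == "atom":                 # ("atom", name, bound_vars)
        return set(goal[2])
    return set().union(*(instmap_delta(g) for g in goal[1]))  # ("conj", goals)


def transform_at_path(goal, path, instmap, transform):
    """Apply `transform(goal, instmap)` to the subgoal named by `path`.

    `path` is a list of 1-based conjunct numbers.  Returns ("ok", new_goal)
    or ("error", reason) when the path does not match the goal's shape.
    """
    if not path:
        return ("ok", transform(goal, instmap))
    step, rest = path[0], path[1:]
    if goal[0] != "conj" or not (1 <= step <= len(goal[1])):
        return ("error", "goal path does not match goal")
    conjuncts = list(goal[1])
    # The instmap seen by conjunct `step` includes everything bound by the
    # conjuncts to its left.
    for earlier in conjuncts[:step - 1]:
        instmap = instmap | instmap_delta(earlier)
    result = transform_at_path(conjuncts[step - 1], rest, instmap, transform)
    if result[0] == "error":
        return result
    conjuncts[step - 1] = result[1]
    return ("ok", ("conj", conjuncts))   # re-construct the enclosing goal
```

    The transform receives the instmap as it stands immediately before the
    target goal, which is what distinguishes this variant from the version
    without an instmap.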

compiler/hlds_pred.m:
    Include the procedure's varset in the information used to construct the
    program representation data that is included in deep profiling builds.

compiler/instmap.m:
    Add a useful function, apply_instmap_delta_sv.  This is the same as
    apply_instmap_delta except that its arguments are in a more convenient
    order for state variable notation.

compiler/stack_layout.m:
    Export compute_var_number_map for the use of implicit_parallelism.m and
    prog_rep.m

compiler/error_util.m:
    Add a new error phase, 'phase_auto_parallelism'.  This is used for warnings
    issued from the automatic parallelisation transformation.

compiler/deep_profiling.m:
    Conform to changes in hlds_pred.m

compiler/mercury_compile_middle_passes.m:
    Conform to changes in implicit_parallelism.m

compiler/type_constraints.m:
    Conform to changes in goal_util.
2010-07-04 10:24:09 +00:00
Paul Bone
452dcd116c mdprof_feedback improvements.
Add an option to mdprof_feedback to print the feedback report without modifying
it.  This option also avoids reading and parsing a Deep.data file, this makes
it quick and convenient if you just wish to view the feedback report.

deep_profiler/mdprof_feedback.m:
    As above.

    These changes make it necessary for the feedback_info structure to store
    the program's name that the feedback is generated for.  mdprof_feedback now
    also checks that the program names in the feedback file, and deep profiling
    data match.

mdbcomp/feedback.m:
    Store the name of the program in the feedback_info structure and provide
    methods to query this.

    read_or_create now takes a new parameter, the name of the program that
    we're creating a feedback file for or, if a feedback file already exists,
    the name to check against the one in the existing feedback file.

    init_feedback_file now takes a new parameter, the name of the program that
    this feedback_info structure is for.

    These changes haven't changed the format of the feedback file, it always
    contained the program's name.  Therefore the feedback file version number
    has not been incremented.

compiler/globals.m:
    The feedback field in the compiler's globals structure now has the type
    maybe(feedback).  If feedback data couldn't be read, or wasn't read, empty
    feedback data is no longer used in its place.

compiler/handle_options.m:
    Conform to changes in mdbcomp/feedback.m and compiler/globals.m.

compiler/implicit_parallelism.m:
    Conform to changes in compiler/globals.m
2010-05-13 02:27:38 +00:00
Paul Bone
b7f0270f36 Implement the new algorithm for calculating how dependant parallel conjuncts'
executions overlap.  This algorithm has also been generalised to handle cases
where there are more than two conjuncts in a parallel conjunction.  A number of
other improvements have also been made.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Wrote dependant parallel conjunction overlap analysis algorithm (as above).

    This algorithm introduced a new structure, parallel_execution_overlap.
    This structure describes how dependant parallel executions overlap.

    Use both sparking cost and sparking delay as costs of parallelisation.
    Sparking cost is the cost from the perspective of the sparker, whereas
    delay is the delay between creating the spark and actually beginning the
    execution of the spark.

    Handle pretty-printing of the candidate parallel conjunction structure.

    Include variable identifiers as well as canonical names in the
    pardgoal_type structure.

    The inst_map_info structure has been modified to contain the sets of
    consumed and produced variables separately, rather than simply containing a
    set of all consumed and produced variables.

    Improve the readability of messages printed by trace goals.

    The search code no longer attempts to look up procedure bodies for code
    whose module is "Mercury runtime".

    Conjunctions that did not have a speedup due to parallelisation are now
    printed out by a new trace goal.

deep_profiler/mdprof_feedback.m:
    Include support for pretty printing the feedback information after
    creating it.  This is handled by the new --report command line option.

    Include a new --implicit-parallelism-sparking-delay command line option.
    This may be used to specify how long it takes an engine to steal a spark.

mdbcomp/feedback.m:
    Export the sparking delay as part of the feedback information.

    Create a new structure parallel_exec_metrics which contains many metrics
    about parallel execution performance.  This is exported for each candidate
    parallel conjunction rather than only exporting the Speedup.

    Create predicates for creating and querying the parallel_exec_metrics
    structure.

    Create a new predicate, get_all_feedback_data/2, this is used to retrieve
    all the data for building the report in the mdprof_feedback tool.

    Increment the feedback file format version number.

deep_profiler/message.m:
    Improve the readability of the messages printed due to verbosity settings.

    Export some predicates that can be used for managing indentation while
    pretty-printing structures.

compiler/implicit_parallelism.m:
    Conform to changes in feedback_data_candidate_parallel_conjunctions.

    Add a pi_sparking_delay field to parallelism information.

deep_profiler/program_representation_utils.m:
    Fix a bug in calc_inst_map_delta/3.

    Correct a comment for inst_map_ground_vars/5.

deep_profiler/cliques.m:
    Fixed a minor indentation issue.

deep_profiler/Mercury.options:
    Document the new trace goal that enables printing of candidate parallel
    conjunctions that do not result in a speedup.
2010-04-30 13:09:54 +00:00
Paul Bone
3d6770a091 Refactor feedback parallelisation code.
These changes rename some poorly named types from inner_goal to pard_goal.
'pard' means 'parallelised'.  This is explained in a comment near this type.
The candidate_par_conjunction type has been made polymorphic on the type that
it uses to represent individual goals.  This is easier than using slightly
different candidate_par_conjunction types in different modules.

mdbcomp/feedback.m:
    Changes to types as above.

    Introduce predicates to convert candidate_par_conjunctions from one type to
    another given a function to convert the type of goal used.

    Increment the feedback file format version number.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Remove our alternative candidate_par_conjunction types in favor of the
    polymorphic type in feedback.m

    Rename the type inner_goal_internal to pard_goal_detail.

    Rename occurrences of inner_goal or InnerGoal to pard_goal or PardGoal.

    Use the generic conversion code in feedback.m to convert between different
    types of candidate_par_conjunction.

    Conform to changes in mdbcomp/feedback.m

compiler/implicit_parallelism.m:
    Rename occurrences of inner_goal or InnerGoal to pard_goal or PardGoal.

    Conform to changes in mdbcomp/feedback.m
2010-03-25 01:17:03 +00:00
Paul Bone
25ed5e004d Add an option to mdprof_feedback to control whether the automatic
parallelisation feedback will recommend parallelising dependant conjunctions or
not.  The compiler will now parallelise both independent and dependant
conjunctions.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Add a field to the candidate_parallel_conjunctions_opts structure to
    represent whether we should parallelise dependant conjunctions.

    Use this flag to determine if a dependant conjunction should be recommended
    for parallelisation in innergoals_build_candidate_conjunction.

deep_profiler/mdprof_feedback.m:
    Add the actual command line argument.

    Update the --help message.

    Conform to changes in mdprof_fb.automatic_parallelism.m.

compiler/implicit_parallelism.m:
    Previously the compiler would not automatically parallelise dependant
    conjunctions, this restriction has been removed as the control is now
    available in the mdprof_feedback tool.
2010-01-27 04:48:07 +00:00
Paul Bone
79c3f39a68 Implicit parallelism work.
The implicit parallelism algorithm, feedback file format and therefore compiler
have been updated.  They now support parallelisation across other goals and, in
theory, parallelising three or more calls against one another.  The algorithm
is far from complete and very much experimental, it has been tested on a
modified version of icfp_2000 where it improves the runtime.  Note that
automatic parallelisation of dependant conjunctions is disabled for now.

mdbcomp/feedback.m:
    Modify deep profiling feedback data, a candidate parallel conjunct now
    contains a list of sequential conjunctions that contain other goals.
    Previously only two calls to be parallelised against one-another were
    listed.

    Document restrictions on the new candidate parallel conjunct structure that
    can't be expressed by the type system.

    Incremented the feedback file format number.

mdbcomp/program_representation.m:
    Made a semidet predicate version of empty_goal_path.

    Created maybe_search_var_name which returns its result wrapped in a maybe
    structure; this is a deterministic alternative to search_var_name.  It is
    useful as an argument to list.map.

deep_profiler/mdprof_feedback.m:
    When printing messages out to stderr also print the newlines between the
    messages to stderr.

deep_profiler/measurements.m:
    Re-arranged the order of arguments and added a four argument version for
    sub_computation_parallelism.

    Added a new function, some_parallelism/1, that initialises a parallelism
    amount as.

deep_profiler/message.m:
    Added extra messages.

    Pretty-print program locations using the conventional string representation
    for procedures and goal paths.  Export the predicate that does this.

deep_profiler/program_representation_utils.m:
    Export a predicate to format a procedure identifier nicely.

    Add code for calculating and manipulating inst_map_delta objects similar to
    those in the compiler.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Various code cleanups/simplifications.

    Re-worked the parallelisation algorithm, it can now parallelise across
    cheaper calls and (theoretically) handle parallel conjunctions with any
    number of conjuncts.

    Conform to new candidate parallel conjunction representation.

    Internally use a structure similar to the candidate parallel conjunct
    structure in feedback.m.  This makes the maybe_call_conjunct structure
    obsolete; the old structure has been removed.

compiler/implicit_parallelism.m:
    Updated implicit parallelism transformation to conform to the new feedback
    file format.

compiler/goal_util.m:
    Added goal_is_atomic/2.
    Modified create_conj_from_list to simply return the only goal in the list
    when the list contains exactly one goal.

library/maybe.m:
    Add a simple predicate (maybe_is_yes/2) that 'opens' a maybe and returns
    the result or fails.

NEWS:
    Announce maybe_is_yes/2
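
As an illustration of the maybe_is_yes/2 entry above, here is a minimal
sketch of how a semidet "unwrap" predicate can be combined with
list.filter_map to keep only the yes/1 values in a list. The module name
and the sample data are hypothetical, and the mode assumed for maybe_is_yes
(maybe(T) in, T out, semidet) is inferred from the description above rather
than quoted from the diff.

```mercury
:- module maybe_is_yes_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module list.
:- import_module maybe.

main(!IO) :-
    % Hypothetical sample data: a mix of present and absent values.
    Maybes = [yes(1), no, yes(3)],
    % maybe_is_yes fails on "no", so filter_map drops those elements
    % and keeps the unwrapped values of the others.
    list.filter_map(maybe_is_yes, Maybes, Values),
    io.write(Values, !IO),
    io.nl(!IO).
```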
2010-01-09 05:49:41 +00:00
Zoltan Somogyi
77a6a6c10c Implement several more changes that together speed up compilation time
Estimated hours taken: 16
Branches: main

Implement several more changes that together speed up compilation time
on training_cars_full by 12%, and also improve tools/speedtest -h by 7.2%
and tools/speedtest by 1.6%.

The first change is designed to eliminate the time that the compiler spends
constructing error messages that are then ignored. The working predicates of
prog_io_sym_name used to always return a single result, which either gave
a description of the thing being looked for, or an error message. However,
in many places, the caller did not consider not finding the thing being looked
for to be an error, and thus threw away the error message, keeping only
the "not found" indication. For each predicate with such callers, this diff
provides a parallel predicate that indicates "not found" simply by failing.
This allows us to eliminate the construction of the error message, the
preparation for the construction of the error message (usually by describing
the context), and the construction of the "ok" wrapper.
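
The difference between the two styles can be sketched as follows. This is
an illustration only: parse_name and try_parse_name are hypothetical
stand-ins, not the actual predicates of prog_io_sym_name. The det version
always builds either an "ok" wrapper or an error message; the semidet
"try_" version signals "not found" by failing, so a caller that does not
care about the message never pays for constructing it.

```mercury
:- module try_parse_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module maybe.
:- import_module string.

    % Hypothetical "always return a result" style: the error message
    % is constructed even if the caller then throws it away.
:- pred parse_name(string::in, maybe_error(string)::out) is det.

parse_name(Str, Result) :-
    ( if try_parse_name(Str, Name) then
        Result = ok(Name)
    else
        Result = error("expected an alphabetic name, got: " ++ Str)
    ).

    % Hypothetical "try_" style: no wrapper and no message,
    % "not found" is indicated simply by failure.
:- pred try_parse_name(string::in, string::out) is semidet.

try_parse_name(Str, Name) :-
    Str \= "",
    string.is_all_alpha(Str),
    Name = Str.

main(!IO) :-
    % A caller that wants the diagnostic uses the det version.
    parse_name("42", Result),
    (
        Result = ok(Name),
        io.write_string(Name ++ "\n", !IO)
    ;
        Result = error(Msg),
        io.write_string(Msg ++ "\n", !IO)
    ),
    % A caller that only needs a yes/no answer uses the try_ version.
    ( if try_parse_name("42", _) then
        io.write_string("found\n", !IO)
    else
        io.write_string("not found\n", !IO)
    ).
```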

The second change is to specialize the handling of from_ground_term_construct
scopes in the termination analyzer. To make this easier, I also cleaned up
the infrastructure of the termination analyzer.

The third change is to avoid traversing from_ground_term_construct scopes
in quantification.m when finding the variables in a goal, since termination
analysis no longer needs the information it gathers.

The fourth change is to avoid traversing second and later conjuncts in
conjunctions twice. The first step in handling conjunctions is to call
implicitly_quantify_conj, which builds up a data structure that pairs each
conjunct with the variables that occur free in all the conjuncts following it.
However, after this was done and each conjunct was annotated with its
nonlocals, we used to compute the variables that occur free in the conjunction
as a whole from scratch. This diff changes the code so that we now compute that
set based on the information we gathered earlier, avoiding a redundant
traversal.

The fifth change is to create specialized, lower-arity versions of many of
the predicates in quantification.m. These versions are intended for traversals
that take place after the compiler has replaced lambda expressions with
references to separate procedures. These traversals do not need to pass around
arguments representing the variables occurring free in the (now non-existent)
lambda expressions.

compiler/prog_io_sym_name.m:
	Make the first change described above.

	Change some predicate names to adopt a consistent naming scheme
	in which predicates that do the same job and differ only in how they
	handle errors have names that differ only in a "try_" prefix.

	Add some predicate versions that do common tests on the output
	of the base versions. For example, try_parse_sym_name_and_no_args
	is a version of try_parse_sym_name_and_args that insists on finding
	an empty argument list.

	Remove the unused "error term" argument that we used to need a while
	ago.

	Move some predicate definitions to make their order match the order of
	their declarations.

	Turn a predicate into a function for its caller's convenience.

compiler/term_constr_build.m:
	Make the second change described above by modeling each
	from_ground_term_construct scope as a single unification,
	assigning the total size of the ground term to the variable being
	built.

compiler/term_constr_util.m:
	Put the arguments of some predicates into a more standard order.

compiler/lp_rational.m:
	Change the names of some function symbols to avoid both the use of
	graphic characters that require quoting and clashes with other types.

	Change the names of some predicates to make their purpose clear,
	and to avoid ambiguity.

compiler/quantification.m:
	Make the third, fourth and fifth changes described above.

compiler/*.m:
	Conform to the changes above.
2009-09-08 02:43:41 +00:00
Paul Bone
0bbb6d07fa Support implicit parallelism in the compiler.
Estimated hours taken: 20
Branches: main

Support implicit parallelism in the compiler.

The compiler now uses the deep profiler feedback information to build a
parallel version of a program.

Changes have also been made to the feedback format for candidate parallel
conjunctions and the analysis that recommends opportunities for parallelism to
the compiler.

compiler/implicit_parallelism.m:
	Mark Tannier's implementation as deprecated (it also crashes the
	compiler).
	Introduce new implicit parallelism transformation.
	apply_implicit_parallelism_transformation now returns maybe_error rather
	than maybe so that errors can be described.

compiler/goal_util.m:
	Add a predicate to transform a goal referenced by a goal path within a
	larger goal structure and rebuild that structure.

compiler/mercury_compile.m:
	Conform to changes in implicit_parallelism.m

deep_profiler/mdprof_feedback.m:
	Return a cord of warnings from many predicates; these warnings are used to
	describe cases where parallelism might be profitable but it is not (yet)
	possible to transform the code into parallel code.
	Fix a bug whereby the wrong deep profiling statistic was used to calculate
	the cost of a call.
	Do not attempt to parallelise calls with other goals between them.

mdbcomp/feedback.m:
	Remove the intermediate goals information from the candidate parallel
	conjunctions feedback data.

mdbcomp/program_representation.m:
	Provide an in-order alternative to the goal_path type so that operations on
	the start of the goal path occur in constant time and goal_path itself
	remains usable as a key in arrays because it doesn't use the cord type
	internally.

library/cord.m:
	Added a di/uo mode to cord.foldl_pred.

library/list.m:
	Added list.find_index_of_match/4 to return the index of the first item in
	a list that satisfies the predicate given in the first argument.

library/pqueue.m:
	Added pqueue.length/1

NEWS:
	Announce standard library changes.
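
A minimal sketch of how list.find_index_of_match/4 might be called. The
argument order used here (matching predicate, list, index to assign to the
first element, resulting index) is an assumption based on the description
above, and the module name, is_negative and the sample list are
hypothetical.

```mercury
:- module find_index_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.

    % Hypothetical test predicate passed as the first argument.
:- pred is_negative(int::in) is semidet.

is_negative(N) :-
    N < 0.

main(!IO) :-
    Items = [4, 7, -2, 9],
    % Assumed argument order: Match, List, Index0, Index, where Index0
    % is the index given to the first element of the list.
    ( if list.find_index_of_match(is_negative, Items, 0, Index) then
        io.write_int(Index, !IO),
        io.nl(!IO)
    else
        io.write_string("no match\n", !IO)
    ).
```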
2009-01-30 03:51:45 +00:00
Zoltan Somogyi
5ad9a27793 Speed up the compiler's handling of code that constructs large ground terms
Estimated hours taken: 80
Branches: main

Speed up the compiler's handling of code that constructs large ground terms
by specializing the treatment of such code.

This diff reduces the compilation time for training_cars_full.m from 106.9
seconds to 30.3 seconds on alys, my laptop. The time on tools/speedtest
stays pretty much the same.

compiler/hlds_goal.m:
	Record the classification of from_ground_term scopes as purely
	constructing terms, purely deconstructing them or something other.

	Fix an old potential bug: variables inside the construct_how fields
	of unifications weren't being renamed along with other variables.
	This is a bug if any part of the compiler later looks at those
	variables. (I am not sure whether or not this happens.)

compiler/superhomogenous.m:
	Provisionally mark newly constructed static terms as being
	from_ground_term_construct. Mode checking will either confirm this
	or change the scope kind.

compiler/options.m:
compiler/handle_options.m:
	Add a new option, from_ground_term_threshold, that allows the user to
	set the boundary between ground terms that get scopes and ground terms
	that do not. I plan to experiment with different settings later.

compiler/modes.m:
	Make this classification. For scopes that construct ground terms,
	use a specialized algorithm that avoids quadratic behavior.
	(It does not access the unify_inst_table, which is where the
	factor of N other than the length of the goal list came from.)
	The total size of the instmap_deltas, if printed out, still looks like
	O(N^2) in size, but due to structure sharing it needs only O(N) memory.

	For scopes that construct ground terms, set the determinism information
	so that det_analysis.m doesn't have to traverse such scopes.

	When handling disjunctions, check whether some nonlocals of the
	disjunctions are constructed by from_ground_term_construct scopes.
	For any such nonlocals, set their insts to just ground, throwing away
	the precise information we have about exactly what function symbols
	they and ALL their subterms are bound to. This is a HUGE win, since
	it allows us to avoid spending a lot of time building a huge merge_inst
	table, which later passes of the compiler (e.g. equiv_type_hlds) would
	then have to spend similarly huge times traversing.

	This approach does have a down side. If lots of arms of a disjunction
	bind a nonlocal to a large ground term, but a few bind it to a SMALL
	ground term, a term below the from_ground_term_threshold, this
	optimization won't kick in. That could be one purpose of the new
	option. It isn't documented yet; I will seek feedback about its
	usefulness first.

compiler/modecheck_unify.m:
	Handle the three different kinds of right hand sides separately.
	This yields a small speedup, because now we don't test rhs_vars and
	rhs_functors (the common right hand sides) for a special case
	(goals containing "any" insts) that is applicable only to
	rhs_lambda_goals.

compiler/unique_modes.m:
	Don't traverse scopes that construct ground terms, since modes.m has
	already done everything that needs to be done.

compiler/det_analysis.m:
	Don't traverse scopes that construct ground terms, since modes.m has
	already done the needed work.

compiler/instmap.m:
	Add a new predicate for use by modes.m.

	Many predicate names in this module were quite uninformative; give them
	informative names.

compiler/polymorphism.m:
	If this pass invalidates the from_ground_term_construct invariants,
	then mark the relevant scope as from_ground_term_other.

	Delete two unused access predicates.

compiler/equiv_type_hlds.m:
	Don't traverse scopes that construct ground terms, since modes.m
	ensures that their instmap deltas do not contain typed insts, and
	thus the scope cannot contain types that need to be expanded.

	Convert some predicates to single clauses.

compiler/goal_form.m:
compiler/goal_util.m:
	In predicates that test goals for various properties, don't traverse
	scopes that construct ground terms when the outcome of the test
	is the same for all such scopes.

	Convert some predicates to single clauses.

compiler/simplify.m:
	Do not look for common structs in from_ground_term_construct scopes,
	both because this speeds up the compiler, and because retaining
	references to ground terms is in fact a pessimization, not an
	optimization. This is because (a) those references need to be stored in
	stack slots across calls, and (b) the C code generators ensure that
	the cells representing ground terms will be shared as needed.

	If all arms of a switch are from_ground_term_construct scopes,
	do not merge the instmap_deltas from those arms, since this is
	both time-consuming (even after the other changes in this diff)
	and extremely unlikely to improve the instmap_delta.

	Disable common_struct in from_ground_term_construct scopes,
	since for these scopes, it is actually a pessimization.

	Do not delete from_ground_term_construct scopes, since many
	compiler passes can now use them.

	Do some manual deforestation, break up some large predicates,
	and give better names to some.

compiler/liveness.m:
	Special-case the handling of from_ground_term_construct scopes. This
	allows us to traverse them just once instead of three times, and this
	traversal is simpler and faster than any of the three.

	In some traversals, we were switching on the goal type twice; once
	in e.g. detect_liveness_in_goal_2, and once by calling
	goal_expr_has_subgoals. Eliminate the double switching by merging
	the relevant predicates. (The double-switching structure was easier
	to work with before we had multi-cons-id switches.)

compiler/typecheck.m:
	Move a lookup after a test, so we don't have to do it if the test
	fails.

	Provide a specialized mode for a predicate. This should allow the
	compiler to eliminate an argument and a test in the common case.

	Note a possible chance for a speedup.

compiler/typecheck_info.m:
	Don't apply empty substitutions to the types of a possibly very large
	set of variables.

compiler/quantification.m:
	Don't quantify from_ground_term_construct scopes. They are created
	correctly quantified, and any compiler pass that invalidates that
	quantification also removes the from_ground_term_construct mark.

	Don't apply empty renamings to a possibly very large set of variables.

	Move the code for handling scopes to its own predicate, to avoid
	overwhelming the code that handles other kinds of goals. Even from
	this, factor out the renaming code, since it is needed only for
	some kinds of scopes.

	Make some predicate names better reflect what the predicate does.

compiler/pd_cost.m:
	For from_ground_term_construct scopes, instead of computing their cost
	by adding up the costs of the goals inside, make their cost a constant,
	since binding a variable to a static term takes constant time.

compiler/pd_info.m:
	Add prefixes on field names to avoid ambiguities.

compiler/add_heap_ops.m:
compiler/add_trail_ops.m:
compiler/closure_analysis.m:
compiler/constraint.m:
compiler/cse_detection.m:
compiler/dead_proc_elim.m:
compiler/deep_profiling.m:
compiler/deforest.m:
compiler/delay_construct.m:
compiler/delay_partial_inst.m:
compiler/dep_par_conj.m:
compiler/distance_granularity.m:
compiler/exception_analysis.m:
compiler/follow_code.m:
compiler/follow_vars.m:
compiler/format_call.m:
compiler/granularity.m:
compiler/higher_order.m:
compiler/implicit_parallelism.m:
compiler/inlining.m:
compiler/interval.m:
compiler/lambda.m:
compiler/lco.m:
compiler/live_vars.m:
compiler/loop_inv.m:
compiler/middle_rec.m:
compiler/mode_util.m:
compiler/parallel_to_plain_conj.m:
compiler/saved_vars.m:
compiler/stm_expand.m:
compiler/store_alloc.m:
compiler/stratify.m:
compiler/structure_reuse.direct.detect_garbage.m:
compiler/structure_reuse.lbu.m:
compiler/structure_sharing.analysis.m:
compiler/switch_detection.analysis.m:
compiler/trail_analysis.m:
compiler/term_pass1.m:
compiler/tupling.m:
compiler/unneeded_code.m:
compiler/untupling.m:
compiler/unused_args.m:
	These passes have nothing to do in from_ground_term_construct scopes,
	so don't traverse them.

	In some modules (e.g. dead_proc_elim), some traversals had to be kept.

	In loop_inv.m, replace a code structure that updated accumulators
	with functions (which prevented the natural use of state variables),
	that in lots of places reconstructed the term it had just
	deconstructed, and obscured the identical handling of different kinds
	of goals, with a structure based on predicates, state variables and
	shared code for different goal types where possible.

	In store_alloc.m, avoid some double switching on the same value.

	In stratify.m, unneeded_code.m and unused_args.m, rename predicates
	to avoid ambiguities.

compiler/goal_path.m:
compiler/goal_util.m:
compiler/implementation_defined_literals.m:
compiler/intermode.m:
compiler/mark_static_terms.m:
compiler/ml_code_gen.m:
compiler/mode_ordering.m:
compiler/ordering_mode_constraints.m:
compiler/prop_mode_constraints.m:
compiler/purity.m:
compiler/rbmm.actual_region_arguments.m:
compiler/rbmm.add_rbmm_goal_infos.m:
compiler/rbmm.condition_renaming.m:
compiler/rbmm.execution_path.m:
compiler/rbmm.region_transformation.m:
compiler/structure_reuse.direct.choose_reuse.m:
compiler/structure_reuse.indirect.m:
compiler/structure_reuse.lfu.m:
compiler/structure_reuse.versions.m:
compiler/term_const_build.m:
compiler/term_traversal.m:
compiler/unused_imports.m:
	Mark places where we cannot (yet) special case
	from_ground_term_construct scopes.

	In structure_reuse.lfu.m, turn nested if-then-elses into a switch in.

compiler/size_prof.m:
	Turn from_ground_term_construct scopes into from_ground_term_other
	scopes, since in term size profiling grades, we need to attach sizes to
	terms.

	Give predicates better names.

compiler/*.m:
	Minor changes to conform to the changes above.

compiler/make_hlds_passes.m:
	With -S, print statistics after the third pass over items, since
	this is the time-consuming one.

compiler/mercury_compile.m:
	Conform to the new names of some predicates.

	When declining to output a HLDS dump because it would be identical to
	the previous dump, don't confuse the user either by being silent about
	the decision, or by leaving an old dump lying around that could be
	mistaken for a new one.

tools/binary:
tools/binary_step:
	Bring these tools up to date.

compiler/Mmakefile:
	Add an int3s target for use by the new code in the tools. The
	Mmakefiles in the other directories with Mercury code already have
	such a target.

compiler/notes/allocation.html:
	Fix an out-of-date reference.

tests/debugger/polymorphic_ground_term.{m,inp,exp}:
	New test case to check whether liveness.m handles typeinfo liveness
	of ground terms correctly.

tests/debugger/Mmakefile:
	Enable the new test case.

tests/debugger/polymorphic_output.{m,exp}:
	Fix tab/space mixup.
2008-12-23 01:38:03 +00:00
Paul Bone
6a6e81b9e3 Add a new structure to the feedback data type,
Estimated hours taken: 2
Branches: main

Add a new structure to the feedback data type,
candidate_parallel_conjunctions. This provides feedback information about
conjunctions that may be parallelised.

This data is not yet collected by the mdprof_feedback tool, or used by the
compiler.

Make changes to the feedback API and on-disk format.  This makes it easier to
query the feedback_info structure for feedback data.

mdbcomp/feedback.m:
	Introduce candidate_parallel_conjunctions feedback information.
	Remove type arguments from feedback predicates.
	Move feedback_type out of this module's interface.
	Use a partially instantiated feedback_data data structure to retrieve
	feedback data. A caller of get_feedback_data no longer needs to use a
	switch to check that they received the correct data.
	Remove keys from the on-disk format, removing the risk that some data
	could be stored against an incorrect key.
	Increment the feedback data file version number.

compiler/implicit_parallelism.m:
	Conform to changes in mdbcomp/feedback.m.

compiler/options.m:
	Added the --implicit-parallelisation-old compiler option; this enables
	the old implicit parallelism implementation.

deep_profiler/mdprof_feedback.m:
	Added options for collecting the candidate_parallel_conjunctions feedback
	data.
2008-09-30 02:30:51 +00:00
Paul Bone
01d145ab8f Introduce a feedback system that allows analysis tools to feed information
Estimated hours taken: 8
Branches: main

Introduce a feedback system that allows analysis tools to feed information
back into the compiler.  This can be used with the deep profiler to improve
many optimizations.  Tools update information in the feedback file rather than
clobbering existing un-related information.

Modify the implicit parallelism work to make use of the new feedback system.
mdprof_feedback updates a feedback file and in the future will be able to
collect more information from the deep profiler.

mdbcomp/feedback.m:
	Created a new module for the feedback system. Types representing feedback
	information, and predicates for reading and writing feedback files and
	for manipulating feedback information, are defined here.

mdbcomp/mdbcomp.m:
	Updated to include mdbcomp/feedback.m in this library.

mdbcomp/program_representation.m:
	Created a new type to describe a call.  This is used by the current
	implicit parallelism implementation.

deep_profiler/mdprof_feedback.m:
	Updated to use the new feedback system.  The old feedback file code has
	been removed.
	The --program-name option has been added; a program name must be provided
	so that it can be included in the header of the feedback file.
	Conform to changes in mdbcomp/program_representation.m

compiler/globals.m:
	Added feedback data to globals structure.
	Added predicates to get and set the feedback information stored in the
	globals structure.
	Modified predicates that create the globals structure.

compiler/handle_options.m:
	Set feedback information in globals structure when it is created in
	postprocess_options.
	Read feedback information in from file in check_option_values.
	Code added to postprocess_options2 to check the usage of the
	--implicit-parallelism option.

compiler/implicit_parallelism.m:
	This module no longer reads the feedback file itself; this code has
	been removed, as has the IO state.
	Information from the feedback state is retrieved and used to control
	implicit parallelism.

compiler/mercury_compile.m:
	No longer checks options for implicit parallelization; this is now done in
	compiler/handle_options.m.
	Conform to changes in implicit_parallelism.m

deep_profiler/Mmakefile:
slice/Mmakefile:
	Modified to include mdbcomp/feedback.m for compilation in this directory.
2008-07-23 23:20:35 +00:00
Zoltan Somogyi
b000cb322e Provide compiler support for Software Transactional Memory through the new
Estimated hours taken: 80 by zs, and lots more by lmika
Branches: main

Provide compiler support for Software Transactional Memory through the new
atomic goal. This work was done by Leon Mika; I merely brought it up to date,
resolved conflicts, and cleaned up a few things. There are still several
aspects that are as yet incomplete.

library/ops.m:
	Add the operators needed for the syntax of atomic scopes.

library/stm_builtin.m:
	Add the builtin operations needed for the implementation of atomic
	goals.

compiler/hlds_goal.m:
	Add a new HLDS goal type, which represents an atomic goal and its
	possible fallbacks (in case an earlier goal throws an exception).

	Rename the predicate goal_is_atomic as goal_expr_has_subgoals,
	since now its old name would be misleading.

compiler/prog_data.m:
compiler/prog_item.m:
	Add a parse tree representation of the new kind of goal.

compiler/prog_io_goal.m:
	Parse the new kind of goal.

compiler/add_clause.m:
	Translate atomic goals from parse tree form to HLDS.

compiler/typecheck.m:
compiler/typecheck_errors.m:
	Do type checking of atomic goals.

compiler/modes.m:
	Do mode checking of atomic goals, and determine whether they are nested
	or not.

compiler/unique_modes.m:
	Do unique mode checking of atomic goals.

compiler/stm_expand.m:
	New module to expand atomic goals into sequences of simpler goals.

library/stm_builtin.m:
	Add the primitives needed by the transformation.

	Improve the existing debugging support.

mdbcomp/prim_data.m:
	Add utility functions to allow stm_expand.m to refer to modules in the
	library.

mdbcomp/program_representation.m:
	Expand the goal_path type to allow the representation of components of
	atomic goals.

compiler/notes/compiler_design.html:
	Document the new module.

compiler/transform_hlds.m:
	Include the new module in the compiler.

compiler/mercury_compile.m:
	Invoke the STM transformation.

compiler/hlds_module.m:
	Add an auxiliary counter used by the STM transformation.

compiler/hlds_pred.m:
	Add a new predicate origin: the STM transformation.

compiler/modules.m:
	Import the STM builtin module automatically if the module contains any
	atomic goals.

compiler/assertion.m:
compiler/bytecode_gen.m:
compiler/clause_to_proc.m:
compiler/code_gen.m:
compiler/code_info.m:
compiler/code_util.m:
compiler/constraint.m:
compiler/cse_detection.m:
compiler/deep_profiling.m:
compiler/delay_construct.m:
compiler/delay_partial_inst.m:
compiler/dep_par_conj.m:
compiler/dependency_graph.m:
compiler/det_analysis.m:
compiler/det_report.m:
compiler/distance_granularity.m:
compiler/equiv_type_hlds.m:
compiler/erl_code_gen.m:
compiler/exception_analysis.m:
compiler/follow_code.m:
compiler/format_call.m:
compiler/goal_form.m:
compiler/goal_path.m:
compiler/goal_util.m:
compiler/granularity.m:
compiler/hlds_out.m:
compiler/implicit_parallelism.m:
compiler/inlining.m:
compiler/intermod.m:
compiler/lambda.m:
compiler/layout_out.m:
compiler/lco.m:
compiler/lookup_switch.m:
compiler/make_hlds_warn.m:
compiler/mark_static_terms.m:
compiler/mercury_to_mercury.m:
compiler/middle_rec.m:
compiler/ml_code_gen.m:
compiler/mode_constraint_robdd.m:
compiler/mode_constraints.m:
compiler/mode_errors.m:
compiler/mode_info.m:
compiler/mode_util.m:
compiler/ordering_mode_constraints.m:
compiler/pd_cost.m:
compiler/pd_util.m:
compiler/polymorphism.m:
compiler/post_typecheck.m:
compiler/prog_rep.m:
compiler/prog_type.m:
compiler/prop_mode_constraints.m:
compiler/rbmm.actual_region_arguments.m:
compiler/rbmm.add_rbmm_goal_info.m:
compiler/rbmm.condition_renaming.m:
compiler/rbmm.execution_path.m:
compiler/rbmm.points_to_analysis.m:
compiler/rbmm.region_transformation.m:
compiler/saved_vars.m:
compiler/simplify.m:
compiler/size_prog.m:
compiler/smm_common.m:
compiler/structure_reuse.direct.choose_reuse.m:
compiler/structure_reuse.direct.detect_garbage.m:
compiler/structure_reuse.indirect.m:
compiler/structure_reuse.lbu.m:
compiler/structure_reuse.lfu.m:
compiler/structure_reuse.versions.m:
compiler/structure_sharing.analysis.m:
compiler/switch_detection.m:
compiler/unused_imports.m:
	Conform to the changes above. Mostly this means handling the new
	kind of goal.

compiler/add_heap_ops.m:
compiler/add_trail_ops.m:
compiler/build_mode_constraints.m:
compiler/closure_analysis.m:
compiler/dead_proc_elim.m:
compiler/deforest.m:
compiler/follow_vars.m:
compiler/higher_order.m:
compiler/live_vars.m:
compiler/liveness.m:
compiler/loop_inv.m:
compiler/module_qual.m:
compiler/prog_util.m:
compiler/purity.m:
compiler/quantification.m:
compiler/store_alloc.m:
compiler/stratify.m:
compiler/tabling_analysis.m:
compiler/term_constr_build.m:
compiler/term_pass1.m:
compiler/term_traversal.m:
compiler/trailing_analysis.m:
	Conform to the changes above. Mostly this means handling the new
	kind of goal.

	Switch syntax from clauses to disj.

runtime/mercury_stm.[ch]:
	Implement the primitives needed by the STM transformation.

	Add more debugging support to the existing primitives.

library/term.m:
	Generalize get_term_context to work on terms of all kinds.
2008-02-27 07:23:57 +00:00
Zoltan Somogyi
cc88711d63 Implement true multi-cons_id arm switches, i.e. switches in which we associate
Estimated hours taken: 40
Branches: main

Implement true multi-cons_id arm switches, i.e. switches in which we associate
more than one cons_id with a switch arm. Previously, for switches like this:

	(
		X = a,
		goal1
	;
		( X = b
		; X = c
		),
		goal2
	)

we duplicated goal2. With this diff, goal2 won't be duplicated. We still
duplicate goals when that is necessary, i.e. in cases in which the inner
disjunction contains code other than a functor test on the switched-on var,
like this:

	(
		X = a,
		goal1
	;
		(
			X = b,
			goalb
		;
			X = c,
			goalc
		),
		goal2
	)

For now, true multi-cons_id arm switches are supported only by the LLDS
backend. Supporting them on the MLDS backend is trickier, because some MLDS
target languages (e.g. Java) don't support the concept at all. So when
compiling to MLDS, we still duplicate the goal in switch detection (although
we could delay the duplication to just before code generation, if we wanted.)

compiler/options.m:
	Add an internal option that tells switch detection whether to look for
	multi-cons_id switch arms.

compiler/handle_options.m:
	Set this option based on the back end.

	Add a version of the "trans" dump level that doesn't print unification
	details.

compiler/hlds_goal.m:
	Extend the representation of switch cases to allow more than one
	cons_id for a switch arm.

	Add a type for representing switches that also includes tag information
	(for use by the backends).

compiler/hlds_data.m:
	For du types, record whether it is possible to speed up tests for one
	cons_id (e.g. cons) by testing for the other (nil) and negating the
	result. Recording this information once is faster than having
	unify_gen.m trying to compute it from scratch for every single
	tag test.

	Add a type for representing a cons_id together with its tag.

compiler/hlds_out.m:
	Print out the cheaper_tag_test information for types, and possibly
	several cons_ids for each switch arm.

	Add some utility predicates for describing switch arms in terms of
	which cons_ids they are for.

	Replace some booleans with purpose-specific types.

	Make hlds_out honor its documentation, and not print out detailed
	information about unifications (e.g. uniqueness and static allocation)
	unless the right character ('u') is present in the control string.

compiler/add_type.m:
	Fill in the information about cheaper tag tests when adding a du type.

compiler/switch_detection.m:
	Extend the switch detection algorithm to detect multi-cons_id switch
	arms.

	When entering a switch arm, update the instmap to reflect that the
	switched-on variable can now be bound only to the cons_ids that this
	switch arm is for. We now need to do this, because if the arm contains
	another switch on the same variable, computing the can_fail field of
	that switch correctly requires us to know this information.
	(Obviously, an arm for a single cons_id is unlikely to have a switch on
	the same variable, and for arms for several cons_ids, we previously
	duplicated the arm and left the unification with the cons_id in each
	copy, and this unification allowed the correct handling of any later
	switches. However, the code of a multi-cons_id switch arm obviously
	cannot have a unification with each cons_id in it, which is why
	we now need to get the binding information from the switch itself.)

	Replace some booleans with purpose-specific types, and give some
	predicates better names.

compiler/instmap.m:
	Provide predicates for recording that a switched-on variable has
	one of several given cons_ids, for use at the starts of switch arms.

	Give some predicates better names.

compiler/modes.m:
	Provide predicates for updating the mode_info at the start of a
	multi-cons_id switch arm.

compiler/det_report.m:
	Handle multi-cons_id switch arms.

	Update the instmap when entering each switch arm, since this is needed
	to provide good (i.e. non-misleading) error messages when one switch on
	a variable exists inside another switch on the same variable.

	Since updating the instmap requires updating the module_info (since
	the new inst may require a new entry in an inst table), thread the
	det_info through as updateable state.

	Replace some multi-clause predicate definitions with single clauses,
	to make it easier to print the arguments in mdb.

	Fix some misleading variable names.

compiler/det_analysis.m:
	Update the instmap when entering each switch arm and thread the
	det_info through as updateable state, since the predicates we call
	in det_report.m require this.

compiler/det_util.m:
	Handle multi-cons_id switch arms.

	Rationalize the argument order of some access predicates.

compiler/switch_util.m:
	Change the parts of this module that deal with string and tag switches
	to optionally convert each arm to an arbitrary representation of the
	arm. In the LLDS backend, the conversion process generates code for
	the arm, and the arm's representation is the label at the start of
	this code. This way, we can duplicate the label without duplicating
	the code.

	Add a new part of this module that associates each cons_id with its
	tag, and (during the same pass) checks whether all the cons_ids are
	integers, and if so, what the min and max of these integers are (needed
	for dense switches). This scan is needed because the old way of making
	this test had single-cons_id switch arms as one of its basic
	assumptions, and doing it while adding tags to each case reduces
	the number of traversals required.

	Give better names to some predicates.

compiler/switch_case.m:
	New module to handle the tasks associated with managing multi-cons_id
	switch arms, including representing them for switch_util.m.

compiler/ll_backend.m:
	Include the new module.

compiler/notes/compiler_design.html:
	Note the new module.

compiler/llds.m:
	Change the computed goto instruction to take a list of maybe labels
	instead of a list of labels, with any missing labels meaning "not
	reached".

compiler/string_switch.m:
compiler/tag_switch.m:
	Reorganize the way these modules work. We can't generate the code of
	each arm in place anymore, since it is now possible for more than one
	cons_id to call for the execution of the same code. Instead, in
	string_switch.m, we generate the codes of all the arms all at once,
	and construct the hash index afterwards. (This approach simplifies
	the code significantly.)

	In tag switches (unlike string switches), we can get locality benefits
	if the code testing for a cons_id is close to the code for that
	cons_id, so we still try to put them next to each other when such
	a locality benefit is available.

	In both modules, the new approach uses a utility predicate in
	switch_case.m to actually generate the code of each switch arm,
	eliminating several copies of the same code in the old versions of these
	modules.

	In tag_switch.m, don't create a local label that simply jumps to the
	code address do_not_reached. Previously, we had to do this for
	positions in jump tables that corresponded to cons_ids that the switch
	variable could not be bound to. With the change to llds.m, we now
	simply generate a "no" instead.

compiler/lookup_switch.m:
	Get the info about int switch limits from our caller; don't compute it
	here.

	Give some variables better names.

compiler/dense_switch.m:
	Generate the codes of the cases all at once, then assemble the table,
	duplicating the labels as needed. This separation of concerns allows
	significant simplifications.

	Pack up all the information shared between the predicate that detects
	whether a dense switch is appropriate and the predicate that actually
	generates the dense switch.

	Move some utility predicates to switch_util.

compiler/switch_gen.m:
	Delete the code for tagging cons_ids, since that functionality is now
	in switch_util.m.

	The old version of this module could call the code generator to produce
	(i.e. materialize) the switched-on variable repeatedly. We now produce
	the variable once, and do the switch on the resulting rval.

compiler/unify_gen.m:
	Use the information about cheaper tag tests in the type constructor's
	entry in the HLDS type table, instead of trying to recompute it
	every time.

	Provide the predicates switch_gen.m now needs to perform tag tests
	on rvals, as opposed to variables, and against possibly more than one
	cons_id.

	Allow the caller to provide the tag corresponding to the cons_id(s)
	in tag tests, since when we are generating code for switches, the
	required computations have already been done.

	Factor out some code to make all this possible.

	Give better names to some predicates.

compiler/code_info.m:
	Provide some utility predicates for the new code in other modules.
	Give better names to some existing predicates.

compiler/hlds_code_util.m:
	Rationalize the argument order of some predicates.

	Replace some multi-clause predicate definitions with single clauses,
	to make it easier to print the arguments in mdb.

compiler/accumulator.m:
compiler/add_heap_ops.m:
compiler/add_pragma.m:
compiler/add_trail_ops.m:
compiler/assertion.m:
compiler/build_mode_constraints.m:
compiler/check_typeclass.m:
compiler/closure_analysis.m:
compiler/code_util.m:
compiler/constraint.m:
compiler/cse_detection.m:
compiler/dead_proc_elim.m:
compiler/deep_profiling.m:
compiler/deforest.m:
compiler/delay_construct.m:
compiler/delay_partial_inst.m:
compiler/dep_par_conj.m:
compiler/distance_granularity.m:
compiler/dupproc.m:
compiler/equiv_type_hlds.m:
compiler/erl_code_gen.m:
compiler/exception_analysis.m:
compiler/export.m:
compiler/follow_code.m:
compiler/follow_vars.m:
compiler/foreign.m:
compiler/format_call.m:
compiler/frameopt.m:
compiler/goal_form.m:
compiler/goal_path.m:
compiler/goal_util.m:
compiler/granularity.m:
compiler/hhf.m:
compiler/higher_order.m:
compiler/implicit_parallelism.m:
compiler/inlining.m:
compiler/inst_check.m:
compiler/intermod.m:
compiler/interval.m:
compiler/lambda.m:
compiler/lco.m:
compiler/live_vars.m:
compiler/livemap.m:
compiler/liveness.m:
compiler/llds_out.m:
compiler/llds_to_x86_64.m:
compiler/loop_inv.m:
compiler/make_hlds_warn.m:
compiler/mark_static_terms.m:
compiler/middle_rec.m:
compiler/ml_tag_switch.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/mode_constraints.m:
compiler/mode_errors.m:
compiler/mode_util.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/pd_cost.m:
compiler/pd_info.m:
compiler/pd_util.m:
compiler/peephole.m:
compiler/polymorphism.m:
compiler/post_term_analysis.m:
compiler/post_typecheck.m:
compiler/purity.m:
compiler/quantification.m:
compiler/rbmm.actual_region_arguments.m:
compiler/rbmm.add_rbmm_goal_infos.m:
compiler/rbmm.condition_renaming.m:
compiler/rbmm.execution_paths.m:
compiler/rbmm.points_to_analysis.m:
compiler/rbmm.region_transformation.m:
compiler/recompilation.usage.m:
compiler/saved_vars.m:
compiler/simplify.m:
compiler/size_prof.m:
compiler/ssdebug.m:
compiler/store_alloc.m:
compiler/stratify.m:
compiler/structure_reuse.direct.choose_reuse.m:
compiler/structure_reuse.indirect.m:
compiler/structure_reuse.lbu.m:
compiler/structure_reuse.lfu.m:
compiler/structure_reuse.versions.m:
compiler/structure_sharing.analysis.m:
compiler/table_gen.m:
compiler/tabling_analysis.m:
compiler/term_constr_build.m:
compiler/term_norm.m:
compiler/term_pass1.m:
compiler/term_traversal.m:
compiler/trailing_analysis.m:
compiler/transform_llds.m:
compiler/tupling.m:
compiler/type_ctor_info.m:
compiler/type_util.m:
compiler/unify_proc.m:
compiler/unique_modes.m:
compiler/unneeded_code.m:
compiler/untupling.m:
compiler/unused_args.m:
compiler/unused_imports.m:
compiler/xml_documentation.m:
	Make the changes necessary to conform to the changes above, principally
	to handle multi-cons_id arm switches.

compiler/ml_string_switch.m:
	Make the changes necessary to conform to the changes above, principally
	to handle multi-cons_id arm switches.

	Give some predicates better names.

compiler/dependency_graph.m:
	Make the changes necessary to conform to the changes above, principally
	to handle multi-cons_id arm switches. Change the order of arguments
	of some predicates to make this easier.

compiler/bytecode.m:
compiler/bytecode_data.m:
compiler/bytecode_gen.m:
	Make the changes necessary to conform to the changes above, principally
	to handle multi-cons_id arm switches. (The bytecode interpreter
	has not been updated.)

compiler/prog_rep.m:
mdbcomp/program_representation.m:
	Change the byte sequence representation of goals to allow switch arms
	with more than one cons_id. compiler/prog_rep.m now writes out the
	updated representation, while mdbcomp/program_representation.m reads in
	the updated representation.

deep_profiler/mdprof_procrep.m:
	Conform to the updated program representation.

tools/binary:
	Fix a bug: if the -D option was given, the stage 2 directory wasn't
	being initialized.

	Abort if users try to give that option more than once.

compiler/Mercury.options:
	Work around bug #32 in Mantis.
2007-12-30 08:24:23 +00:00
Zoltan Somogyi
672f77c4ec Add a new compiler option, --inform-ite-instead-of-switch.
Estimated hours taken: 20
Branches: main

Add a new compiler option, --inform-ite-instead-of-switch. If this is enabled,
the compiler will generate informational messages about if-then-elses that
it thinks should be converted to switches for the sake of program reliability.

Act on the output generated by this option.

compiler/simplify.m:
	Implement the new option.

	Fix an old bug that could cause us to generate warnings about code
	that was OK in one duplicated copy but not in another (where a switch
	arm's code is duplicated due to the case being selected for more than
	one cons_id).

compiler/options.m:
	Add the new option.

	Add a way to test for the bug fix in simplify.

doc/user_guide.texi:
	Document the new option.

NEWS:
	Mention the new option.

library/*.m:
mdbcomp/*.m:
browser/*.m:
compiler/*.m:
deep_profiler/*.m:
	Convert if-then-elses to switches at most of the sites suggested by the
	new option. At the remaining sites, switching to switches would have
	nontrivial downsides. This typically happens when the switched-on type
	has many functors, and we treat one or two specially (e.g. cons/2 in
	the cons_id type).

	Perform misc cleanups in the vicinity of the if-then-else to switch
	conversions.

	In a few cases, improve the error messages generated.

compiler/accumulator.m:
compiler/hlds_goal.m:
	(Rename and) move insts for particular kinds of goal from
	accumulator.m to hlds_goal.m, to allow them to be used in other
	modules. Using these insts allowed us to eliminate some if-then-elses
	entirely.

compiler/exprn_aux.m:
	Instead of fixing some if-then-elses, delete the predicates containing
	them, since they aren't used, and (as pointed out by the new option)
	would need considerable other fixing if they were ever needed again.

compiler/lp_rational.m:
	Add prefixes to the names of the function symbols on some types,
	since without those prefixes, it was hard to figure out what type
	the switch corresponding to an old if-then-else was switching on.

tests/invalid/reserve_tag.err_exp:
	Expect a new, improved error message.
2007-11-23 07:36:01 +00:00
Zoltan Somogyi
168f531867 Add new fields to the goal_info structure for region based memory management.
Estimated hours taken: 4
Branches: main

Add new fields to the goal_info structure for region based memory management.
The fields are currently unused, but (a) Quan will add the code to fill them
in, and then (b) I will modify the code generator to use the filled in fields.

compiler/hlds_goal.m:
	Make the change described above.

	Group all the procedures that access goal_info components together.
	Some of the getters were predicates while some were functions, so
	this diff changes them all to be functions. (The setters remain
	predicates.)

compiler/*.m:
	Trivial changes to conform to the change in hlds_goal.m.

	In simplify.m, break up a huge (800+ line) predicate into smaller
	pieces.
2007-08-07 07:10:09 +00:00
Julien Fischer
9958d3883c Fix some formatting.
Estimated hours taken: 0
Branches: main

Fix some formatting.

compiler/distance_granularity.m:
compiler/exception_analysis.m:
compiler/implicit_parallelism.m:
compiler/inst_graph.m:
compiler/interval.m:
compiler/layout_out.m:
compiler/lp_rational.m:
compiler/make.program_target.m:
compiler/modules.m:
compiler/prog_data.m:
compiler/purity.m:
compiler/recompilation.check.m:
compiler/term_constr_data.m:
compiler/term_util.m:
compiler/xml_documentation.m:
deep_profiler/mdprof_cgi.m:
library/pqueue.m:
profiler/output.m:
	Fix the positioning of commas.

	s/[_|_]/[_ | _]/ in a spot.
2007-05-23 10:09:24 +00:00
Zoltan Somogyi
b56885be93 Fix a bug that caused bootchecks with --optimize-constructor-last-call to fail.
Estimated hours taken: 12
Branches: main

Fix a bug that caused bootchecks with --optimize-constructor-last-call to fail.

The problem was not in lco.m, but in follow_code.m. In some cases
(specifically, the LCMC version of insert_2 in sparse_bitset.m),
follow_code.m moved an impure goal (store_at_ref) into the arms of an
if-then-else without marking those arms, or the if-then-else, as impure.
The next pass, simplify, then deleted the entire if-then-else, since it
had no outputs. (The store_at_ref that originally appeared after the
if-then-else was the only consumer of its only output.)

The fix is to get follow_code.m to make branched control structures such as
if-then-elses, as well as their arms, semipure or impure if a goal being moved
into them is semipure or impure, or if they came from a semipure or impure
conjunction.

Improve the optimization of the LCMC version of sparse_bitset.insert_2, which
had a foreign_proc invocation of bits_per_int in it: replace such invocations
with a unification of the bits_per_int constant if not cross compiling.

Add a new option, --optimize-constructor-last-call-null. When set, LCMC will
assign NULLs to the fields not yet filled in, to prevent any junk that happens
to be there from being followed by the garbage collector's mark phase.

This diff also makes several other changes that helped me to track down
the bug above.

compiler/follow_code.m:
	Make the fix described above.

	Delete all the provisions for --prev-code; it won't be implemented.

	Don't export a predicate that is no longer used anywhere else.

compiler/simplify.m:
	Make the optimization described above.

compiler/lco.m:
	Make sure that the LCMC specialized procedure is a predicate, not a
	function: having a function with the mode LCMC_insert_2(in, in) = in
	looks wrong.

	To avoid name collisions when a function and a predicate with the same
	name and arity have LCMC applied to them, include the predicate vs
	function status of the original procedure in the name of the
	new procedure.

	Update the sym_name of calls to LCMC variants, not just the pred_id,
	because without that, the HLDS dump looks misleading.

compiler/pred_table.m:
	Don't have optimizations like LCMC insert new predicates at the front
	of the list of predicates. Maintain the list of predicates in the
	module as a two part list, to allow efficient addition of new pred_ids
	at the (logical) end without using O(N^2) algorithms. Having predicates
	in chronological order makes it easier to look at HLDS dumps and
	.c files.

compiler/hlds_module.m:
	Make module_info_predids return a module_info that is physically
	updated though logically unchanged.

compiler/options.m:
	Add --optimize-constructor-last-call-null.

	Make the options --dump-hlds-pred-id, --debug-opt-pred-id and
	--debug-opt-pred-name into accumulating options, to allow the user
	to specify more than one predicate to be dumped (e.g. insert_2 and
	its LCMC variant).

	Delete --prev-code.

doc/user_guide.texi:
	Document the changes in options.m.

compiler/code_info.m:
	Record the value of --optimize-constructor-last-call-null in the
	code_info, to avoid lookup at every cell construction.

compiler/unify_gen.m:
compiler/var_locn.m:
	When deciding whether a cell can be static or not, make sure that
	we never make static a cell that has some fields initialized with
	dummy zeros, to be filled in for real later.

compiler/hlds_out.m:
	For goals that are semipure or impure, note this fact. This info was
	lost when I changed the representation of impurity from markers to a
	field.

mdbcomp/prim_data.m:
	Rename some ambiguous function symbols.

compiler/intermod.m:
compiler/trans_opt.m:
	Rename the main predicates (and some function symbols) of these modules
	to avoid ambiguity and to make them more expressive.

compiler/llds.m:
	Don't print line numbers for foreign_code fragments if the user has
	specified --no-line-numbers.

compiler/make.dependencies.m:
compiler/mercury_to_mercury.m:
compiler/recompilation.usage.m:
	Don't use io.write to write out information to files we may need to
	parse again, because this is vulnerable to changes to the names of
	function symbols (e.g. the one to mdbcomp/prim_data.m).

	The compiler still contains some uses of io.write, but they are
	for debugging. I added an item to the todo list of the one exception,
	ilasm.m.

compiler/recompilation.m:
	Rename a misleading function symbol name.

compiler/parse_tree.m:
	Don't import recompilation.m here. It is not needed (all the components
	of parse_tree that need recompilation.m already import it themselves),
	and deleting the import avoids recompiling almost everything when
	recompilation.m changes.

compiler/*.m:
	Conform to the changes above.

compiler/*.m:
browser/*.m:
slice/*.m:
	Conform to the change to mdbcomp.

library/sparse_bitset.m:
	Use some better variable names.
2007-01-19 07:05:06 +00:00
Jerome Tannier
7651d83206 This change adds two new passes to the compiler.
Estimated hours taken: 80
Branches: main

This change adds two new passes to the compiler. The first one,
implicit_parallelism, uses deep profiling feedback information, generated by
mdprof_feedback, to introduce parallel conjunctions where it could be
worthwhile. It deals with both independent and dependent parallelism.

The second new pass, distance_granularity, applies a transformation that
controls the granularity of parallelism for recursive procedures using the
distance metric.

This change also fixes a bug in mdprof_feedback regarding the construction of
the list of CSSs.

compiler/implicit_parallelism.m:
    New module which uses the profiling feedback file generated by
    mdprof_feedback to introduce parallel conjunctions where they could be
    useful.

compiler/distance_granularity.m:
    New module. A program transformation that implements granularity control
    of parallel execution using the distance metric.

compiler/dep_par_conj.m:
    Moved find_shared_variables into the interface (needed for
    implicit_parallelism.m).

compiler/goal_util.m:
    Add two new predicates: flatten_conj and create_conj.

compiler/hhf.m:
    Delete flatten_conj and use the one in goal_util instead.

compiler/hlds_pred.m:
    Add a predicate to set the arity of a predicate (needed for
    distance_granularity).

compiler/mercury_compile.m:
    Add the calls to apply implicit parallelism and to control granularity
    using the distance metric.

compiler/options.m:
    Add implicit-parallelism, feedback-file and distance-granularity options.

compiler/pred_table.m:
    Add a predicate to get the next pred_id available (needed for
    distance_granularity).

compiler/prog_util.m:
    Extend the predicate make_pred_name and the type new_pred_id for
    creating a predicate name for distance_granularity.

compiler/transform_hlds.m:
    Include implicit_parallelism and distance_granularity.

deep_profiler/mdprof_feedback.m:
    Rename distribution to measure.
    Add handling of dump_stages and dump_options.
    Insert elements into the list of CSSs in the correct order.

deep_profiler/dump.m:
    Add "all" option to dump everything out of the Deep.data file.

doc/user_guide.texi:
    Add the following options: --distance-granularity, --implicit-parallelism and
    --feedback-file.

tests/par_conj/Mercury.options:
tests/par_conj/dg_fib.{m,exp}:
tests/par_conj/dg_fib_func.{m,exp}:
    Add two test cases for the distance_granularity module: dg_fib and
    dg_fib_func. As things are, we do not check whether the granularity
    control transformation using the distance metric is applied correctly or
    not. We only check the output of these test cases.
2007-01-13 12:23:18 +00:00