Paul Bone c877dceb2b Refactor profiler feedback code for implicit parallelism.
This change mostly refactors the goal representation used to feed implicit
parallelism information back to the compiler.  The goal_rep datatype is now
used rather than the much simpler datatype used previously.  (goal_rep is the
same type that is used by the declarative debugger.)

This makes it easier for the compiler to match HLDS goals against goals from
the implicit parallelism analysis and will probably help in the future if the
analysis wants the compiler to re-order goals.

It also makes it easier to pretty-print the feedback sent to the compiler in
more detail.

mdbcomp/feedback.m:
    As above, redefine pard_goal as a type alias to
    goal_rep(pard_goal_annotation).

    Add a new type, candidate_par_conjunctions_proc.  It represents candidate
    parallelisations within a procedure along with shared information for the
    procedure.

    Add a new predicate, convert_candidate_par_conjunctions_proc.

    Increment the feedback file format version number.

mdbcomp/program_representation.m:
    XXX: See about refactoring bytecode input/output into one place.

    Add a new predicate transform_goal_rep for transforming a goal_rep
    structure from one arbitrary annotation type to another.

    Add extra predicates to aid in converting a prog_rep structure to and from
    bytecode.  This includes cut_byte/2 and can_fail_byte/2.

deep_profiler/program_representation_utils.m:
    Export print_goal_to_strings/4 so that it can be used when printing the
    feedback file reports.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Conform to changes in mdbcomp/feedback.m

    Wrap some lines at 76 characters.

    Improve explanations in comments.

    Use the goal_rep pretty-printer to print the candidate parallel
    conjunctions feedback report.

deep_profiler/mdprof_feedback.m:
    Conform to changes in deep_profiler/mdprof_fb.automatic_parallelism.m

deep_profiler/program_representation_utils.m:
    Modify print_goal_to_strings to print determinisms and annotations on
    separate lines before each goal.

deep_profiler/display_report.m:
    Modify pretty printing of coverage annotations so that they make sense
    after modifying print_goal_to_strings/4.

compiler/implicit_parallelism.m:
    Refactor goal matching code that compares HLDS goals to feedback goals.
    Goal matching is now more accurate and can more easily support goal
    re-ordering when parallelising code (this is not implemented yet).

    The code that builds parallel conjunctions has also been refactored.

    This pass now generates warnings if it is not able to parallelise
    a candidate parallel conjunction in the feedback data.

    Insert deeper and later parallelisations before shallower or earlier ones.
    This makes it easier to continue to parallelise a procedure as its goal
    tree changes due to parallelisation.

    Silently ignore duplicate candidate parallel conjunctions.

    Refuse to parallelise a procedure that has been parallelised explicitly.

compiler/prog_rep.m:
    Refactor the hlds_goal to bytecode transformation.  This transformation
    now goes via goal_rep.  We use the hlds_goal to goal_rep portion of this
    transformation in compiler/implicit_parallelism.m.

    Add variable names prefixed with DCG_ to the list of those introduced by
    the compiler.

compiler/goal_util.m:
    Modify maybe_transform_goal_at_goal_path so that it returns a value that
    can describe the different kinds of error that may be encountered.

    Add a new predicate, maybe_transform_goal_at_goal_path_with_instmap.  Given
    a goal, a goal path and an initial instmap, this predicate recurses down
    the goal structure, following the goal path and maintaining the instmap.
    It then uses a higher order value to transform the goal at its destination
    before reconstructing the goal.  It differs from
    maybe_transform_goal_at_goal_path in that it passes the instmap to its
    higher order argument; the instmap is correct for the state immediately
    before executing the goal in question.

compiler/hlds_pred.m:
    Include the procedure's varset in the information used to construct the
    program representation data that is included in deep profiling builds.

compiler/instmap.m:
    Add a useful function, apply_instmap_delta_sv.  This is the same as
    apply_instmap_delta except that its arguments are in a more convenient
    order for state variable notation.

compiler/stack_layout.m:
    Export compute_var_number_map for the use of implicit_parallelism.m and
    prog_rep.m

compiler/error_util.m:
    Add a new error phase, 'phase_auto_parallelism'.  This is used for warnings
    issued from the automatic parallelisation transformation.

compiler/deep_profiling.m:
    Conform to changes in hlds_pred.m

compiler/mercury_compile_middle_passes.m:
    Conform to changes in implicit_parallelism.m

compiler/type_constraints.m:
    Conform to changes in goal_util.
2010-07-04 10:24:09 +00:00

Threadscope
===========

This file contains information about threadscope profiling for Mercury.

 1. Contact Info
 2. Supported Systems
 3. Threadscope Profiling Tools


Contact Info
------------

    Paul Bone
    pbone@csse.unimelb.edu.au

    Mercury Project
    mercury@csse.unimelb.edu.au
    http://www.mercury.csse.unimelb.edu.au


Supported Systems
-----------------

Threadscope uses the RDTSCP or RDTSC instructions found on some x86 and x86_64
processors to get fast, high precision timing information.  These instructions
read the time stamp counter (TSC), which is incremented on every clock cycle.
Processors must increment the TSC at a constant rate, regardless of their power
state (see the constant_tsc flag in /proc/cpuinfo).
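
The /proc/cpuinfo check can be automated.  The following is a minimal sketch,
not part of Mercury or Threadscope: the function name cpus_missing_flag and the
SAMPLE text are made up for illustration, and the sketch assumes the Linux
layout in which each processor entry has a "processor" line followed by a
"flags" line.

```python
# Hypothetical helper: report which processors lack a given CPU flag.
# Assumes Linux-style /proc/cpuinfo text ("key : value" lines).

def cpus_missing_flag(cpuinfo_text, flag="constant_tsc"):
    """Return the ids of processors whose 'flags' line lacks `flag`."""
    missing = []
    cpu_id = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between processors
        key = key.strip()
        if key == "processor":
            cpu_id = int(value)
        elif key == "flags" and flag not in value.split():
            # Compare whole words so "tsc" does not match "constant_tsc".
            missing.append(cpu_id)
    return missing


# Illustrative input only; real files list many more fields and flags.
SAMPLE = (
    "processor\t: 0\n"
    "flags\t\t: fpu tsc constant_tsc rdtscp\n"
    "\n"
    "processor\t: 1\n"
    "flags\t\t: fpu tsc\n"
)
```

On a Linux system, passing the contents of /proc/cpuinfo to cpus_missing_flag
should give an empty list when every processor reports constant_tsc.  The same
function can be used to look for the rdtscp flag.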

The TSC must also be synchronised between processors in the same system.  It
may be possible to work around this; let me know if you have such a system
(see Contact Info).

AMD processors do not seem to store their clock frequency in their brand ID
string.  On these systems Threadscope profiles are not to scale, since clock
counts cannot be converted into time in nanoseconds.  Instead, the Threadscope
profile counts one nanosecond for each clock tick.
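
The conversion described above can be sketched as follows.  This is an
illustration only, not the code used by the Mercury runtime: the function
names are hypothetical, and it assumes that a brand string carrying a
frequency ends in a form such as "@ 3.00GHz".  When no frequency is found, it
falls back to one nanosecond per tick, as described above.

```python
import re

# Assumed brand string suffix, e.g. "... @ 3.00GHz" or "... @ 800MHz".
_FREQ = re.compile(r"@\s*([0-9]+(?:\.[0-9]+)?)\s*([MG])Hz")


def brand_string_hz(brand):
    """Return the clock frequency in Hz, or None if the string has none."""
    m = _FREQ.search(brand)
    if m is None:
        return None
    scale = 10**9 if m.group(2) == "G" else 10**6
    return round(float(m.group(1)) * scale)


def ticks_to_ns(ticks, hz):
    """Convert a TSC tick count to nanoseconds.

    With an unknown frequency (hz is None), count one nanosecond per tick,
    so the result is no longer to scale.
    """
    if hz is None:
        return ticks
    return ticks * 10**9 // hz
```

For example, with a frequency of 3 GHz, 6,000,000,000 ticks convert to
2,000,000,000 ns (two seconds), whereas with an unknown frequency the same
tick count is reported as 6,000,000,000 ns.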

I have had success with the following processors:

    Intel Core2
    Intel Xeon CPU X5472 (in a dual socket system).

Processors that do not work correctly:

    AMD Athlon 64 X2



Threadscope Profiling Tools
---------------------------

Mercury supports threadscope profiling.  See the profiling section in the user
guide.

The Threadscope profiling tools are written in Haskell and are known to work
with GHC 6.10.  threadscope depends upon the following Haskell libraries:

    array
    binary
    containers
    filepath
    ghc-events
    gtk2hs
    mtl

Many of these will be provided with GHC or packaged for/by your operating
system.

ghc-events is not packaged by most operating systems at this stage.  It can be
retrieved from hackage:

    http://hackage.haskell.org/package/ghc-events

threadscope itself can also be retrieved from hackage:

    http://hackage.haskell.org/package/threadscope

Information about how to install Haskell packages can be found here:

    http://haskell.org/haskellwiki/Cabal/How_to_install_a_Cabal_package
