mirror of
https://github.com/Mercury-Language/mercury.git
synced 2026-05-01 17:24:34 +00:00
Estimated hours taken: 2 Branches: main Move the intermodule analysis framework into the `compiler' directory, in preparation for making it specific to the Mercury compiler, rather than having it generic in case some fictional being might want to use it with a .NET compiler one day. This will make it easier to use and modify. compiler/analysis.file.m: compiler/analysis.m: Copy these files from the `analysis' directory. compiler/top_level.m: Include analysis.m as a new package. compiler/Mercury.options: Add a bug workaround line from analysis/Mercury.options. analysis/Mercury.options: analysis/Mmakefile: analysis/analysis.file.m: analysis/analysis.m: analysis/mer_analysis.m: Replace the contents of these files with comments that the analysis framework is now in the `compiler' directory. We don't actually delete them so their histories remain easily accessible. analysis/README: Mention that the code has been moved. compiler/notes/compiler_design.html: compiler/notes/overall_design.html: Update documentation. Mmake.workspace: Mmakefile: configure.in: compiler/.mgnuc_copts: compiler/COMP_FLAGS.in: compiler/Mmakefile: deep_profiler/.mgnuc_copts: scripts/Mmake.vars.in: scripts/c2init.in: scripts/mercury_config.in: scripts/prepare_tmp_dir_fixed_part.in: tools/binary: tools/binary_step: tools/bootcheck: tools/lmc.in: tools/make_arena: compiler/notes/coding_standards.html: Remove references to the `analysis' directory and `libmer_analysis'.
244 lines
11 KiB
Plaintext
244 lines
11 KiB
Plaintext
-----------------------------------------------------------------------------
|
|
| Copyright (C) 2003, 2006, 2008 The University of Melbourne.
|
|
| This file may only be copied under the terms of the GNU General
|
|
| Public License - see the file COPYING in the Mercury distribution.
|
|
-----------------------------------------------------------------------------
|
|
|
|
This directory formerly contained an implementation of the inter-module
|
|
analysis framework described in
|
|
|
|
Nicholas Nethercote. The Analysis Framework of HAL,
|
|
Chapter 7: Inter-module Analysis, Master's Thesis,
|
|
University of Melbourne, September 2001, revised April 2002.
|
|
<http://www.cl.cam.ac.uk/~njn25/pubs/masters2001.ps.gz>.
|
|
|
|
The code has now been moved into the compiler directory, in the package
|
|
`analysis'. The files that remain (Mmakefile, etc.) are unused.
|
|
|
|
This framework records call and answer patterns for arbitrary analyses,
|
|
and performs dependency analysis to force recompilation where necessary
|
|
when modules change.
|
|
|
|
TODO:
|
|
- dependency tracking and invalidation after source modifications
|
|
(partially done but requires testing with more analyses)
|
|
- garbage collection of unused versions
|
|
- least fixpoint analyses
|
|
|
|
DESIGN:
|
|
|
|
The analysis framework is a library which links into the client
|
|
compiler, allowing the class methods to examine compiler data
|
|
structures. The interface is as compiler-independent as possible,
|
|
so that compilers which can interface with Mercury code via .NET
|
|
can use it.
|
|
|
|
Clients of the library must define an instance of the typeclass
|
|
`analysis__compiler', which describes the analyses the compiler
|
|
wants to perform.
|
|
|
|
Each analysis is described by a call pattern type and an
|
|
answer pattern type. A call pattern describes the information
|
|
known about the argument variables before analysing a call
|
|
(by executing it in the abstract domain used by the analysis).
|
|
An answer pattern describes the information known after analysing
|
|
the call. Call and answer patterns must form a partial order, and
|
|
must be convertible to strings.
|
|
|
|
Analysis database
|
|
=================
|
|
|
|
Before analysing a module, the client should call
|
|
`analysis__lookup_results' and `analysis__lookup_requests' to find the
|
|
call patterns for which the module should be analysed, in addition to
|
|
the "default" call pattern (top). `lookup_results' will give the call
|
|
patterns which we have encountered before whereas `lookup_requests'
|
|
will give the call patterns which have been recently requested by
|
|
other modules.
|
|
|
|
When analysing a module, at each call to an imported function
|
|
the client should call `analysis__lookup_results' or
|
|
`analysis__lookup_best_result' to find the results which
|
|
match the call pattern.
|
|
|
|
If no results exist, the client should call `analysis__record_request',
|
|
to ask that a specialized version be created on the next compilation
|
|
of the client module. The client should also record the suboptimal
|
|
result that was used in place of a more specialized result in the
|
|
imported module's analysis registry.
|
|
|
|
There is currently no way to analyse higher-order or class method
|
|
calls. It might be possible to analyse such calls where the set of
|
|
possibly called predicates is known, but it is better to optimize away
|
|
higher-order or class method calls where possible.
|
|
|
|
The client should call `analysis__record_dependency' for each external
|
|
analysis result that is made use of by the module.
|
|
|
|
When compilation of a module is complete, the client should
|
|
call `analysis__write_analysis_files' to write out all
|
|
information collected during the compilation.
|
|
|
|
Called by analysis passes to record analysis requests and lookup
|
|
answers for imported functions. The status of each answer recorded
|
|
in the database is one of the following:
|
|
|
|
* invalid - the answer was computed using information which has changed,
|
|
and must be recomputed. `invalid' entries may not be used in analysis
|
|
or in generating code.
|
|
|
|
* fixpoint_invalid - the entry is for a least fixpoint analysis, and
|
|
depends on an answer which has changed so that the new answer
|
|
is strictly less precise than the old answer (moving towards to
|
|
correct answer). `fixpoint_invalid' entries may be used when analysing
|
|
a module, but code must not be generated which uses `fixpoint_invalid'
|
|
results (even indirectly). In addition, code must not be generated when
|
|
compiling a module in a strongly connected component of the analysis
|
|
dependency graph which contains `fixpoint_invalid' entries. (Note that
|
|
the method for handling least fixpoint analyses is not described in
|
|
Nicholas Nethercote's thesis).
|
|
This is not yet implemented.
|
|
|
|
* suboptimal - the entry does not depend on any `invalid' or
|
|
`fixpoint_invalid' entries, but may be improved by further
|
|
recompilation. `suboptimal' entries do not need to be recompiled,
|
|
but efficiency may be improved if they are. `suboptimal' annotations
|
|
are only possible for greatest fixpoint analyses (least fixpoint
|
|
analyses start with a "super-optimal" answer and work towards the
|
|
correct answer).
|
|
|
|
* optimal - the entry does not depend on any `invalid', `fixpoint_invalid'
|
|
or `suboptimal' results. Modules containing only `optimal' entries do
|
|
not need recompilation.
|
|
|
|
Analysis dependency checker (NYI)
|
|
=================================
|
|
|
|
Examines the dependencies between analysis results and the state
|
|
of the compilation, then orders recompilations so that there are no
|
|
`invalid' or `fixpoint_invalid' entries (with an option to eliminate
|
|
`suboptimal' entries).
|
|
|
|
Each client compiler should have an option which invokes the analysis
|
|
dependency checker rather than compiling code. This adjusts the status
|
|
of entries in the database, then invokes the compiler's build tools
|
|
(through a typeclass method) to recompile modules in the correct order.
|
|
|
|
If the implementation of a function changes, all of its answers are
|
|
marked as invalid, and the results of the functions it directly uses
|
|
in the SCC of the analysis dependency graph containing it are reset
|
|
to `top' (marked `suboptimal') for greatest fixpoint analyses, or
|
|
`bottom' (marked `fixpoint_invalid') for least fixpoint analyses.
|
|
This ensures that the new result for the function is not computed
|
|
using potentially invalid information.
|
|
|
|
After each compilation, the dependency checker examines the changes
|
|
in the analysis results for each function.
|
|
|
|
For greatest fixpoint analyses, if the new answer is
|
|
- less precise than or incomparable with the old result,
|
|
all users of the call pattern are marked `invalid'.
|
|
- equal to the old result, no entries need to be marked.
|
|
- more precise than the old result, callers are marked
|
|
as `suboptimal'.
|
|
|
|
For least fixpoint analyses, if the new answer is
|
|
- less precise than or incomparable with the old result,
|
|
all users of the call pattern are marked `invalid'.
|
|
- equal to the old result, no entries need to be marked.
|
|
- more precise than the old result, callers are marked
|
|
as `fixpoint_invalid'.
|
|
|
|
The new answer itself will be marked as `optimal'. This isn't
|
|
necessarily correct -- further recompilations may change its status
|
|
to `fixpoint_invalid' or `suboptimal' (or `invalid' if there
|
|
are source code changes).
|
|
|
|
Recompilation must proceed until there are no `invalid' or `fixpoint_invalid'
|
|
entries. Optionally, optimization can proceed until there are no new requests
|
|
or `suboptimal' answers.
|
|
|
|
It the responsibility of the analysis implementor to ensure termination of
|
|
the analysis process by not generating an infinite number of requests.
|
|
|
|
Note: although `mmc' does not have an analysis dependency checker as
|
|
described above, `mmc --make' now understands the intermodule analysis
|
|
framework. Before the pass to compile modules to target code, there
|
|
is an analysis pass. This is similar to the optimisation interface
|
|
pass which generated `.opt' and `.trans_opt' files. `mmc --make' will
|
|
repeat the analysis pass as long as there are invalid analysis
|
|
registries, with the option to repeat the analysis pass as many times
|
|
as desired in order to bring `suboptimal' modules to `optimal'.
|
|
|
|
Granularity of dependencies
|
|
===========================
|
|
|
|
The description in Nicholas Nethercote's thesis uses fine-grained
|
|
dependency tracking, where for each exported answer only the imported
|
|
analysis results used to compute that answer are recorded.
|
|
|
|
For simplicity, the initial Mercury implementation will only record
|
|
dependencies of entire modules on particular analysis results
|
|
(effectively the exported results depend on all imported analysis
|
|
results used in that compilation). This is worthwhile because none of
|
|
the analyses in the Mercury compiler currently record the information
|
|
required for the more precise approach, and I would expect that other
|
|
compilers not designed for inter-module analysis would also not
|
|
record that information.
|
|
|
|
Differences of the implementation from Nick's thesis
|
|
====================================================
|
|
|
|
We do not store "version calling patterns" in the .analysis files
|
|
However, we have .request files which contain calling patterns for
|
|
which we had no answer, and lookup_best_result() predicate
|
|
automatically matches a calling pattern to the best possible result
|
|
available.
|
|
|
|
We do not support "perfect calls" any differently from any other
|
|
calls.
|
|
|
|
Our IMDGs only record entire modules as being dependent on some
|
|
analysis result so if an analysis result changes, we can only
|
|
mark/invalidate an entire module which makes use of the result.
|
|
This means having an analysis status associated with each individial
|
|
analysis result is somewhat pointless at the moment, as individial
|
|
statuses will never be marked/invalidated, and we also can't make
|
|
use of them to reanalyse only the procedures that actually need
|
|
analysing.
|
|
|
|
Still to do
|
|
===========
|
|
|
|
Page numbers are references to Nick's thesis.
|
|
|
|
We do not yet remove analysis results for procedures which no longer
|
|
exist. (p. 107)
|
|
|
|
In the get_ext_answer procedure Nick uses a glb function in addition
|
|
to a is_more_precise(Ans', Ans_glb) call. I don't know if this is
|
|
necessary. (p. 108)
|
|
|
|
Garbage collection of calling patterns that are no longer used in the
|
|
program. Our IMDGs do not record exactly which calling patterns are
|
|
in use so I'm not sure how this would be done.
|
|
|
|
Removal of IMDG arcs for modules that used to import a module, but no
|
|
longer do so. This would not be hard if we remember which
|
|
modules were imported by M previously in M.imdg. (p. 110)
|
|
|
|
There is no special handling yet of procedure level cycles.
|
|
We cannot follow the approach of p. 116 as we have not enough
|
|
information in our IMDGs.
|
|
|
|
Our treatment of library modules is fairly dumb. We just assume that
|
|
library modules have read-only analysis files, and that we cannot make
|
|
requests to or reanalysis library modules. (p. 117)
|
|
|
|
Modules are currently always analysed from the bottom of the module
|
|
dependency graph upwards. When some analyses start making meaningful
|
|
requests (unlike now where all call patterns are of the type
|
|
`any_call'), the order that modules are analysed in should be changed
|
|
to reduce unnecessary reanalyses.
|
|
|