mirror of
https://github.com/Mercury-Language/mercury.git
synced 2025-12-16 06:14:59 +00:00
Estimated hours taken: 30 Branches: main Add a post-processing pass directly after mode checking that tries to transform procedures to avoid intermediate partially instantiated data structures. The Erlang backend in particular cannot handle partially instantiated data structures. compiler/delay_partial_inst.m: New module. compiler/check_hlds.m: Import delay_partial_inst.m compiler/modes.m: Call the delay partial instantiations pass after mode checking succeeds if it is enabled. compiler/options.m: Add a new internal option `--delay-partial-instantiations', disabled by default. compiler/handle_options.m: Make Erlang target imply --delay-partial-instantiations. compiler/notes/compiler_design.html: Mention delay_partial_inst.m tests/hard_coded/Mercury.options: tests/hard_coded/Mmakefile: tests/hard_coded/delay_partial_test.exp: tests/hard_coded/delay_partial_test.m: tests/hard_coded/delay_partial_test2.exp: tests/hard_coded/delay_partial_test2.m: Add test cases for --delay-partial-instantiations. compiler/goal_util.m: Fix a comment.
1786 lines
59 KiB
HTML
<html>
|
|
<head>
|
|
<title>
|
|
Notes On The Design Of The Mercury Compiler
|
|
</title>
|
|
</head>
|
|
|
|
<body bgcolor="#ffffff" text="#000000">
|
|
|
|
<hr>
|
|
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
|
|
|
|
<p>
|
|
This file contains an overview of the design of the compiler.
|
|
|
|
<p>
|
|
See also <a href="overall_design.html">overall_design.html</a>
|
|
for an overview of how the different sub-systems (compiler,
|
|
library, runtime, etc.) fit together.
|
|
|
|
<hr>
|
|
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
|
|
|
|
<h2> OUTLINE </h2>
|
|
|
|
<p>
|
|
|
|
The main job of the compiler is to translate Mercury into C, although it
|
|
can also translate (subsets of) Mercury to some other languages:
|
|
Mercury bytecode (for a planned bytecode interpreter) and MSIL (for the
|
|
Microsoft .NET platform).
|
|
|
|
<p>
|
|
|
|
The top-level of the compiler is in the file mercury_compile.m,
|
|
which is a sub-module of the top_level.m package.
|
|
The basic design is that compilation is broken into the following
|
|
stages:
|
|
|
|
<ul>
|
|
<li> 1. parsing (source files -> HLDS)
|
|
<li> 2. semantic analysis and error checking (HLDS -> annotated HLDS)
|
|
<li> 3. high-level transformations (annotated HLDS -> annotated HLDS)
|
|
<li> 4. code generation (annotated HLDS -> target representation)
|
|
<li> 5. low-level optimizations
|
|
(target representation -> target representation)
|
|
<li> 6. output code (target representation -> target code)
|
|
</ul>
|
|
|
|
|
|
<p>
|
|
Note that in reality the separation is not quite as simple as that.
|
|
Although parsing is listed as step 1 and semantic analysis is listed
|
|
as step 2, the last stage of parsing actually includes some semantic checks.
|
|
And although optimization is listed as steps 3 and 5, it also occurs in
|
|
steps 2, 4, and 6. For example, elimination of assignments to dead
|
|
variables is done in mode analysis; middle-recursion optimization and
|
|
the use of static constants for ground terms is done in code
|
|
generation; and a few low-level optimizations are done in llds_out.m
|
|
as we are spitting out the C code.
|
|
|
|
<p>
|
|
|
|
In addition, the compiler is actually a multi-targeted compiler
|
|
with several different back-ends.
|
|
|
|
<p>
|
|
|
|
The modules in the compiler are structured by being grouped into
|
|
"packages". A "package" is just a meta-module,
|
|
i.e. a module that contains other modules as sub-modules.
|
|
(The sub-modules are almost always stored in separate files,
|
|
which are named only for their final module name.)
|
|
We have a package for the top-level, a package for each main pass, and
|
|
finally there are also some packages for library modules that are used
|
|
by more than one pass.
|
|
<p>
|
|
|
|
Taking all this into account, the structure looks like this:
|
|
|
|
<ul type=disc>
|
|
<li> At the top of the dependency graph is the top_level.m package,
|
|
which currently contains only the one module mercury_compile.m
|
|
which invokes all the different passes in the compiler.
|
|
<li> The next level down is all of the different passes of the compiler.
|
|
In general, we try to stick by the principle that later passes can
|
|
depend on data structures defined in earlier passes, but not vice
|
|
versa.
|
|
<ul type=disc>
|
|
<li> front-end
|
|
<ul type=disc>
|
|
<li> 1. parsing (source files -> HLDS)
|
|
<br> Packages: parse_tree.m and hlds.m
|
|
<li> 2. semantic analysis and error checking
|
|
(HLDS -> annotated HLDS)
|
|
<br> Package: check_hlds.m
|
|
<li> 3. high-level transformations
|
|
(annotated HLDS -> annotated HLDS)
|
|
<br> Package: transform_hlds.m
|
|
</ul>
|
|
<li> back-ends
|
|
<ul type=disc>
|
|
<li> a. LLDS back-end
|
|
<br> Package: ll_backend.m
|
|
<ul type=disc>
|
|
<li> 3a. LLDS-back-end-specific HLDS->HLDS transformations
|
|
<li> 4a. code generation (annotated HLDS -> LLDS)
|
|
<li> 5a. low-level optimizations (LLDS -> LLDS)
|
|
<li> 6a. output code (LLDS -> C)
|
|
</ul>
|
|
<li> b. MLDS back-end
|
|
<br> Package: ml_backend.m
|
|
<ul type=disc>
|
|
<li> 4b. code generation (annotated HLDS -> MLDS)
|
|
<li> 5b. MLDS transformations (MLDS -> MLDS)
|
|
<li> 6b. output code
|
|
(MLDS -> C or MLDS -> MSIL or MLDS -> Java, etc.)
|
|
</ul>
|
|
<li> c. bytecode back-end
|
|
<br> Package: bytecode_backend.m
|
|
<ul type=disc>
|
|
<li> 4c. code generation (annotated HLDS -> bytecode)
|
|
</ul>
|
|
<li> d. Erlang back-end
|
|
<br> Package: erl_backend.m
|
|
<ul type=disc>
|
|
<li> 4d. code generation (annotated HLDS -> ELDS)
|
|
<li> 6d. output code
|
|
(ELDS -> Erlang)
|
|
</ul>
|
|
<li> There's also a package backend_libs.m which contains
|
|
modules which are shared between several different back-ends.
|
|
</ul>
|
|
</ul>
|
|
<li> Finally, at the bottom of the dependency graph there is the package
|
|
libs.m. libs.m contains the option handling code, and also library
|
|
modules which are not sufficiently general or sufficiently useful to
|
|
go in the Mercury standard library.
|
|
</ul>
|
|
|
|
<p>
|
|
|
|
In addition to the packages mentioned above, there are also packages
|
|
for the build system: make.m contains the support for the `--make' option,
|
|
and recompilation.m contains the support for the `--smart-recompilation'
|
|
option.
|
|
|
|
<hr>
|
|
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
|
|
|
|
<h2> DETAILED DESIGN </h2>
|
|
|
|
<p>
|
|
This section describes the role of each module in the compiler.
|
|
For more information about the design of a particular module,
|
|
see the documentation at the start of that module's source code.
|
|
|
|
<hr>
|
|
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
|
|
<p>
|
|
|
|
The action is co-ordinated from mercury_compile.m or make.m (if `--make'
|
|
was specified on the command line).
|
|
|
|
|
|
<h3> Option handling </h3>
|
|
|
|
<p>
|
|
|
|
Option handling is part of the libs.m package.
|
|
|
|
<p>
|
|
|
|
The command-line options are defined in the module options.m.
|
|
mercury_compile.m calls library/getopt.m, passing the predicates
|
|
defined in options.m as arguments, to parse them. It then invokes
|
|
handle_options.m to postprocess the option set. The results are
|
|
stored in the io.state, using the type globals defined in globals.m.
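<p>
The two-step flow described above (parse the command line, then postprocess
the option set) can be sketched roughly as follows. This is only an
illustration in Python, not the compiler's own code: the function names are
hypothetical, argparse stands in for library/getopt.m, and the single
implication shown (an Erlang target turning on
`--delay-partial-instantiations') is the one mentioned in the log message
for handle_options.m.

```python
import argparse


def parse_options(argv):
    """Step 1: parse the raw command line (the role played by getopt)."""
    parser = argparse.ArgumentParser(prog="mmc-sketch")
    # Both option definitions are illustrative stand-ins for options.m.
    parser.add_argument("--target", default="c")
    parser.add_argument("--delay-partial-instantiations",
                        action="store_true", default=False)
    return vars(parser.parse_args(argv))


def postprocess_options(options):
    """Step 2: apply implications between options (the role of handle_options)."""
    result = dict(options)
    if result["target"] == "erlang":
        # Assumption taken from the log message: the Erlang target
        # implies --delay-partial-instantiations.
        result["delay_partial_instantiations"] = True
    return result


def handle_command_line(argv):
    """Parse, then postprocess; the result plays the role of the globals."""
    return postprocess_options(parse_options(argv))
```

Keeping the implications in a separate step means that later passes only
ever look at the postprocessed option set.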
|
|
|
|
|
|
<h3> Build system </h3>
|
|
|
|
<p>
|
|
|
|
Support for `--make' is in the make.m package,
|
|
which contains the following modules:
|
|
|
|
<dl>
|
|
|
|
<dt> make.m
|
|
<dd>
|
|
Categorizes targets passed on the command line and passes
|
|
them to the appropriate module to be built.
|
|
|
|
<dt> make.program_target.m
|
|
<dd>
|
|
Handles whole program `mmc --make' targets, including
|
|
executables, libraries and cleanup.
|
|
|
|
<dt> make.module_target.m
|
|
<dd>
|
|
Handles targets built by a compilation action associated
|
|
with a single module, for example making interface files.
|
|
|
|
<dt> make.dependencies.m
|
|
<dd>
|
|
Compute dependencies between targets and between modules.
|
|
|
|
<dt> make.module_dep_file.m
|
|
<dd>
|
|
Record the dependency information for each module between
|
|
compilations.
|
|
|
|
<dt> make.util.m
|
|
<dd>
|
|
Utility predicates.
|
|
|
|
<dt> options_file.m
|
|
<dd>
|
|
Read the options files specified by the `--options-file'
|
|
option. Also used by mercury_compile.m to collect the value
|
|
of DEFAULT_MCFLAGS, which contains the auto-configured flags
|
|
passed to the compiler.
|
|
|
|
</dl>
|
|
|
|
The build process also invokes routines in compile_target_code.m,
|
|
which is part of the backend_libs.m package (see below).
|
|
|
|
<p>
|
|
<hr>
|
|
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
|
|
|
|
<h3> FRONT END </h3>
|
|
<h4> 1. Parsing </h4>
|
|
<h5> The parse_tree.m package </h5>
|
|
|
|
<p>
|
|
The first part of parsing is in the parse_tree.m package,
|
|
which contains the modules listed below
|
|
(except for the library/*.m modules,
|
|
which are in the standard library).
|
|
This part produces the parse_tree.m data structure,
|
|
which is intended to match up as closely as possible
|
|
with the source code, so that it is suitable for tasks
|
|
such as pretty-printing.
|
|
|
|
<p>
|
|
|
|
<ul>
|
|
|
|
<li> <p> lexical analysis (library/lexer.m)
|
|
|
|
<li> <p> stage 1 parsing - convert strings to terms. <p>
|
|
|
|
library/parser.m contains the code to do this, while
|
|
library/term.m and library/varset.m contain the term and varset
|
|
data structures that result, and predicates for manipulating them.
|
|
|
|
<li> <p> stage 2 parsing - convert terms to `items'
|
|
(declarations, clauses, etc.)
|
|
|
|
<p>
|
|
The result of this stage is a parse tree that has a one-to-one
|
|
correspondence with the source code. The parse tree data structure
|
|
definition is in prog_data.m and prog_item.m, while the code to create
|
|
it is in prog_io.m and its submodules prog_io_dcg.m (which handles
|
|
clauses using Definite Clause Grammar notation), prog_io_goal.m (which
|
|
handles goals), prog_io_pragma.m (which handles pragma declarations),
|
|
prog_io_typeclass.m (which handles typeclass and instance
|
|
declarations) and prog_io_util.m (which defines predicates and types
|
|
needed by the other prog_io*.m modules).
|
|
|
|
<p>
|
|
|
|
The modules prog_out.m and mercury_to_mercury.m contain predicates
|
|
for printing the parse tree.
|
|
prog_util.m contains some utility predicates
|
|
for manipulating the parse tree,
|
|
prog_mode.m contains utility predicates
for manipulating insts and modes,
prog_type.m contains utility predicates
for manipulating types,
prog_type_subst.m contains predicates
for performing substitutions on types,
prog_foreign.m contains utility predicates
for manipulating foreign code,
prog_mutable.m contains utility predicates
for manipulating mutable variables,
prog_event.m contains utility predicates for working with events,
|
|
while error_util.m contains predicates
|
|
for printing nicely formatted error messages.
|
|
|
|
<li><p> imports and exports are handled at this point (modules.m)
|
|
|
|
<p>
|
|
modules.m has the code to write out `.int', `.int2', `.int3',
|
|
`.d' and `.dep' files.
|
|
|
|
<p>
|
|
source_file_map.m contains code to read, write and search
|
|
the mapping between module names and file names.
|
|
|
|
<li><p> module qualification of types, insts and modes
|
|
|
|
<p>
|
|
module_qual.m - <br>
|
|
Adds module qualifiers to all types, insts and modes,
checking that a given type, inst or mode exists and that
there is only one possible match. This is done here because
|
|
it must be done before the `.int' and `.int2' interface files
|
|
are written. This also checks whether imports are really needed
|
|
in the interface.
|
|
|
|
<p>
|
|
Notes on module qualification:
|
|
<ul>
|
|
<li> all types, typeclasses, insts and modes occurring in pred, func,
|
|
type, typeclass and mode declarations are module qualified by
|
|
module_qual.m.
|
|
<li> all types, insts and modes occurring in lambda expressions,
|
|
explicit type qualifications, and clause mode annotations
|
|
are module qualified in make_hlds.m.
|
|
<li> constructors occurring in predicate and function mode declarations
|
|
are module qualified during type checking.
|
|
<li> predicate and function calls and constructors within goals
|
|
are module qualified during mode analysis.
|
|
</ul>
|
|
|
|
|
|
<li><p> reading and writing of optimization interfaces
|
|
(intermod.m and trans_opt.m -- these are part of the
|
|
hlds.m package, not the parse_tree.m package).
|
|
|
|
<p>
|
|
&lt;module&gt;.opt contains clauses for exported preds suitable for
|
|
inlining or higher-order specialization. The `.opt' file for the
|
|
current module is written after type-checking. `.opt' files
|
|
for imported modules are read here.
|
|
&lt;module&gt;.trans_opt contains termination analysis information
|
|
for exported preds (eventually it ought to contain other
|
|
"transitive" information too, e.g. for optimization, but
|
|
currently it is only used for termination analysis).
|
|
`.trans_opt' files for imported modules are read here.
|
|
The `.trans_opt' file for the current module is written
|
|
after the end of semantic analysis.
|
|
|
|
<li><p> expansion of equivalence types (equiv_type.m)
|
|
|
|
<p>
|
|
`with_type` and `with_inst` annotations on predicate
|
|
and function type and mode declarations are also expanded.
|
|
|
|
<p>
|
|
Expansion of equivalence types is really part of type-checking,
|
|
but is done on the item_list rather than on the HLDS because it
|
|
turned out to be much easier to implement that way.
|
|
</ul>
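<p>
As an illustration of the last item above, expansion of equivalence types
amounts to replacing each equivalence type name by its definition wherever
it occurs inside a type. The sketch below is a simplification written in
Python rather than taken from equiv_type.m: it assumes types are either
plain names or tuples of a constructor followed by its arguments, it
ignores type parameters on the equivalence types themselves, and the
function name is hypothetical.

```python
def expand_equivalences(type_term, equivalences, in_progress=frozenset()):
    """Replace every equivalence type name in type_term by its definition.

    A type is either a string (a type name) or a tuple of the form
    (constructor, arg1, arg2, ...).  `equivalences` maps the name of a
    parameterless equivalence type to the type it stands for.
    """
    if isinstance(type_term, tuple):
        ctor, *args = type_term
        return (ctor, *(expand_equivalences(arg, equivalences, in_progress)
                        for arg in args))
    if type_term in equivalences:
        if type_term in in_progress:
            # Illustrative error handling for circular definitions.
            raise ValueError(f"circular equivalence type: {type_term}")
        return expand_equivalences(equivalences[type_term], equivalences,
                                   in_progress | {type_term})
    return type_term
```

For example, with `money` defined as `int` and `ledger` defined as
`list(money)`, the type `map(string, ledger)` expands to
`map(string, list(int))`.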
|
|
|
|
<p>
|
|
That's all the modules in the parse_tree.m package.
|
|
|
|
<h5> The hlds.m package </h5>
|
|
<p>
|
|
Once the stages listed above are complete, we then convert from the parse_tree
|
|
data structure to a simplified data structure, which no longer attempts
|
|
to maintain a one-to-one correspondence with the source code.
|
|
This simplified data structure is called the High Level Data Structure (HLDS),
|
|
which is defined in the hlds.m package.
|
|
|
|
<p>
|
|
The last stage of parsing is this conversion to HLDS,
|
|
which is done mostly by the following submodules
|
|
of the make_hlds module in the hlds package.
|
|
<dl>
|
|
|
|
<dt>
|
|
make_hlds_passes.m
|
|
<dd>
|
|
This submodule calls the others to perform the conversion, in several passes.
|
|
(We cannot do everything in one pass;
|
|
for example, we need to have seen a predicate's declaration
|
|
before we can process its clauses.)
|
|
|
|
<dt>
|
|
superhomogeneous.m
|
|
<dd>
|
|
Performs the conversion of unifications into superhomogeneous form.
|
|
|
|
<dt>
|
|
state_var.m
|
|
<dd>
|
|
Expands away state variable syntax.
|
|
|
|
<dt>
|
|
field_access.m
|
|
<dd>
|
|
Expands away field access syntax.
|
|
|
|
<dt>
|
|
add_clause.m
|
|
<dd>
|
|
Converts clauses from parse_tree format to hlds format.
|
|
Handles their addition to procedures,
|
|
which is nontrivial in the presence of mode-specific clauses.
|
|
Eliminates universal quantification
|
|
(using `all [Vs] G' ===> `not (some [Vs] (not G))')
|
|
and implication (using `A => B' ===> `not(A, not B)').
|
|
|
|
<dt>
|
|
add_pred.m
|
|
<dd>
|
|
Handles type and mode declarations for predicates.
|
|
|
|
<dt>
|
|
add_type.m
|
|
<dd>
|
|
Handles the declarations of types.
|
|
|
|
<dt>
|
|
add_mode.m
|
|
<dd>
|
|
Handles the declarations of insts and modes,
|
|
including checking for circular insts and modes.
|
|
|
|
<dt>
|
|
add_special_pred.m
|
|
<dd>
|
|
Adds unify, compare, and (if needed) index and init predicates
|
|
to the HLDS as necessary.
|
|
|
|
<dt>
|
|
add_solver.m
|
|
<dd>
|
|
Adds the casting predicates needed by solver types to the HLDS as necessary.
|
|
|
|
<dt>
|
|
add_class.m
|
|
<dd>
|
|
Handles typeclass and instance declarations.
|
|
|
|
<dt>
|
|
qual_info.m
|
|
<dd>
|
|
Handles the abstract data types used for module qualification.
|
|
|
|
<dt>
|
|
make_hlds_warn.m
|
|
<dd>
|
|
Looks for constructs that merit warnings,
|
|
such as singleton variables and variables with overlapping scopes.
|
|
|
|
<dt>
|
|
make_hlds_error.m
|
|
<dd>
|
|
Error messages used by more than one submodule of make_hlds.m.
|
|
|
|
<dt>
|
|
add_pragma.m
|
|
<dd>
|
|
Adds most kinds of pragmas to the HLDS,
|
|
including import/export pragmas, tabling pragmas and foreign code.
|
|
|
|
</dl>
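<p>
The two rewrites that add_clause.m applies, `all [Vs] G' ===>
`not (some [Vs] (not G))' and `A => B' ===> `not(A, not B)', can be
illustrated with a small recursive function. This is a sketch in Python
over a made-up goal representation (nested tuples tagged "all", "some",
"not", "implies", "and" or "call"); neither the representation nor the
function name comes from the compiler.

```python
def eliminate(goal):
    """Rewrite away universal quantification and implication.

    Goals are tuples: ("all", vars, G), ("some", vars, G), ("not", G),
    ("implies", A, B), ("and", A, B); anything else is treated as atomic.
    """
    kind = goal[0]
    if kind == "all":
        _, variables, body = goal
        # all [Vs] G  ===>  not (some [Vs] (not G))
        return ("not", ("some", variables, ("not", eliminate(body))))
    if kind == "implies":
        _, lhs, rhs = goal
        # A => B  ===>  not (A, not B)
        return ("not", ("and", eliminate(lhs), ("not", eliminate(rhs))))
    if kind == "some":
        _, variables, body = goal
        return ("some", variables, eliminate(body))
    if kind == "not":
        return ("not", eliminate(goal[1]))
    if kind == "and":
        return ("and", eliminate(goal[1]), eliminate(goal[2]))
    return goal  # atomic goal, e.g. ("call", "p")
```

Note that rewriting `all [X] (p => q)' this way leaves a double negation
behind, which the sketch does not remove.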
|
|
|
|
Fact table pragmas are handled by fact_table.m
|
|
(which is part of the ll_backend.m package).
|
|
That module also reads the facts from the declared file
|
|
and compiles them into a separate C file
|
|
used by the foreign_proc body of the relevant predicate.
|
|
|
|
The conversion of the item list to HLDS also involves make_tags.m,
|
|
which chooses the data representation for each discriminated union type
|
|
by assigning tags to each functor.
|
|
|
|
<p>
|
|
The HLDS data structure itself is spread over the following modules:
|
|
|
|
<ol>
|
|
<li>
|
|
hlds_args.m defines the parts of the HLDS concerned with predicate
|
|
and function argument lists.
|
|
<li>
|
|
hlds_data.m defines the parts of the HLDS concerned with
|
|
function symbols, types, insts, modes and determinisms.
|
|
<li>
|
|
hlds_goal.m defines the part of the HLDS concerned with the
|
|
structure of goals, including the annotations on goals.
|
|
<li>
|
|
hlds_clauses.m defines the part of the HLDS concerning clauses.
|
|
<li>
|
|
hlds_rtti.m defines the part of the HLDS concerning RTTI.
|
|
<li>
|
|
hlds_pred.m defines the part of the HLDS concerning predicates and procedures.
|
|
<li>
|
|
pred_table.m defines the tables that index predicates and functions
|
|
on various combinations of (qualified and unqualified) names and arity.
|
|
<li>
|
|
hlds_module.m defines the top-level parts of the HLDS,
|
|
including the type module_info.
|
|
</ol>
|
|
|
|
<p>
|
|
The module hlds_out.m contains predicates to dump the HLDS to a file.
|
|
|
|
<p>
|
|
The hlds.m package also contains some utility modules that contain
|
|
various library routines which are used by other modules that manipulate
|
|
the HLDS:
|
|
|
|
<dl>
|
|
<dt> hlds_code_util.m
|
|
<dd> Utility routines for use during HLDS generation.
|
|
|
|
<dt> goal_form.m
|
|
<dd> Contains predicates for determining whether
|
|
HLDS goals match various criteria.
|
|
|
|
<dt> goal_util.m
|
|
<dd> Contains various miscellaneous utility predicates for manipulating
|
|
HLDS goals, e.g. for renaming variables.
|
|
|
|
<dt> passes_aux.m
|
|
<dd> Contains code to write progress messages, and higher-order code
|
|
to traverse all the predicates defined in the current module
|
|
and do something with each one.
|
|
|
|
<dt> hlds_error_util.m
|
|
<dd> Utility routines for printing nicely formatted error messages
|
|
for symptoms involving HLDS data structures.
|
|
For symptoms involving only structures defined in prog_data,
|
|
use parse_tree.error_util.
|
|
|
|
<dt> code_model.m
|
|
<dd> Defines a type for classifying determinisms
|
|
in ways useful to the various backends,
|
|
and utility predicates on that type.
|
|
|
|
<dt> arg_info.m
|
|
<dd> Utility routines that the various backends use
|
|
to analyze procedures' argument lists
|
|
and decide on parameter passing conventions.
|
|
|
|
<dt> hhf.m
|
|
<dd> Facilities for translating the bodies of predicates
|
|
to hyperhomogeneous form, for constraint based mode analysis.
|
|
|
|
<dt> inst_graph.m
|
|
<dd> Defines the inst_graph data type,
|
|
which describes the structures of insts for constraint based mode analysis,
|
|
as well as predicates operating on that type.
|
|
</dl>
|
|
|
|
<h4> 2. Semantic analysis and error checking </h4>
|
|
|
|
<p>
|
|
This is the check_hlds.m package,
|
|
with support from the mode_robdd.m package for constraint based mode analysis.
|
|
|
|
<p>
|
|
|
|
Any pass which can report errors or warnings must be part of this stage,
|
|
so that the compiler does the right thing for options such as
|
|
`--halt-at-warn' (which turns warnings into errors) and
|
|
`--error-check-only' (which makes the compiler only compile up to this stage).
|
|
|
|
<dl>
|
|
|
|
<dt> implicit quantification
|
|
|
|
<dd>
|
|
quantification.m (XXX which for some reason is part of the hlds.m
|
|
package rather than the check_hlds.m package)
|
|
handles implicit quantification and computes
|
|
the set of non-local variables for each sub-goal.
|
|
It also expands away bi-implication (unlike the expansion
|
|
of implication and universal quantification, this expansion
|
|
cannot be done until after quantification).
|
|
This pass is called from the `transform' predicate in make_hlds.m.
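<p>
To make "the set of non-local variables for each sub-goal" concrete: a
variable is non-local to a goal if it occurs both inside that goal and
somewhere outside it. The following Python sketch computes this for the
conjuncts of a single flat conjunction. It is a simplified illustration,
not the algorithm in quantification.m; the function name and the
representation of goals as sets of variable names are assumptions.

```python
def nonlocals_of_conjuncts(head_vars, conjuncts):
    """Return the non-local variables of each conjunct.

    `head_vars` is the set of variables in the clause head; `conjuncts`
    is a list of sets, each holding the variables one conjunct mentions.
    A variable is non-local to a conjunct if it also occurs in the head
    or in some other conjunct.
    """
    result = []
    for index, goal_vars in enumerate(conjuncts):
        outside = set(head_vars)
        for other_index, other_vars in enumerate(conjuncts):
            if other_index != index:
                outside |= other_vars
        result.append(goal_vars & outside)
    return result
```

A variable that occurs in only one conjunct and not in the head is local
to that conjunct, which corresponds to it being implicitly existentially
quantified there.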
|
|
<p>
|
|
|
|
<dt> checking typeclass instances (check_typeclass.m)
|
|
<dd>
|
|
check_typeclass.m both checks that instance declarations satisfy all
|
|
the appropriate superclass constraints
|
|
(including functional dependencies)
|
|
and performs a source-to-source transformation on the
|
|
methods from the instance declarations.
|
|
The transformed code is checked for type, mode, uniqueness, purity
|
|
and determinism correctness by the later passes, which has the effect
|
|
of checking the correctness of the instance methods themselves
|
|
(i.e. that the instance methods match those expected by the typeclass
|
|
declaration).
|
|
During the transformation,
|
|
pred_ids and proc_ids are assigned to the methods for each instance.
|
|
|
|
<p>
|
|
While checking that the superclasses of a class are satisfied
|
|
by the instance declaration, a set of constraint_proofs is built up
|
|
for the superclass constraints. These are used by polymorphism.m when
|
|
generating the base_typeclass_info for the instance.
|
|
|
|
<p>
|
|
This module also checks that there are no ambiguous pred/func
|
|
declarations (that is, it checks that all type variables in constraints
|
|
are determined by type variables in arguments),
|
|
checks that there are no cycles in the typeclass hierarchy,
|
|
and checks that each abstract instance has a corresponding
|
|
typeclass instance.
|
|
<p>
|
|
|
|
<dt> check user defined insts for consistency with types
|
|
<dd>
|
|
inst_check.m checks that all user defined bound insts are consistent
|
|
with at least one type in scope
|
|
(i.e. that the set of function symbols
|
|
in the bound list for the inst are a subset of the allowed function
|
|
symbols for at least one type in scope).
|
|
|
|
<p>
|
|
A warning is issued if it finds any user defined bound insts not
|
|
consistent with any types in scope.
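<p>
The consistency test described here is essentially a subset check, which
can be sketched as follows. The Python below is an illustration only:
the function names, the warning text, and the representation of function
symbols as (name, arity) pairs are assumptions, not details taken from
inst_check.m.

```python
def types_matching_inst(bound_functors, types_in_scope):
    """Return the names of the types whose function symbols include
    every function symbol in the inst's bound list."""
    bound = set(bound_functors)
    return [name for name, functors in types_in_scope.items()
            if bound <= set(functors)]


def check_inst(inst_name, bound_functors, types_in_scope):
    """Return a warning string if no type in scope is consistent with
    the inst, and None otherwise."""
    if types_matching_inst(bound_functors, types_in_scope):
        return None
    return f"warning: inst '{inst_name}' matches no type in scope"
```

An inst listing only the list cons functor would be accepted when the
list type is in scope, while an inst that mentions a function symbol
belonging to no visible type would draw the warning.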
|
|
<p>
|
|
|
|
<dt> improving the names of head variables
|
|
<dd>
|
|
headvar_names.m tries to replace names of the form HeadVar__n
|
|
with actual names given by the programmer.
|
|
<p>
|
|
For efficiency, this phase is not a standalone pass,
|
|
but is instead invoked by the typechecker.
|
|
|
|
<dt> type checking
|
|
|
|
<dd>
|
|
<ul>
|
|
<li> typecheck.m handles type checking, overloading resolution and
|
|
module name resolution, and almost fully qualifies all predicate
|
|
and functor names. It sets the map(var, type) field in the
|
|
pred_info. However, typecheck.m doesn't figure out the pred_id
|
|
for function calls or calls to overloaded predicates; that can't
|
|
be done in a single pass of typechecking, and so it is done
|
|
later on (in post_typecheck.m, for both preds and function calls).
|
|
<li> typecheck_info.m defines the main data structures used by
|
|
typechecking.
|
|
<li> typecheck_errors.m handles outputting of type errors.
|
|
<li> typeclasses.m checks typeclass constraints, and
|
|
any redundant constraints that are eliminated are recorded (as
|
|
constraint_proofs) in the pred_info for future reference.
|
|
<li> type_util.m contains utility predicates dealing with types
|
|
that are used in a variety of different places within the compiler.
|
|
<li> post_typecheck.m may also be considered to logically be a part
|
|
of typechecking, but it is actually called from purity
|
|
analysis (see below). It contains the stuff related to
|
|
type checking that can't be done in the main type checking pass.
|
|
It also removes assertions from further processing.
|
|
post_typecheck.m reports errors for unbound type and inst variables,
|
|
for unsatisfied type class constraints and for indistinguishable
|
|
predicate or function modes.
|
|
</ul>
|
|
<p>
|
|
|
|
<dt> assertions
|
|
|
|
<dd>
|
|
assertion.m (XXX in the hlds.m package)
|
|
is the abstract interface to the assertion table.
|
|
Currently all the compiler does is type check the assertions and
|
|
record for each predicate that is used in an assertion, which
|
|
assertion it is used in. The set up of the assertion table occurs
|
|
in post_typecheck.finish_assertion.
|
|
<p>
|
|
|
|
<dt> purity analysis
|
|
|
|
<dd>
|
|
purity.m is responsible for purity checking, as well as
|
|
defining the <CODE>purity</CODE> type and a few public
|
|
operations on it. It also calls post_typecheck.m to
|
|
complete the handling of predicate
|
|
overloading for cases which typecheck.m is unable to handle,
|
|
and to check for unbound type variables.
|
|
Elimination of double negation is also done here; that needs to
|
|
be done after quantification analysis and before mode analysis.
|
|
Calls to `private_builtin.unsafe_type_cast/2' are converted
|
|
into `generic_call(unsafe_cast, ...)' goals here.
|
|
<p>
|
|
|
|
<dt> polymorphism transformation
|
|
|
|
<dd>
|
|
polymorphism.m handles introduction of type_info arguments for
|
|
polymorphic predicates and introduction of typeclass_info arguments
|
|
for typeclass-constrained predicates.
|
|
This phase needs to come before mode analysis so that mode analysis
|
|
can properly reorder code involving existential types.
|
|
(It also needs to come before simplification so that simplify.m's
|
|
optimization of goals with no output variables doesn't do the
|
|
wrong thing for goals whose only output is the type_info for
|
|
an existentially quantified type parameter.)
|
|
<p>
|
|
This phase also
|
|
converts higher-order predicate terms into lambda expressions,
|
|
and copies the clauses to the proc_infos in preparation for
|
|
mode analysis.
|
|
<p>
|
|
The polymorphism.m module also exports some utility routines that
|
|
are used by other modules. These include some routines for generating
|
|
code to create type_infos, which are used by simplify.m and magic.m
|
|
when those modules introduce new calls to polymorphic procedures.
|
|
<p>
|
|
When it has finished, polymorphism.m calls clause_to_proc.m to
|
|
make duplicate copies of the clauses for each different mode of
|
|
a predicate; all later stages work on procedures, not predicates.
|
|
<p>
|
|
|
|
<dt> mode analysis
|
|
|
|
<dd>
|
|
<ul>
|
|
<li> modes.m is the main mode analysis module.
|
|
It checks that the code is mode-correct, reordering it
|
|
if necessary, and annotates each goal with a delta-instmap
|
|
that specifies the changes in instantiatedness of each
|
|
variable over that goal.
|
|
<li> modecheck_unify.m is the sub-module which analyses
|
|
unification goals.
|
|
It also module qualifies data constructors.
|
|
<li> modecheck_call.m is the sub-module which analyses calls.
|
|
|
|
<p>
|
|
|
|
The following sub-modules are used:
|
|
<dl>
|
|
<dt> mode_info.m
|
|
<dd>
|
|
(the main data structure for mode analysis)
|
|
<dt> delay_info.m
|
|
<dd>
|
|
(a sub-component of the mode_info data
|
|
structure used for storing the information
|
|
for scheduling: which goals are currently
|
|
delayed, what variables they are delayed on, etc.)
|
|
<dt> instmap.m (XXX in the hlds.m package)
|
|
<dd>
|
|
Defines the instmap and instmap_delta ADTs
|
|
which store information on what instantiations
|
|
a set of variables may be bound to.
|
|
<dt> inst_match.m
|
|
<dd>
|
|
This contains the code for examining insts and
|
|
checking whether they match.
|
|
<dt> inst_util.m
|
|
<dd>
|
|
This contains the code for creating new insts from
|
|
old ones: unifying them, merging them and so on.
|
|
<dt> mode_errors.m
|
|
<dd>
|
|
This module contains all the code to
|
|
print error messages for mode errors.
|
|
</dl>
|
|
<li> mode_util.m contains miscellaneous useful predicates dealing
|
|
with modes (many of these are used by lots of later stages
|
|
of the compiler)
|
|
<li> mode_debug.m contains utility code for tracing the actions
|
|
of the mode checker.
|
|
<li> delay_partial_inst.m adds a post-processing pass on mode-correct
|
|
procedures to avoid creating intermediate, partially instantiated
|
|
data structures.
|
|
</ul>
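<p>
The reordering that mode analysis performs, with goals being delayed until
the variables they need are bound, can be pictured with the following
Python sketch. It is a deliberately small model rather than the algorithm
in modes.m and delay_info.m: each goal is reduced to a name, a list of
input variables that must already be bound, and a list of output variables
it binds, and all names are hypothetical.

```python
def schedule_conjunction(initially_bound, goals):
    """Order the goals of a conjunction so that each goal runs only
    after all of its inputs are bound.

    `goals` is a list of (name, inputs, outputs) triples.  Goals whose
    inputs are not yet bound are delayed, and delayed goals are woken
    as soon as an earlier goal has bound what they were waiting for.
    """
    bound = set(initially_bound)
    ordered, delayed = [], []

    def try_schedule(goal):
        name, inputs, outputs = goal
        if set(inputs) <= bound:
            ordered.append(name)
            bound.update(outputs)
            return True
        return False

    def wake_delayed():
        woke = True
        while woke:
            woke = False
            for goal in list(delayed):
                if try_schedule(goal):
                    delayed.remove(goal)
                    woke = True

    for goal in goals:
        if try_schedule(goal):
            wake_delayed()
        else:
            delayed.append(goal)
    if delayed:
        # In this sketch, goals that can never run count as a mode error.
        names = ", ".join(goal[0] for goal in delayed)
        raise ValueError(f"mode error: cannot schedule {names}")
    return ordered
```

With `X` bound on entry, a conjunction written as "use Y, compute Y from
X, print X" is reordered so that the goal computing `Y` runs first and the
delayed goal using `Y` is woken immediately afterwards.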
|
|
<p>
|
|
|
|
<dt> constraint based mode analysis
|
|
|
|
<dd> This is an experimental alternative
|
|
to the usual mode analysis algorithm.
|
|
It works by building a system of boolean constraints
|
|
about where (parts of) variables can be bound,
|
|
and then solving those constraints.
|
|
|
|
<ul>
|
|
<li> mode_constraints.m is the module that finds the constraints
|
|
and adds them to the constraint store.
|
|
<li> mode_ordering.m is the module that uses solutions of the
|
|
constraint system to find an ordering for the goals in conjunctions.
|
|
<li> mode_constraint_robdd.m is the interface to the modules
|
|
that perform constraint solving using reduced ordered binary decision
|
|
diagrams (robdds).
|
|
<li> We have several implementations of solvers using robdds.
|
|
Each solver is in a module named mode_robdd.X.m, and they all belong
|
|
to the top-level mode_robdd.m package.
|
|
</ul>
|
|
<p>
|
|
|
|
<dt> constraint based mode analysis propagation solver
|
|
|
|
<dd> This is a new alternative
|
|
for the constraint based mode analysis algorithm.
|
|
It will perform conjunct reordering for Mercury
|
|
programs of a limited syntax (it calls error if
|
|
it encounters higher order code or a parallel
|
|
conjunction, or is asked to infer modes).
|
|
|
|
|
|
<ul>
|
|
<li> prop_mode_constraints.m is the interface to the old
|
|
mode_constraints.m. It builds constraints for an SCC.
|
|
<li> build_mode_constraints.m is the module that traverses a predicate
|
|
to build constraints for it.
|
|
<li> abstract_mode_constraints.m describes data structures for the
|
|
constraints themselves.
|
|
<li> ordering_mode_constraints.m solves constraints to determine
|
|
the producing and consuming goals for program variables, and
|
|
performs conjunct reordering based on the result.
|
|
<li> mcsolver.m contains the constraint solver used by
|
|
ordering_mode_constraints.m.
|
|
</ul>
|
|
<p>
|
|
|
|
<dt> indexing and determinism analysis
|
|
|
|
<dd>
|
|
<ul>
|
|
<li> switch_detection.m transforms into switches those disjunctions
|
|
in which several disjuncts test the same variable against different
|
|
function symbols.
|
|
<li> cse_detection.m looks for disjunctions in which each disjunct tests
|
|
the same variable against the same function symbols, and hoists any
|
|
such unifications out of the disjunction.
|
|
If cse_detection.m modifies the code,
|
|
it will re-run mode analysis and switch detection.
|
|
<li> det_analysis.m annotates each goal with its determinism;
|
|
it inserts cuts in the form of "some" goals wherever the determinisms
|
|
and delta instantiations of the goals involved make it necessary.
|
|
Any errors found during determinism analysis are reported by
|
|
det_report.m.
|
|
det_util.m contains utility predicates used in several modules.
|
|
</ul>
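<p>
The core idea behind switch detection can be sketched as a search for a
variable that every disjunct tests against a different function symbol.
The Python below is a simplified illustration, not the algorithm in
switch_detection.m: each disjunct is reduced to a mapping from variable
names to the function symbol it unifies that variable with, and the
sketch insists that all the function symbols differ, which is stricter
than the description above requires.

```python
def detect_switch(disjuncts):
    """Find a variable on which the disjunction is a switch.

    `disjuncts` is a list of dicts, each mapping a variable name to the
    function symbol that disjunct tests it against.  Returns a pair
    (variable, {function symbol: disjunct index}), or None if no
    variable is tested by every disjunct against distinct symbols.
    """
    if not disjuncts:
        return None
    candidates = set(disjuncts[0])
    for tests in disjuncts[1:]:
        candidates &= set(tests)
    for var in sorted(candidates):
        functors = [tests[var] for tests in disjuncts]
        if len(set(functors)) == len(functors):
            return var, {functor: index
                         for index, functor in enumerate(functors)}
    return None
```

Once such a variable is found, each disjunct becomes one arm of the
switch, selected by the function symbol the variable is bound to.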
|
|
<p>
|
|
|
|
<dt> checking of unique modes (unique_modes.m)
|
|
|
|
<dd>
|
|
unique_modes.m checks that non-backtrackable unique modes were
|
|
not used in a context which might require backtracking.
|
|
Note that what unique_modes.m does is quite similar to
|
|
what modes.m does, and unique_modes calls lots of predicates
|
|
defined in modes.m to do it.
|
|
<p>
|
|
|
|
<dt> stratification checking
|
|
|
|
<dd>
|
|
The module stratify.m implements the `--warn-non-stratification'
|
|
warning, which is an optional warning that checks for loops
|
|
through negation.
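<p>
A loop through negation can be pictured as a cycle in the call graph that
contains at least one negated call. The Python sketch below is only an
illustration of that check, not the algorithm used by stratify.m; the triple
representation of calls and the function names are assumptions.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from start (including start)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


def loops_through_negation(calls):
    """calls is a list of (caller, callee, negated) triples.

    Return the negated calls that lie on a cycle of the call graph,
    i.e. those where the callee can call back to the caller.
    """
    graph = {}
    for caller, callee, _ in calls:
        graph.setdefault(caller, set()).add(callee)
    return [
        (caller, callee)
        for caller, callee, negated in calls
        if negated and caller in reachable(graph, callee)
    ]
```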
|
|
<p>
|
|
|
|
<dt> simplification (simplify.m)
|
|
|
|
<dd>
|
|
simplify.m finds and exploits opportunities for simplifying the
|
|
internal form of the program, both to optimize the code and to
|
|
massage the code into a form the code generator will accept.
|
|
It also warns the programmer about any constructs that are so simple
|
|
that they should not have been included in the program in the first
|
|
place. (That's why this pass needs to be part of semantic analysis:
|
|
because it can report warnings.)
|
|
simplify.m converts complicated unifications into procedure calls.
|
|
simplify.m calls common.m which looks for (a) construction unifications
|
|
that construct a term that is the same as one that already exists,
|
|
or (b) repeated calls to a predicate with the same inputs, and replaces
|
|
them with assignment unifications.
|
|
simplify.m also attempts to partially evaluate calls to builtin
|
|
procedures if the inputs are all constants (this is const_prop.m
|
|
in the transform_hlds.m package).
|
|
simplify.m also calls format_call.m to look for
|
|
(possibly) incorrect uses of string.format and io.format.
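<p>
The common structure elimination described above for common.m can be
sketched in Python as follows. The tuple encoding of goals and the name
eliminate_common_structs are hypothetical, and the sketch ignores the mode
and uniqueness conditions that the real pass has to respect.

```python
def eliminate_common_structs(goals):
    """goals is a list of ("construct", var, functor, args) tuples.

    A construction whose functor and arguments match an earlier one is
    replaced by ("assign", var, earlier_var).
    """
    seen = {}   # (functor, args) -> variable already holding that term
    result = []
    for kind, var, functor, args in goals:
        key = (functor, tuple(args))
        if key in seen:
            result.append(("assign", var, seen[key]))
        else:
            seen[key] = var
            result.append((kind, var, functor, tuple(args)))
    return result
```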
|
|
<p>
|
|
|
|
<dt> unused imports (unused_imports.m)
|
|
|
|
<dd>
|
|
unused_imports.m determines which imports of the module
|
|
are not required for the module to compile. It also identifies
|
|
which imports of a module can be moved from the interface to the
|
|
implementation.
|
|
<p>
|
|
|
|
<dt> xml documentation (xml_documentation.m)
|
|
|
|
<dd>
|
|
xml_documentation.m outputs an XML representation of all the
|
|
declarations in the module. This XML representation is designed
|
|
to be transformed via XSL into more human readable documentation.
|
|
<p>
|
|
|
|
</dl>
|
|
|
|
<h4> 3. High-level transformations </h4>
|
|
|
|
<p>
|
|
This is the transform_hlds.m package.
|
|
|
|
<p>
|
|
|
|
The first pass of this stage does tabling transformations (table_gen.m).
|
|
This involves the insertion of several calls to tabling predicates
|
|
defined in mercury_builtin.m and the addition of some scaffolding structure.
|
|
Note that this pass can change the evaluation methods of some procedures to
|
|
eval_table_io, so it should come before any passes that require definitive
|
|
evaluation methods (e.g. inlining).
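<p>
The effect of the tabling transformation on a single procedure can be
pictured with the Python sketch below: a table lookup is placed before the
original body and a table update after it. This models only simple
memoization; the names add_tabling and slow_square are invented for the
example and do not correspond to the predicates that table_gen.m inserts.

```python
def add_tabling(proc):
    """Return a version of proc that consults a call table first."""
    table = {}  # maps argument tuples to previously computed answers

    def tabled(*args):
        if args in table:          # lookup inserted before the body
            return table[args]
        answer = proc(*args)       # original procedure body
        table[args] = answer       # save inserted after the body
        return answer

    tabled.table = table           # exposed only for inspection
    return tabled


calls = []  # records how often the original body actually runs


def slow_square(n):
    calls.append(n)
    return n * n


fast_square = add_tabling(slow_square)
```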
|
|
|
|
<p>
|
|
|
|
The next pass of this stage is a code simplification, namely
|
|
removal of lambda expressions (lambda.m):
|
|
|
|
<ul>
|
|
<li>
|
|
lambda.m converts lambda expressions into higher-order predicate
|
|
terms referring to freshly introduced separate predicates.
|
|
This pass needs to come after unique_modes.m to ensure that
|
|
the modes we give to the introduced predicates are correct.
|
|
It also needs to come after polymorphism.m since polymorphism.m
|
|
doesn't handle higher-order predicate constants.
|
|
</ul>
|
|
|
|
(Is there any good reason why lambda.m comes after table_gen.m?)
|
|
|
|
|
|
<p>
|
|
|
|
Expansion of equivalence types (equiv_type_hlds.m)
|
|
|
|
<ul>
|
|
<li>
|
|
This pass expands equivalences which are not meant to
|
|
be visible to the user of imported modules. This
|
|
is necessary for the IL back-end and in some cases
|
|
for `:- pragma export' involving foreign types on
|
|
the C back-end.
|
|
|
|
<p>
|
|
|
|
It's also needed by the MLDS->C back-end, for
|
|
--high-level-data, and for cases involving abstract
|
|
equivalence types which are defined as "float".
|
|
</ul>
|
|
|
|
<p>
|
|
|
|
Exception analysis. (exception_analysis.m)
|
|
|
|
<ul>
|
|
<li>
|
|
This pass annotates each module with information about whether
|
|
the procedures in the module may throw an exception or not.
|
|
</ul>
|
|
|
|
<p>
|
|
|
|
The next pass is termination analysis. The various modules involved are:
|
|
|
|
<ul>
|
|
<li>
|
|
termination.m is the control module. It sets the argument size and
|
|
termination properties of builtin and compiler generated procedures,
|
|
invokes term_pass1.m and term_pass2.m
|
|
and writes .trans_opt files and error messages as appropriate.
|
|
<li>
|
|
term_pass1.m analyzes the argument size properties of user-defined procedures.
|
|
<li>
|
|
term_pass2.m analyzes the termination properties of user-defined procedures.
|
|
<li>
|
|
term_traversal.m contains code common to the two passes.
|
|
<li>
|
|
term_errors.m defines the various kinds of termination errors
|
|
and prints the messages appropriate for each.
|
|
<li>
|
|
term_util.m defines the main types used in termination analysis
|
|
and contains utility predicates.
|
|
<li>
|
|
post_term_analysis.m contains error checking routines and optimizations
|
|
that depend upon the information obtained by termination analysis.
|
|
</ul>
|
|
|
|
<p>
|
|
|
|
Trail usage analysis. (trailing_analysis.m)
|
|
|
|
<ul>
|
|
<li>
|
|
This pass annotates each module with information about whether
|
|
the procedures in the module modify the trail or not. This
|
|
information can be used to avoid redundant trailing operations.
|
|
</ul>
|
|
|
|
<p>
|
|
|
|
Minimal model tabling analysis. (tabling_analysis.m)
|
|
|
|
<ul>
|
|
<li>
|
|
This pass annotates each goal in a module with information about
|
|
whether the goal calls procedures that are evaluated using
|
|
minimal model tabling. This information can be used to reduce
|
|
the overhead of minimal model tabling.
|
|
|
|
</ul>
|
|
|
|
<p>
|
|
|
|
Most of the remaining HLDS-to-HLDS transformations are optimizations:
|
|
|
|
<ul>
|
|
<li> specialization of higher-order and polymorphic predicates where the
|
|
values of the higher-order/type_info/typeclass_info arguments are known
|
|
(higher_order.m)
|
|
|
|
<li> attempt to introduce accumulators (accumulator.m). This optimizes
|
|
procedures whose tail consists of independent associative computations
|
|
or independent chains of commutative computations into a tail
|
|
recursive form by the introduction of accumulators. If lco is turned
|
|
on it can also transform some procedures so that only construction
|
|
unifications are after the recursive call. This pass must come before
|
|
lco, unused_args (eliminating arguments makes it hard to relate the
|
|
code back to the assertion) and inlining (can make the associative
|
|
call disappear).
|
|
<p>
|
|
This pass makes use of the goal_store.m module, which is a dictionary-like
|
|
data structure for storing HLDS goals.
|
|
|
|
<li> inlining (i.e. unfolding) of simple procedures (inlining.m)
|
|
|
|
<li> loop_inv.m: loop invariant hoisting. This transformation moves
|
|
computations within loops that are the same on every iteration to the outside
|
|
of the loop so that the invariant computations are only computed once. The
|
|
transformation turns a single looping predicate containing invariant
|
|
computations into two: one that computes the invariants on the first
|
|
iteration and then loops by calling the second predicate with extra arguments
|
|
for the invariant values. This pass should come after inlining, since
|
|
inlining can expose important opportunities for loop invariant hoisting.
|
|
Such opportunities might not be visible before inlining because only
|
|
*part* of the body of a called procedure is loop-invariant.
|
|
|
|
<li> deforestation and partial evaluation (deforest.m). This optimizes
|
|
multiple traversals of data structures within a conjunction, and
|
|
avoids creating intermediate data structures. It also performs
|
|
loop unrolling where the clause used is known at compile time.
|
|
deforest.m makes use of the following sub-modules (`pd_' stands for
|
|
"partial deduction"):
|
|
<ul>
|
|
<li> constraint.m transforms goals so that goals which can fail are
|
|
executed earlier.
|
|
<li> pd_cost.m contains some predicates to estimate the improvement
|
|
caused by deforest.m.
|
|
<li> pd_debug.m produces debugging output.
|
|
<li> pd_info.m contains a state type for deforestation.
|
|
<li> pd_term.m contains predicates to check that the deforestation
|
|
algorithm terminates.
|
|
<li> pd_util.m contains various utility predicates.
|
|
</ul>
|
|
|
|
<li> issue warnings about unused arguments from predicates, and create
|
|
specialized versions without them (unused_args.m); type_infos are often unused.
|
|
|
|
<li> delay_construct.m pushes construction unifications to the right in
|
|
semidet conjunctions, in an effort to reduce the probability that they will
|
|
need to be executed.
|
|
|
|
<li> unneeded_code.m looks for goals whose results are either not needed
|
|
at all, or needed in some branches of computation but not others. Provided
|
|
that the goal in question satisfies some requirements (e.g. it is pure,
|
|
it cannot fail etc), it either deletes the goal or moves it to the
|
|
computation branches where its output is needed.
|
|
|
|
<li> lco.m finds predicates whose implementations would benefit
|
|
from last call optimization modulo constructor application.
|
|
|
|
<li> elimination of dead procedures (dead_proc_elim.m). Inlining, higher-order
|
|
specialization and the elimination of unused args can make procedures dead
|
|
even if the user doesn't, and automatically constructed unification and
|
|
comparison predicates are often dead as well.
|
|
|
|
|
|
<li> tupling.m looks for predicates that pass around several arguments,
|
|
and modifies the code to pass around a single tuple of these arguments
|
|
instead if this looks likely to reduce the cost of parameter passing.
|
|
|
|
<li> untupling.m does the opposite of tupling.m: it replaces tuple arguments
|
|
with their components. This can be useful both for finding out how much
|
|
tupling has already been done manually in the source code, and to break up
|
|
manual tupling in favor of possibly more profitable automatic tupling.
|
|
|
|
<li> dep_par_conj.m transforms parallel conjunctions to add the wait and signal
|
|
operations required by dependent AND parallelism. To maximize the amount of
|
|
parallelism available, it tries to push the signals as early as possible
|
|
in producers and the waits as late as possible in the consumers, creating
|
|
specialized versions of predicates as needed.
|
|
|
|
<li> granularity.m tries to ensure that programs do not generate too much
|
|
parallelism. Its goal is to minimize parallelism's overhead while still
|
|
gaining all the parallelism the machine can actually exploit.
|
|
|
|
</ul>
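<p>
To give a feel for what accumulator introduction (accumulator.m, in the list
above) achieves, here is a hand-written before-and-after pair in Python. It
is a sketch of the shape of the transformation only: the function names are
made up, and Python itself does not perform tail call optimization.

```python
def sum_list(xs):
    """Original form: the addition happens after the recursive call."""
    if not xs:
        return 0
    return xs[0] + sum_list(xs[1:])


def sum_list_acc(xs, acc=0):
    """Accumulator form: the recursive call is the last thing done.

    This is only equivalent because + is associative, which is the
    kind of property the transformation depends on.
    """
    if not xs:
        return acc
    return sum_list_acc(xs[1:], acc + xs[0])
```

<p>
The first version computes x0 + (x1 + (x2 + 0)) while the second computes
((0 + x0) + x1) + x2, so the two agree only for operations where that
regrouping is valid.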
|
|
|
|
<p>
|
|
|
|
The module transform.m contains stuff that is supposed to be useful
|
|
for high-level optimizations (but which is not yet used).
|
|
|
|
<p>
|
|
|
|
The last three HLDS-to-HLDS transformations implement
|
|
term size profiling (size_prof.m and complexity.m) and
|
|
deep profiling (deep_profiling.m, in the ll_backend.m package).
|
|
Both passes insert into procedure bodies, among other things,
|
|
calls to procedures (some of which are impure)
|
|
that record profiling information.
|
|
|
|
<hr>
|
|
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
|
|
|
|
<h3> a. LLDS BACK-END </h3>
|
|
|
|
<p>
|
|
This is the ll_backend.m package.
|
|
|
|
<h4> 3a. LLDS-specific HLDS -> HLDS transformations </h4>
|
|
|
|
Before LLDS code generation, there are a few more passes which
|
|
annotate the HLDS with information used for LLDS code generation,
|
|
or perform LLDS-specific transformations on the HLDS:
|
|
|
|
<dl>
|
|
<dt> reducing the number of variables that have to be
|
|
saved across procedure calls (saved_vars.m)
|
|
<dd>
|
|
We do this by putting the code that generates
|
|
the value of a variable just before the use of
|
|
that variable, duplicating the variable and the
|
|
code that produces it if necessary, provided
|
|
the cost of doing so is smaller than the cost
|
|
of saving and restoring the variable would be.
|
|
|
|
<dt> transforming procedure definitions to reduce the number
|
|
of variables that need their own stack slots
|
|
(stack_opt.m)
|
|
<dd>
|
|
The main algorithm in stack_opt.m figures out when
|
|
variable A can be reached from a cell pointed to by
|
|
variable B, so that storing variable B on the stack
|
|
obviates the need to store variable A on the stack
|
|
as well.
|
|
This algorithm relies on an implementation of
|
|
the maximal matching algorithm in matching.m.
|
|
<dt> migration of builtins following branched structures
|
|
(follow_code.m)
|
|
<dd>
|
|
This transformation improves the results of follow_vars.m
|
|
(see below)
|
|
<dt> simplification again (simplify.m, in the check_hlds.m
|
|
package)
|
|
<dd>
|
|
We run this pass a second time in case the intervening
|
|
transformations have created new opportunities for
|
|
simplification. It needs to be run immediately
|
|
before code generation, because it enforces some
|
|
invariants that the LLDS code generator relies on.
|
|
<dt> annotation of goals with liveness information (liveness.m)
|
|
<dd>
|
|
This records the birth and death of each variable
|
|
in the HLDS goal_info.
|
|
<dt> allocation of stack slots
|
|
<dd>
|
|
This is done by stack_alloc.m, with the assistance of
|
|
the following modules:
|
|
|
|
<ul>
|
|
<li> live_vars.m works out which variables need
|
|
to be saved on the stack when.
|
|
|
|
<li> graph_colour.m (in the libs.m package)
|
|
contains the algorithm that
|
|
stack_alloc.m calls to convert sets of variables
|
|
that must be saved on the stack at the same time
|
|
to an assignment of a stack slot to each such variable.
|
|
</ul>
|
|
<dt> allocating the follow vars (follow_vars.m)
|
|
<dd>
|
|
Traverses backwards over the HLDS, annotating some
|
|
goals with information about what locations variables
|
|
will be needed in next. This allows us to generate
|
|
more efficient code by putting variables in the right
|
|
spot directly. This module is not called from
|
|
mercury_compile.m; it is called from store_alloc.m.
|
|
<dt> allocating the store map (store_alloc.m)
|
|
<dd>
|
|
Annotates each branched goal with variable location
|
|
information so that we can generate correct code
|
|
by putting variables in the same spot at the end
|
|
of each branch.
|
|
<dt> computing goal paths (goal_path.m
|
|
in the check_hlds.m package)
|
|
<dd>
|
|
The goal path of a goal defines its position in
|
|
the procedure body. This transformation attaches
|
|
its goal path to every goal, for use by the debugger.
|
|
</dl>
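<p>
The stack slot allocation step described above turns sets of variables that
must be on the stack at the same time into a slot number per variable. The
Python sketch below shows one simple way to do this with greedy graph
colouring; it is an assumption-laden illustration (the function name and the
greedy strategy are ours) and not the algorithm in graph_colour.m.

```python
def allocate_slots(live_sets):
    """live_sets: iterable of sets of variables live at the same time.

    Returns {variable: slot_number} such that two variables appearing
    in the same set never share a slot. Greedy, so not always minimal.
    """
    # Build the interference graph.
    conflicts = {}
    for live in live_sets:
        for var in live:
            conflicts.setdefault(var, set()).update(live - {var})

    slots = {}
    for var in sorted(conflicts):
        # Slots already given to variables that conflict with this one.
        taken = {slots[other] for other in conflicts[var] if other in slots}
        slot = 1
        while slot in taken:
            slot += 1
        slots[var] = slot
    return slots
```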
|
|
|
|
<h4> 4a. Code generation. </h4>
|
|
<dl>
|
|
<dt> code generation
|
|
|
|
<dd>
|
|
Code generation converts HLDS into LLDS.
|
|
For the LLDS back-end, this is also the point at which we
|
|
insert code to handle debugging and trailing, and to do
|
|
heap reclamation on failure.
|
|
The top level code generation module is proc_gen.m,
|
|
which looks after the generation of code for procedures
|
|
(including prologues and epilogues).
|
|
The predicate for generating code for arbitrary goals is in code_gen.m,
|
|
but that module handles only sequential conjunctions; it calls
|
|
other modules to handle other kinds of goals:
|
|
|
|
<ul>
|
|
<li> ite_gen.m (if-then-elses)
|
|
<li> call_gen.m (predicate calls and also calls to
|
|
out-of-line unification procedures)
|
|
<li> disj_gen.m (disjunctions)
|
|
<li> par_conj_gen.m (parallel conjunctions)
|
|
<li> unify_gen.m (unifications)
|
|
<li> switch_gen.m (switches), which has sub-modules
|
|
<ul>
|
|
<li> dense_switch.m
|
|
<li> lookup_switch.m
|
|
<li> string_switch.m
|
|
<li> tag_switch.m
|
|
<li> switch_util.m -- this is in the backend_libs.m
|
|
package, since it is also used by the MLDS back-end
|
|
</ul>
|
|
<li> commit_gen.m (commits)
|
|
<li> pragma_c_gen.m (embedded C code)
|
|
</ul>
|
|
|
|
<p>
|
|
|
|
The code generator also calls middle_rec.m to do middle recursion
|
|
optimization, which is implemented during code generation.
|
|
|
|
<p>
|
|
|
|
The code generation modules make use of
|
|
<dl>
|
|
<dt> code_info.m
|
|
<dd>
|
|
The main data structure for the code generator.
|
|
<dt> var_locn.m
|
|
<dd>
|
|
This defines the var_locn type, which is a
|
|
sub-component of the code_info data structure;
|
|
it keeps track of the values and locations of variables.
|
|
It implements eager code generation.
|
|
<dt> exprn_aux.m
|
|
<dd>
|
|
Various utility predicates.
|
|
<dt> code_util.m
|
|
<dd>
|
|
Some miscellaneous preds used for code generation.
|
|
<dt> lookup_util.m
|
|
<dd>
|
|
Some miscellaneous preds used for lookup switch
|
|
(and lookup disjunction) generation.
|
|
<dt> continuation_info.m
|
|
<dd>
|
|
For accurate garbage collection, collects
|
|
information about each live value after calls,
|
|
and saves information about procedures.
|
|
<dt> trace_gen.m
|
|
<dd>
|
|
Inserts calls to the runtime debugger.
|
|
<dt> trace_params.m (in the libs.m package, since it
|
|
is considered part of option handling)
|
|
<dd>
|
|
Holds the parameter settings controlling
|
|
the handling of execution tracing.
|
|
</dl>
|
|
|
|
<dt> code generation for `pragma export' declarations (export.m)
|
|
<dd> This is handled separately from the other parts of code generation.
|
|
mercury_compile.m calls the procedures `export.produce_header_file'
|
|
and `export.get_pragma_exported_procs' to produce C code fragments
|
|
which declare/define the C functions which are the interface stubs
|
|
for procedures exported to C.
|
|
|
|
<dt> generation of constants for RTTI data structures
|
|
<dd> This could also be considered a part of code generation,
|
|
but for the LLDS back-end this is currently done as part
|
|
of the output phase (see below).
|
|
|
|
</dl>
|
|
|
|
<p>
|
|
|
|
The result of code generation is the Low Level Data Structure (llds.m),
|
|
which may also contain some data structures whose types are defined in rtti.m.
|
|
The code for each procedure is generated as a tree of code fragments
|
|
which is then flattened (tree.m).
|
|
|
|
|
|
<h4> 5a. Low-level optimization (LLDS). </h4>
|
|
|
|
<p>
|
|
|
|
Most of the various LLDS-to-LLDS optimizations are invoked from optimize.m.
|
|
They are:
|
|
|
|
<ul>
|
|
<li> optimization of jumps to jumps (jumpopt.m)
|
|
|
|
<li> elimination of duplicate code sequences within procedures (dupelim.m)
|
|
|
|
<li> elimination of duplicate procedure bodies (dupproc.m,
|
|
invoked directly from mercury_compile.m)
|
|
|
|
<li> optimization of stack frame allocation/deallocation (frameopt.m)
|
|
|
|
<li> filling branch delay slots (delay_slot.m)
|
|
|
|
<li> dead code and dead label removal (labelopt.m)
|
|
|
|
<li> peephole optimization (peephole.m)
|
|
|
|
<li> introduction of local C variables (use_local_vars.m)
|
|
|
|
<li> removal of redundant assignments, i.e. assignments that assign a value
|
|
that the target location already holds (reassign.m)
|
|
|
|
</ul>
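<p>
As an example of the flavour of these passes, the optimization of jumps to
jumps can be sketched in Python as below. The input format (a map from each
label whose code is just an unconditional jump to that jump's target) and the
name resolve_jump_chains are assumptions for illustration; jumpopt.m operates
on LLDS instructions and does considerably more.

```python
def resolve_jump_chains(jump_targets):
    """jump_targets maps a label to the label it immediately jumps to,
    for every label whose code is just an unconditional jump.

    Returns a map from each such label to the final destination, so a
    jump to the label can be redirected there directly.
    """
    final = {}
    for label in jump_targets:
        seen = {label}
        dest = jump_targets[label]
        # Follow the chain, stopping on a cycle of empty jumps.
        while dest in jump_targets and dest not in seen:
            seen.add(dest)
            dest = jump_targets[dest]
        final[label] = dest
    return final
```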
|
|
|
|
In addition, stdlabel.m performs standardization of labels.
|
|
This is not an optimization itself,
|
|
but it allows other optimizations to be evaluated more easily.
|
|
|
|
<p>
|
|
|
|
The module opt_debug.m contains utility routines used for debugging
|
|
these LLDS-to-LLDS optimizations.
|
|
|
|
<p>
|
|
|
|
Several of these optimizations (frameopt and use_local_vars) also
|
|
use livemap.m, a module that finds the set of locations live at each label.
|
|
|
|
<p>
|
|
|
|
Use_local_vars also introduces
|
|
references to temporary variables in extended basic blocks
|
|
in the LLDS representation of the C code.
|
|
The transformation to insert the block scopes
|
|
and declare the temporary variables is performed by wrap_blocks.m.
|
|
|
|
<p>
|
|
|
|
Depending on which optimization flags are enabled,
|
|
optimize.m may invoke many of these passes multiple times.
|
|
|
|
<p>
|
|
|
|
Some of the low-level optimization passes use basic_block.m,
|
|
which defines predicates for converting sequences of instructions to
|
|
basic block format and back, as well as opt_util.m, which contains
|
|
miscellaneous predicates for LLDS-to-LLDS optimization.
|
|
|
|
|
|
<h4> 6a. Output C code </h4>
|
|
|
|
<ul>
|
|
<li> type_ctor_info.m
|
|
(in the backend_libs.m package, since it is shared with the MLDS back-end)
|
|
generates the type_ctor_gen_info structures that list
|
|
items of information (including unification, index and compare predicates)
|
|
associated with each declared type constructor that go into the static
|
|
type_ctor_info data structure. If the type_ctor_gen_info structure is not
|
|
eliminated as inaccessible, this module adds the corresponding type_ctor_info
|
|
structure to the RTTI data structures defined in rtti.m,
|
|
which are part of the LLDS.
|
|
|
|
<li> base_typeclass_info.m
|
|
(in the backend_libs.m package, since it is shared with the MLDS back-end)
|
|
generates the base_typeclass_info structures that
|
|
list the methods of a class for each instance declaration. These are added to
|
|
the RTTI data structures, which are part of the LLDS.
|
|
|
|
<li> stack_layout.m generates the stack_layout structures for
|
|
accurate garbage collection. Tables are created from the data
|
|
collected in continuation_info.m.
|
|
|
|
Stack_layout.m uses prog_rep.m to generate bytecode representations
|
|
of procedure bodies for use by the declarative debugger.
|
|
|
|
<li> Type_ctor_info structures and stack_layout structures both contain
|
|
pseudo_type_infos, which are type_infos with holes for type variables;
|
|
these are generated by pseudo_type_info.m
|
|
(in the backend_libs.m package, since it is shared with the MLDS back-end).
|
|
|
|
<li> llds_common.m extracts static terms from the main body of the LLDS, and
|
|
puts them at the front. If a static term originally appeared several times,
|
|
it will now appear as a single static term with multiple references to it.
|
|
[XXX FIXME this module has now been replaced by global_data.m]
|
|
|
|
<li> transform_llds.m is responsible for doing any source to source
|
|
transformations on the llds which are required to make the C output
|
|
acceptable to various C compilers. Currently computed gotos can have
|
|
their maximum size limited to avoid a fixed limit in lcc.
|
|
|
|
<li> Final generation of C code is done in llds_out.m, which subcontracts the
|
|
output of RTTI structures to rtti_out.m and of other static
|
|
compiler-generated data structures (such as those used by the debugger,
|
|
the deep profiler, and in the future by the garbage collector)
|
|
to layout_out.m.
|
|
</ul>
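<p>
The sharing of static terms described above for llds_common.m amounts to
keeping one copy of each distinct term and replacing every occurrence with a
reference to it. A minimal Python sketch of that idea, with an invented
function name and with static terms modelled as hashable tuples, might look
like this:

```python
def share_static_terms(terms):
    """terms: list of hashable static terms, possibly with repeats.

    Returns (table, refs): table holds each distinct term once, in
    first-occurrence order, and refs[i] is the index in table of the
    term that originally appeared at position i.
    """
    index_of = {}
    table = []
    refs = []
    for term in terms:
        if term not in index_of:
            index_of[term] = len(table)
            table.append(term)
        refs.append(index_of[term])
    return table, refs
```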
|
|
|
|
|
|
<hr>
|
|
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
|
|
|
|
<h3> b. MLDS BACK-END </h3>
|
|
|
|
<p>
|
|
|
|
This is the ml_backend.m package.
|
|
|
|
<p>
|
|
|
|
The original LLDS code generator generates very low-level code,
|
|
since the LLDS was designed to map easily to RISC architectures.
|
|
We have developed a new back-end that generates much higher-level
|
|
code, suitable for generating Java, high-level C, etc.
|
|
This back-end uses the Medium Level Data Structure (mlds.m) as its
|
|
intermediate representation.
|
|
|
|
<h4> 3b. pre-passes to annotate/transform the HLDS </h4>
|
|
|
|
<p>
|
|
Before code generation there is a pass which annotates the HLDS with
|
|
information used for code generation:
|
|
|
|
<ul>
|
|
<li> mark_static_terms.m marks construction unifications
|
|
which can be implemented using static constants rather
|
|
than heap allocation.
|
|
</ul>
|
|
|
|
<p>
|
|
For the MLDS back-end, we've tried to keep the code generator simple.
|
|
So we prefer to do things as HLDS to HLDS transformations where possible,
|
|
rather than complicating the HLDS to MLDS code generator.
|
|
Thus we have a pass which transforms the HLDS to handle trailing:
|
|
|
|
<ul>
|
|
<li> add_trail_ops.m inserts code to manipulate the trail,
|
|
in particular ensuring that we apply the appropriate
|
|
trail operations before each choice point, when execution
|
|
resumes after backtracking, and whenever we do a commit.
|
|
The trail operations are represented as (and implemented as)
|
|
calls to impure procedures defined in library/private_builtin.m.
|
|
<li> add_heap_ops.m is very similar to add_trail_ops.m;
|
|
it inserts code to do heap reclamation on backtracking.
|
|
</ul>
|
|
|
|
<h4> 4b. MLDS code generation </h4>
|
|
<ul>
|
|
<li> ml_code_gen.m converts HLDS code to MLDS.
|
|
The following sub-modules are used to handle different constructs:
|
|
<ul>
|
|
<li> ml_unify_gen.m
|
|
<li> ml_closure_gen.m
|
|
<li> ml_call_gen.m
|
|
<li> ml_switch_gen.m, which in turn has sub-modules
|
|
<ul>
|
|
<li> ml_dense_switch.m
|
|
<li> ml_string_switch.m
|
|
<li> ml_tag_switch.m
|
|
<li> switch_util.m (in the backend_libs.m package,
|
|
since it is also used by the LLDS back-end)
|
|
</ul>
|
|
</ul>
|
|
The module ml_code_util.m provides utility routines for
|
|
MLDS code generation. The module ml_util.m provides some
|
|
general utility routines for the MLDS.
|
|
<li> ml_type_gen.m converts HLDS types to MLDS.
|
|
<li> type_ctor_info.m and base_typeclass_info.m generate
|
|
the RTTI data structures defined in rtti.m and pseudo_type_info.m
|
|
(those four modules are in the backend_libs.m package, since they
|
|
are shared with the LLDS back-end)
|
|
and then rtti_to_mlds.m converts these to MLDS.
|
|
</ul>
|
|
|
|
<h4> 5b. MLDS transformations </h4>
|
|
<ul>
|
|
<li> ml_tailcall.m annotates the MLDS with information about tailcalls.
|
|
It also has a pass to implement the `--warn-non-tail-recursion' option.
|
|
<li> ml_optimize.m does MLDS->MLDS optimizations
|
|
<li> ml_elim_nested.m does two MLDS transformations that happen
|
|
to have a lot in common: (1) eliminating nested functions
|
|
and (2) adding code to handle accurate garbage collection.
|
|
</ul>
|
|
|
|
<h4> 6b. MLDS output </h4>
|
|
|
|
<p>
|
|
There are currently four backends that generate code from MLDS:
|
|
one generates C/C++ code,
|
|
one generates assembler (by interfacing with the GCC back-end),
|
|
one generates Microsoft's Intermediate Language (MSIL or IL),
|
|
and one generates Java.
|
|
|
|
<ul>
|
|
<li>mlds_to_c.m converts MLDS to C/C++ code.
|
|
</ul>
|
|
|
|
<p>
|
|
|
|
The MLDS->asm backend is logically part of the MLDS back-ends,
|
|
but it is in a module of its own (mlds_to_gcc.m), rather than being
|
|
part of the ml_backend package, so that we can distribute a version
|
|
of the Mercury compiler which does not include it. There is a wrapper
|
|
module called maybe_mlds_to_gcc.m which is generated at configuration time
|
|
so that mlds_to_gcc.m will be linked in iff the GCC back-end is available.
|
|
|
|
<p>
|
|
|
|
The MLDS->IL backend is broken into several submodules.
|
|
<ul>
|
|
<li> mlds_to_ilasm.m converts MLDS to IL assembler and writes it to a .il file.
|
|
<li> mlds_to_il.m converts MLDS to IL
|
|
<li> ilds.m contains representations of IL
|
|
<li> ilasm.m contains output routines for writing IL to assembler.
|
|
<li> il_peephole.m performs peephole optimization on IL instructions.
|
|
</ul>
|
|
After IL assembler has been emitted, ILASM is invoked to turn the .il
|
|
file into a .dll or .exe.
|
|
|
|
<p>
|
|
|
|
The MLDS->Java backend is broken into two submodules.
|
|
<ul>
|
|
<li> mlds_to_java.m converts MLDS to Java and writes it to a .java file.
|
|
<li> java_util.m contains some utility routines.
|
|
</ul>
|
|
After the Java code has been emitted, a Java compiler (normally javac)
|
|
is invoked to turn the .java file into a .class file containing Java bytecodes.
|
|
|
|
<hr>
|
|
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
|
|
|
|
<h3> c. BYTECODE BACK-END </h3>
|
|
|
|
<p>
|
|
This is the bytecode_backend.m package.
|
|
|
|
<p>
|
|
|
|
The Mercury compiler can translate Mercury programs into bytecode for
|
|
interpretation by a bytecode interpreter. The intent of this is to
|
|
achieve faster turn-around time during development. However, the
|
|
bytecode interpreter has not yet been written.
|
|
|
|
<ul>
|
|
<li> bytecode.m defines the internal representation of bytecodes, and contains
|
|
the predicates to emit them in two forms. The raw bytecode form is emitted
|
|
into &lt;filename&gt;.bytecode for interpretation, while a human-readable
|
|
form is emitted into &lt;filename&gt;.bytedebug for visual inspection.
|
|
|
|
<li> bytecode_gen.m contains the predicates that translate HLDS into bytecode.
|
|
|
|
<li> bytecode_data.m contains the predicates that translate ints, strings
|
|
and floats into bytecode.
|
|
</ul>
|
|
|
|
<hr>
|
|
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
|
|
|
|
<h3> d. ERLANG BACK-END </h3>
|
|
|
|
<p>
|
|
This is the erl_backend.m package.
|
|
|
|
<p>
|
|
|
|
The Mercury compiler can translate Mercury programs into Erlang.
|
|
The intent of this is to take advantage of the features of the
|
|
Erlang implementation (concurrency, fault tolerance, etc.)
|
|
However, the backend is still incomplete.
|
|
This back-end uses the Erlang Data Structure (elds.m) as its
|
|
intermediate representation.
|
|
|
|
<h4> 4d. ELDS code generation </h4>
|
|
<ul>
|
|
<li> erl_code_gen.m converts HLDS code to ELDS.
|
|
The following sub-modules are used to handle different constructs:
|
|
<ul>
|
|
<li> erl_unify_gen.m
|
|
<li> erl_call_gen.m
|
|
</ul>
|
|
The module erl_code_util.m provides utility routines for
|
|
ELDS code generation.
|
|
<li> erl_rtti.m converts RTTI data structures defined in rtti.m into
|
|
ELDS functions which return the same information when called.
|
|
</ul>
|
|
|
|
<h4> 6d. ELDS output </h4>
|
|
|
|
<ul>
|
|
<li>elds_to_erlang.m converts ELDS to Erlang code.
|
|
</ul>
|
|
|
|
<hr>
|
|
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
|
|
|
|
<h3> SMART RECOMPILATION </h3>
|
|
|
|
<p>
|
|
This is the recompilation.m package.
|
|
|
|
<p>
|
|
|
|
The Mercury compiler can record program dependency information
|
|
to avoid unnecessary recompilations when an imported module's
|
|
interface changes in a way which does not invalidate previously
|
|
compiled code.
|
|
|
|
<ul>
|
|
<li> recompilation.m contains types used by the other smart
|
|
recompilation modules.
|
|
|
|
<li> recompilation_version.m generates version numbers for program items
|
|
in interface files.
|
|
|
|
<li> recompilation_usage.m works out which program items were used
|
|
during a compilation.
|
|
|
|
<li> recompilation_check.m is called before recompiling a module.
|
|
It uses the information written by recompilation_version.m and
|
|
recompilation_usage.m to work out whether the recompilation is
|
|
actually needed.
|
|
</ul>
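<p>
The decision made by recompilation_check.m can be pictured as comparing the
version numbers of the items a module actually used against the versions now
found in the interface files. The Python sketch below illustrates that
comparison only; the dictionary format, the item naming and the function name
needs_recompile are assumptions, not the compiler's actual data structures.

```python
def needs_recompile(used_items, current_versions):
    """used_items: {item: version seen at the last compilation}.
    current_versions: {item: version now in the interface files}.

    Recompile only if some item that was actually used has changed
    or disappeared; changes to unused items are ignored.
    """
    for item, old_version in used_items.items():
        if current_versions.get(item) != old_version:
            return True
    return False
```

<p>
In this model, adding a new predicate to an imported module's interface does
not force a recompilation, because no previously used item changes version.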
|
|
|
|
<hr>
|
|
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
|
|
|
|
<h3> MISCELLANEOUS </h3>
|
|
|
|
|
|
The modules special_pred.m (in the hlds.m package) and unify_proc.m
|
|
(in the check_hlds.m package) contain stuff for handling the special
|
|
compiler-generated predicates which are generated for
|
|
each type: unify/2, compare/3, and index/1 (used in the
|
|
implementation of compare/3).
|
|
|
|
<p>
|
|
The following module is part of the transform_hlds.m package.
|
|
|
|
<dl>
|
|
<dt> dependency_graph.m:
|
|
<dd>
|
|
This contains predicates to compute the call graph for a
|
|
module, and to print it out to a file.
|
|
(The call graph file is used by the profiler.)
|
|
The call graph may eventually also be used by det_analysis.m,
|
|
inlining.m, and other parts of the compiler which could benefit
|
|
from traversing the predicates in a module in a bottom-up or
|
|
top-down fashion with respect to the call graph.
|
|
</dl>

<p>
The following modules are part of the backend_libs.m package.

<dl>
<dt> builtin_ops.m:
<dd>
This module defines the types unary_op and binary_op
which are used by several of the different back-ends:
bytecode.m, llds.m, and mlds.m.

<dt> c_util.m:
<dd>
This module defines utility routines useful for generating
C code. It is used by both llds_out.m and mlds_to_c.m.

<dt> name_mangle.m:
<dd>
This module defines utility routines useful for mangling
names to forms acceptable as identifiers in target languages.

<dt> compile_target_code.m:
<dd>
Invokes the C, C#, IL, Java, etc. compilers and linkers to compile
and link the generated code.

</dl>
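<p>
As an illustration of what name mangling involves, the following Python
sketch maps an arbitrary name to a string containing only ASCII letters,
digits and underscores. The escaping scheme and the "m_" prefix are
invented for this example; they are an assumption and are not the scheme
that name_mangle.m actually uses.

<pre>
```python
def mangle(name, prefix="m_"):
    """Map a name to ASCII letters, digits and '_' (hypothetical scheme).

    Letters and digits are kept, '_' is doubled, and every other
    character is written as its hexadecimal code between underscores.
    The prefix ensures the result never starts with a digit.
    """
    pieces = [prefix]
    for ch in name:
        if ch.isascii() and ch.isalnum():
            pieces.append(ch)
        elif ch == "_":
            pieces.append("__")
        else:
            pieces.append("_{:x}_".format(ord(ch)))
    return "".join(pieces)


print(mangle("list.append"))   # m_list_2e_append
print(mangle("my_pred"))       # m_my__pred
print(mangle("+"))             # m__2b_
```
</pre>

<p>
Doubling the underscore keeps an ordinary "_" distinguishable from the
underscores that delimit an escaped character code.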

<p>
The following modules are part of the libs.m package.

<dl>

<dt> process_util.m:
<dd>
Predicates to deal with process creation and signal handling.
This module is mainly used by make.m and its sub-modules.

<dt> timestamp.m:
<dd>
Contains an ADT representing timestamps used by smart
recompilation and `mmc --make'.

<dt> graph_color.m:
<dd>
Graph colouring. <br>
This is used by the LLDS back-end for register allocation.

<dt> tree.m:
<dd>
A simple tree data type. <br>
Used by the LLDS and IL back-ends for collecting
together the different fragments of the generated code.

<dt> lp.m:
<dd>
Implements the linear programming algorithm for optimizing
a set of linear constraints with respect to a linear
cost function. This is used by the termination analyser.

<dt> lp_rational.m:
<dd>
Implements the linear programming algorithm for optimizing
a set of linear constraints with respect to a linear
cost function, for rational numbers.
This is used by the termination analyser.

<dt> rat.m:
<dd>
Implements rational numbers.

<dt> compiler_util.m:
<dd>
Generic utility predicates, mainly for error handling.
</dl>
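<p>
For readers unfamiliar with graph colouring, the Python sketch below shows
one simple approach, a greedy heuristic. It is an assumption made for
illustration only: graph_color.m may well use a different algorithm, and
the function name and the example conflict graph are hypothetical. Nodes
stand for values that are live at the same time, an edge means two values
cannot share a location, and each colour stands for one location.

<pre>
```python
def greedy_colouring(conflicts):
    """Give each node the smallest colour not used by a neighbour.

    conflicts maps each node to the set of nodes it must not share a
    colour with.  Colours are small non-negative integers.
    """
    colour = {}
    # Colour the most constrained nodes first; break ties by name.
    order = sorted(conflicts, key=lambda n: (-len(conflicts[n]), n))
    for node in order:
        taken = {colour[m] for m in conflicts[node] if m in colour}
        c = 0
        while c in taken:
            c += 1
        colour[node] = c
    return colour


conflicts = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
colours = greedy_colouring(conflicts)
print(colours)
```
</pre>

<p>
The greedy heuristic always produces a valid colouring, but it does not
guarantee the minimum number of colours on every graph.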

<hr>
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->

<h3> CURRENTLY UNDOCUMENTED </h3>

<ul>
<li> mmc_analysis.m
</ul>

<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->

<h3> CURRENTLY USELESS </h3>

<dl>
<dt> atsort.m (in the libs.m package)
<dd>
Approximate topological sort.
This was once used for traversing the call graph,
but nowadays we use relation.atsort from library/relation.m.

</dl>

<hr>
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->

Last update was $Date: 2007-06-25 00:58:13 $ by $Author: wangp $@cs.mu.oz.au. <br>
</body>
</html>