mercury/compiler/notes/COMPILER_DESIGN

-----------------------------------------------------------------------------

This file contains various notes about the design of the compiler.

-----------------------------------------------------------------------------

OUTLINE

The top-level of the compiler is in the file mercury_compile.m.
The basic design is that compilation is broken into the following
stages:

	1. parsing (source files -> HLDS)
	2. semantic analysis and error checking (HLDS -> annotated HLDS)
	3. high-level transformations (annotated HLDS -> annotated HLDS)
	4. code generation (annotated HLDS -> LLDS)
	5. low-level optimizations (LLDS -> LLDS)
	6. output C code (LLDS -> C)

Note that in reality the separation is not quite as simple as that.
Although parsing is listed as step 1 and semantic analysis is listed
as step 2, the last stage of parsing actually includes some semantic checks.
And although optimization is listed as steps 3 and 5, it also occurs in
steps 2, 4, and 6.  For example, elimination of assignments to dead
variables is done in mode analysis; middle-recursion optimization and
the use of static constants for ground terms are done in code
generation; and a few low-level optimizations are done in llds_out.m
as we are spitting out the C code.
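
As a rough picture of the stage ordering listed above, the following Python
sketch checks that each stage consumes what the previous one produces. It is
only an illustration: the table and the function run_pipeline are
hypothetical, the strings merely stand in for the real data structures, and
nothing here is taken from mercury_compile.m.

```python
# Hypothetical table of (stage name, input representation, output
# representation), mirroring the six stages in the outline.
STAGES = [
    ("parsing", "source files", "HLDS"),
    ("semantic analysis", "HLDS", "annotated HLDS"),
    ("high-level transformations", "annotated HLDS", "annotated HLDS"),
    ("code generation", "annotated HLDS", "LLDS"),
    ("low-level optimizations", "LLDS", "LLDS"),
    ("output C code", "LLDS", "C"),
]


def run_pipeline(start="source files"):
    """Check that each stage's input matches the previous stage's
    output, and return the final representation."""
    current = start
    for name, consumes, produces in STAGES:
        if consumes != current:
            raise ValueError(
                f"stage {name!r} expects {consumes}, got {current}")
        current = produces
    return current
```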

-----------------------------------------------------------------------------

DETAILED DESIGN (well, more detailed than the OUTLINE anyway ;-)

The action is co-ordinated from mercury_compile.m.

0. Option handling

The command-line options are defined in the module options.m.
mercury_compile.pp calls library/getopt.m, passing the predicates
defined in options.m as arguments, to parse them.  It then invokes
handle_options.m to postprocess the option set.  The results are
stored in the io__state, using the type globals defined in globals.m.

1. Parsing

* lexical analysis (library/lexer.m)

* stage 1 parsing - convert strings to terms.

	library/parser.m contains the code to do this, while
	library/term.m and library/varset.m contain the term and varset
	data structures that result, and predicates for manipulating them.

* stage 2 parsing - convert terms to `items' (declarations, clauses, etc.)

	The result of this stage is a parse tree that has a one-to-one
	correspondence with the source code.  The parse tree data structure
	definition is in prog_data.m, while the code to create it is in
	prog_io.m.  The modules prog_out.m and mercury_to_mercury.m
	contain predicates for printing the parse tree.
	prog_util.m contains some utility predicates for manipulating
	the parse tree.

* imports and exports are handled at this point (modules.m)

	modules.m has the code to write out `.int', `.int2', `.int3',
	`.d' and `.dep' files.

* module qualification of types, insts and modes

	module_qual.m -
	Adds module qualifiers to all types, insts and modes,
	checking that a given type, inst or mode exists and that
	there is only one possible match.  This is done here because
	it must be done before the `.int' and `.int2' interface files
	are written. This also checks whether imports are really needed
	in the interface.

* reading and writing of optimization interfaces (intermod.m).

	<module>.opt contains clauses for exported preds suitable for
	inlining or higher-order specialization. The .opt file for the
	current module is written after type-checking. .opt files
	for imported modules are read here.

* expansion of equivalence types (equiv_type.m)

	This is really part of type-checking, but is done
	on the item_list rather than on the HLDS because it
	turned out to be much easier to implement that way.

* conversion to superhomogeneous form and into HLDS

	make_hlds.m transforms the code into superhomogeneous form,
	and at the same time converts the parse tree into the HLDS.
	make_hlds.m also calls make_tags.m which chooses the data
	representation for each discriminated union type by
	assigning tags to each functor.
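
	The idea behind superhomogeneous form can be sketched in Python:
	every argument of a call becomes a distinct variable, and nested
	terms are pulled out into separate unifications.  The term encoding
	(a str for a variable, a (functor, [subterms]) pair for anything
	else), the function name flatten_call and the V_n naming scheme are
	all assumptions made for this sketch; it does not reproduce the
	algorithm or the ordering used by make_hlds.m.

```python
import itertools


def flatten_call(name, args):
    """Rewrite name(args...) so that every argument is a distinct
    variable.  Returns (call, unifications)."""
    counter = itertools.count(1)
    unifications = []

    def fresh():
        return f"V_{next(counter)}"

    def name_term(term):
        # Return a variable standing for `term`, emitting
        # Var = functor(Var1, ..., VarN) unifications as needed.
        if isinstance(term, str):
            return term
        functor, subterms = term
        arg_vars = [name_term(t) for t in subterms]
        var = fresh()
        unifications.append((var, (functor, arg_vars)))
        return var

    seen = set()
    call_args = []
    for arg in args:
        var = name_term(arg)
        if var in seen:
            # A repeated variable gets a fresh alias: Alias = Var.
            alias = fresh()
            unifications.append((alias, var))
            var = alias
        seen.add(var)
        call_args.append(var)
    return (name, call_args), unifications
```

	For example, p(X, f(Y, a), X) becomes the call p(X, V_2, V_3)
	together with the unifications V_1 = a, V_2 = f(Y, V_1) and V_3 = X.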

The result at this stage is the High Level Data Structure,
which is defined in four files:

	- hlds_data.m defines the parts of the HLDS concerned with
	  function symbols, types, insts, modes and determinisms;
	- hlds_goal.m defines the part of the HLDS concerned with the
	  structure of goals, including the annotations on goals;
	- hlds_pred.m defines the part of the HLDS concerning
	  predicates and procedures;
	- hlds_module.m defines the top-level parts of the HLDS,
	  including the type module_info.

The module hlds_out.m contains predicates to dump the HLDS to a file.
The module goal_util.m contains predicates for renaming variables
in an HLDS goal.

2. Semantic analysis and error checking

* implicit quantification

	quantification.m handles implicit quantification and computes
	the set of non-local variables for each sub-goal.

* type checking

	- typecheck.m handles type checking, overloading resolution &
	  module name resolution, and almost fully qualifies all predicate
	  and functor names.  It sets the map(var, type) field in the
	  pred_info.  However, typecheck.m doesn't figure out the pred_id
	  for function calls or calls to overloaded predicates; that can't
	  be done in a single pass of typechecking, and so it is done
	  later on in modes.m.  When it has finished, typecheck.m calls
	  clause_to_proc.m to make duplicate copies of the clauses for
	  each different mode of a predicate; all later stages work on
	  procedures, not predicates.
	- type_util.m contains utility predicates dealing with types
	  that are used in a variety of different places within the compiler

* mode analysis

	- modes.m is the main mode analysis module.
	  It checks that the code is mode-correct, reordering it
	  if necessary, and annotates each goal with a delta-instmap
	  that specifies the changes in instantiatedness of each
	  variable over that goal.  It also converts higher-order
	  pred terms into lambda expressions.  Modes of lambda
	  expressions are module qualified during mode analysis.
	  It also converts function calls into predicate calls, and
	  does the final step of figuring out which pred_id to use
	  for a call to an overloaded predicate.

	  It uses the following sub-modules:
		mode_info.m (the main data structure for mode analysis)
		delay_info.m (a sub-component of the mode_info data
			structure used for storing the information
			for scheduling: which goals are currently
			delayed, what variables they are delayed on, etc.)
		instmap.m
			Defines the instmap and instmap_delta ADTs
			which store information on what instantiations
			a set of variables may be bound to.
		inst_match.m
			This contains the code for dealing with insts:
			abstractly unifying them, checking whether two
			insts match, etc.
		mode_errors.m
			This module contains all the code to
			print error messages for mode errors
	- mode_util.m contains miscellaneous useful predicates dealing
	  with modes (many of these are used by lots of later stages
	  of the compiler)

* indexing and determinism analysis

	- switch_detection.m transforms into switches those disjunctions
	  in which several disjuncts test the same variable against different
	  function symbols.
	- cse_detection.m looks for disjunctions in which each disjunct tests
	  the same variable against the same function symbols, and hoists any
	  such unifications out of the disjunction.
	  If cse_detection.m modifies the code,
	  it will re-run mode analysis and switch detection.
	- det_analysis.m annotates each goal with its determinism;
	  it inserts cuts in the form of "some" goals wherever the determinisms
	  and delta instantiations of the goals involved make it necessary.
	  Any errors found during determinism analysis are reported by
	  det_report.m.
	  det_util.m contains utility predicates used in several modules.
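
	The transformation performed by switch detection can be sketched
	as follows.  This is a simplified illustration, not the algorithm
	in switch_detection.m: goals are modelled as tuples, a test of a
	variable against a function symbol as ("test", var, functor), and
	the sketch only looks at the first goal of each disjunct and gives
	up if two disjuncts test against the same functor.  All of these
	choices, and the name detect_switch, are assumptions.

```python
def detect_switch(disjuncts):
    """Turn a disjunction into a switch when possible.

    Returns ("switch", var, {functor: rest_of_disjunct}) if every
    disjunct starts by testing the same variable against a distinct
    functor, and ("disj", disjuncts) otherwise.
    """
    cases = {}
    switch_var = None
    for goals in disjuncts:
        if not goals or goals[0][0] != "test":
            return ("disj", disjuncts)
        _, var, functor = goals[0]
        if switch_var is None:
            switch_var = var
        if var != switch_var or functor in cases:
            return ("disj", disjuncts)
        # The test itself is subsumed by the switch arm.
        cases[functor] = goals[1:]
    if switch_var is None:
        return ("disj", disjuncts)
    return ("switch", switch_var, cases)
```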

* checking of unique modes (unique_modes.m)

	unique_modes.m checks that non-backtrackable unique modes were
	not used in a context which might require backtracking.
	Note that what unique_modes.m does is quite similar to
	what modes.m does, and unique_modes calls lots of predicates
	defined in modes.m to do it.

* simplification (simplify.m)

	simplify.m finds and exploits opportunities for simplifying the
	internal form of the program, both to optimize the code and to
	massage the code into a form the code generator will accept.
	It also warns the programmer about any constructs that are so simple
	that they should not have been included in the program in the first
	place.
	simplify.m calls common.m, which looks for construction unifications
	that construct a term identical to one that already exists,
	and for repeated calls to a predicate with the same inputs, and
	replaces them with assignment unifications.
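
	The construction half of this optimization can be sketched in
	Python for a straight-line conjunction.  The goal encoding,
	("construct", var, functor, args) and ("assign", var, earlier_var),
	and the function name are assumptions of the sketch; common.m itself
	works on HLDS goals and also handles repeated calls, which are
	omitted here.

```python
def eliminate_common_constructions(goals):
    """Replace repeated constructions with assignments.

    A construction of the same functor with the same argument
    variables as an earlier one becomes an assignment from the
    variable bound by the earlier construction.
    """
    seen = {}
    result = []
    for goal in goals:
        if goal[0] == "construct":
            _, var, functor, args = goal
            key = (functor, tuple(args))
            if key in seen:
                result.append(("assign", var, seen[key]))
                continue
            seen[key] = var
        result.append(goal)
    return result
```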

3. High-level transformations

The first two passes of this stage are code simplifications.

* introduction of type_info arguments for polymorphic predicates and
  transformation of complicated unifications into predicate calls
  (polymorphism.m)

* removal of lambda expressions (lambda.m)

	lambda.m converts lambda expressions into higher-order predicate
	terms referring to freshly introduced separate predicates.
	This pass needs to come after unique_modes.m to ensure that
	the modes we give to the introduced predicates are correct.
	It also needs to come after polymorphism.m since polymorphism.m
	doesn't handle higher-order predicate constants.

To improve efficiency, the above two passes are actually combined into
one - polymorphism.m calls lambda__transform_lambda directly.

Most of the remaining HLDS-to-HLDS transformations are optimizations:

* specialization of higher-order predicates where the values of the
  higher-order arguments are known (higher_order.m)

* inlining (i.e. unfolding) of simple procedures (inlining.m)

* constraint propagation (constraint.m)

	Not yet working.

* issue warnings about unused arguments from predicates, and create
  specialized versions without them (unused_args.m); type_infos are
  often unused

* elimination of dead procedures (dead_proc_elim.m). Inlining, higher-order
  specialization and the elimination of unused args can make procedures dead
  even if the user doesn't, and automatically constructed unification and
  comparison predicates are often dead as well.

* reducing the number of variables that have to be saved across procedure calls
  (saved_vars.m). We do this by putting the code that generates the value of
  a variable just before the use of that variable, duplicating the variable
  and the code that produces it if necessary, provided the cost of doing so
  is smaller than the cost of saving and restoring the variable would be.
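
Dead procedure elimination, as described above, amounts to a reachability
computation over the call graph. A minimal Python sketch, assuming the call
graph is a dict from procedure name to the names it calls and that the
exported procedures are the roots (both assumptions; dead_proc_elim.m works
on the HLDS and its exact choice of roots is not described here):

```python
from collections import deque


def live_procedures(call_graph, exported):
    """Return the set of procedures reachable from the exported ones.

    Anything not in the returned set is dead and can be dropped.
    """
    live = set(exported)
    queue = deque(exported)
    while queue:
        proc = queue.popleft()
        for callee in call_graph.get(proc, ()):
            if callee not in live:
                live.add(callee)
                queue.append(callee)
    return live
```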

The module transform.m contains stuff that is supposed to be useful
for high-level optimizations (but which is not yet used).

Eventually we plan to make Mercury the programming language of the Aditi
deductive database system. When this happens, we will need to be able to
apply the magic set transformation, which is defined for predicates
whose definitions are in disjunctive normal form. The module dnf.m translates
definitions into DNF, introducing auxiliary predicates as necessary.
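
The following Python sketch illustrates the general idea of reaching DNF by
introducing auxiliary predicates: a disjunction nested inside a conjunction
is moved into a new predicate, which is called in its place. The goal
encoding, the function name to_dnf and the naming scheme for the auxiliary
predicates are assumptions, and variables (which a real implementation would
have to pass as arguments) are ignored. It is not a description of how dnf.m
works.

```python
def to_dnf(pred_name, body):
    """Return a dict mapping predicate names to DNF definitions.

    Goals are ("atom", name), ("conj", [goals]) or ("disj", [goals]).
    Each definition is a list of disjuncts, and each disjunct is a
    flat list of atom names.
    """
    defs = {}
    counter = [0]

    def define(name, goal):
        disjuncts = goal[1] if goal[0] == "disj" else [goal]
        defs[name] = [flatten_conj(name, d) for d in disjuncts]

    def flatten_conj(owner, goal):
        kind = goal[0]
        if kind == "atom":
            return [goal[1]]
        if kind == "conj":
            atoms = []
            for sub in goal[1]:
                atoms.extend(flatten_conj(owner, sub))
            return atoms
        # A disjunction nested inside a conjunction: move it into a
        # freshly named auxiliary predicate and call that instead.
        counter[0] += 1
        aux = f"{owner}__dnf_{counter[0]}"
        define(aux, goal)
        return [aux]

    define(pred_name, body)
    return defs
```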

4. Code generation

* pre-passes to annotate the HLDS

	Before code generation there are a few more passes which
	annotate the HLDS with information used for code generation:

		choosing registers for procedure arguments (arg_info.m)
			Currently uses one of two simple algorithms, but
			we may add other algorithms later.
		annotation of goals with liveness information (liveness.m)
			This records the birth and death of each variable
			in the HLDS goal_info.
		allocation of stack slots
			This is done by live_vars.m, which works
			out which variables need to be saved on the
			stack when, and then uses graph_colour.m to determine
			a good allocation of variables to stack slots.
		migration of builtins following branched structures
			This transformation, which is performed by
			follow_code.m, improves the results of follow_vars.
		allocating the follow vars (follow_vars.m)
			Traverses backwards over the HLDS, annotating each
			branched structure with the variable target locations
			for the following call, so we can generate
			efficient code by putting variables in the right spot.
			This module is not called from mercury_compile.m;
			it is called from store_alloc.m.
		allocating the store map (store_alloc.m)
			Allocates locations for variables at the end of
			branched goals.  Annotates the goal_info for
			each branched goal with the allocation of variable
			target locations, so that we can generate
			correct code by putting variables in the same
			spot in each branch.
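
	Stack slot allocation by graph colouring can be sketched as
	follows: two variables interfere if they must be saved at the same
	time, and interfering variables must get different slots.  The
	input format (one set of variable names per save point), the
	function name and the greedy colouring strategy are assumptions of
	this sketch; it does not claim to match what live_vars.m and
	graph_colour.m actually do.

```python
def allocate_stack_slots(live_sets):
    """Assign stack slot numbers so that no two variables that must
    be saved at the same time share a slot.

    Uses a simple greedy colouring, which is not guaranteed to use
    the minimum number of slots.
    """
    # Build the interference graph.
    interferes = {}
    for live in live_sets:
        for var in live:
            interferes.setdefault(var, set()).update(live - {var})
    # Colour the variables in a fixed (alphabetical) order.
    slots = {}
    for var in sorted(interferes):
        taken = {slots[other] for other in interferes[var]
                 if other in slots}
        slot = 1
        while slot in taken:
            slot += 1
        slots[var] = slot
    return slots
```

	Here A and C never need to be saved at the same time, so given the
	save points {A, B} and {B, C} they can share a slot.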

* code generation

	For code generation itself, the main module is code_gen.pp.
	It handles conjunctions and negations, but calls sub-modules
	to do most of the other work:

		ite_gen.m (if-then-elses)
		call_gen.m (predicate calls and also calls to
			out-of-line unification procedures)
		disj_gen.m (disjunctions)
		unify_gen.m (unifications)
		switch_gen.m (switches), which has sub-modules
			dense_switch.m
			lookup_switch.m
			string_switch.m
			tag_switch.m
		pragma_c_gen.m (embedded C code)

	It also calls middle_rec.m to do middle recursion optimization.

	The code generation modules make use of
		code_info.m
			The main data structure for the code generator
		code_exprn.m
			This defines the exprn_info type, which is
			a sub-component of the code_info data structure
			which holds the information about
			the contents of registers and
			the values/locations of variables.
		exprn_aux.m
			Various preds which use exprn_info
		code_util.m
			Some miscellaneous preds used for code generation
		code_aux.m
			Some miscellaneous preds which, unlike those in
			code_util, use code_info

The result of code generation is the Low Level Data Structure (llds.m).
The code is generated as a tree of code fragments which is then
flattened (tree.m).
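
Building code as a tree makes it cheap to join fragments during generation;
the instruction list is only materialized once, at the end. A minimal Python
sketch of the flattening step, assuming a tree is either None (empty), a
list of instructions (a leaf), or a pair of sub-trees; this encoding and the
instruction strings in the example are hypothetical and are not the
representation used in tree.m:

```python
def flatten(tree):
    """Flatten a code tree into a list of instructions, preserving
    left-to-right order.  Uses an explicit stack so that deeply
    nested trees do not exhaust the recursion limit."""
    result = []
    stack = [tree]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        if isinstance(node, tuple):
            left, right = node
            # Push right first so that left is processed first.
            stack.append(right)
            stack.append(left)
        else:
            result.extend(node)
    return result
```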

5. Low-level optimization

The various LLDS-to-LLDS optimizations are invoked from optimize.m.
They are:

* optimization of jumps to jumps (jumpopt.m)

* elimination of duplicate code sequences (dupelim.m)

* optimization of stack frame allocation/deallocation (frameopt.m)

* dead code and dead label removal (labelopt.m)

* value numbering

	This is done by value_number.m, which has the following sub-modules:

	vn_block.m
		Traverse an extended basic block, building up tables showing
		the actions that must be taken, and the current and desired
		contents of locations.
	vn_cost.m
		Computes the cost of instruction sequences.
		Value numbering should never replace an instruction
		sequence with a more expensive sequence. Unfortunately,
		computing costs accurately is very difficult.
	vn_debug.m
		Predicates to dump data structures used in value
		numbering.
	vn_filter.m
		Module to eliminate useless temporaries introduced by
		value numbering. Not generating them in the first place
		would be better, but would be quite difficult.
	vn_flush.m
		Given the tables built up by vn_block and a list of nodes
		computed by vn_order, generate code to assign the required
		values to each temporary and live location in what is
		hopefully the fastest and most compact way.
	vn_order.m
		Given tables built up by vn_block showing the actions that
		must be taken, and the current and desired contents of
		locations, find out which shared subexpressions should
		have temporaries allocated to them and in what order these
		temporaries and the live locations should be assigned to.
		This module uses the module atsort.m to perform an approximate
		topological sort on the nodes of the location dependency
		graph it operates on (since the graph may have cycles,
		a precise topological sort may not exist).
	vn_table.m
		Abstract data type showing the current and desired
		contents of locations.
	vn_temploc.m
		Abstract data type to keep track of the availability
		of registers and temporaries.
	vn_type.m
		This module defines the types used by the other
		modules of the value numbering optimization.
	vn_util.m
		Utility predicates.

	Several of these modules (and also frameopt, above) use livemap.m,
	which finds the set of locations live at each label.

* peephole optimization (peephole.m)

Depending on which optimization flags are enabled,
optimize.m may invoke many of these passes multiple times.
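
As an illustration of the simplest of these passes, optimization of jumps to
jumps, the Python sketch below retargets a goto whose target label is
immediately followed by another goto. The instruction encoding and the
function name are assumptions of the sketch, and jumpopt.m performs other
transformations that are not shown here.

```python
def shortcut_jumps(instrs):
    """Retarget every goto whose target label is immediately followed
    by another goto.

    Instructions are ("label", name), ("goto", name) or any other
    tuple, which is left untouched.
    """
    # Map each label to the target of the goto right after it, if any.
    forwards = {}
    for instr, nxt in zip(instrs, instrs[1:]):
        if instr[0] == "label" and nxt[0] == "goto":
            forwards[instr[1]] = nxt[1]

    def final_target(label):
        # Follow the chain of forwarding labels; `seen` guards
        # against cycles such as L1: goto L2; L2: goto L1.
        seen = set()
        while label in forwards and label not in seen:
            seen.add(label)
            label = forwards[label]
        return label

    return [("goto", final_target(i[1])) if i[0] == "goto" else i
            for i in instrs]
```

Labels that are no longer referenced after this step are the kind of thing a
later dead label removal pass can clean up.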

Some of the low-level optimization passes use opt_util.m, which
contains miscellaneous predicates for LLDS-to-LLDS optimization.
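
The approximate topological sort mentioned under vn_order.m can be pictured
with the Python sketch below. It behaves like an ordinary topological sort
while some node has no unprocessed predecessors; when only cycles remain, it
picks the node with the fewest unprocessed predecessors. That fallback rule,
the graph encoding and the function name are assumptions made for the
sketch, not a description of atsort.m.

```python
def approx_topsort(graph):
    """Order nodes so that edges point forwards wherever possible.

    graph maps each node to the set of nodes that should come after
    it; every node must appear as a key.
    """
    preds = {node: 0 for node in graph}
    for succs in graph.values():
        for succ in succs:
            preds[succ] += 1
    order = []
    remaining = set(graph)
    while remaining:
        # min() picks a node with no unprocessed predecessors whenever
        # one exists; sorting first makes ties deterministic.
        node = min(sorted(remaining), key=lambda n: preds[n])
        order.append(node)
        remaining.remove(node)
        for succ in graph[node]:
            if succ in remaining:
                preds[succ] -= 1
    return order
```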

6. Output C code

* base_type_info.m generates the base_type_info structures that list the
  unification, index and compare predicates associated with each declared
  type constructor. These are added to the LLDS.

* base_type_layout.m generates the base_type_layout structures that give
  information on how to interpret values of a given type. The base_type_layout
  structure of each declared type constructor is added to the LLDS.

* llds_common.m extracts static terms from the main body of the LLDS, and
  puts them at the front. If a static term originally appeared several times,
  it will now appear as a single static term with multiple references to it.

* Final generation of C code is done in llds_out.m.
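
The sharing of static terms described for llds_common.m can be sketched in
Python as below. Static terms are modelled as hashable values in order of
appearance, and the function name is hypothetical; the sketch only shows the
deduplication idea, not how the LLDS represents static terms or references.

```python
def share_static_terms(terms):
    """Give each distinct static term a single definition.

    Returns (definitions, references), where references[i] is the
    index into definitions of the term that terms[i] now refers to.
    """
    index_of = {}
    definitions = []
    references = []
    for term in terms:
        if term not in index_of:
            index_of[term] = len(definitions)
            definitions.append(term)
        references.append(index_of[term])
    return definitions, references
```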

-----------------------------------------------------------------------------

MISCELLANEOUS

	special_pred.m, unify_proc.m:
		These modules contain stuff for handling the special
		compiler-generated predicates which are generated for
		each type: unify/2, compare/3, index/1 (used in the
		implementation of compare/3), and also type_to_term/2
		and term_to_type/2 (but those last two are disabled
		at the moment).

	dependency_graph.m:
		This contains predicates to compute the call graph for a
		module, and to print it out to a file.
		(The call graph file is used by the profiler.)
		The call graph may eventually also be used by det_analysis.m,
		inlining.m, and other parts of the compiler which could benefit
		from traversing the predicates in a module in a bottom-up or
		top-down fashion with respect to the call graph.

	passes_aux.m:
		Contains code to write progress messages, and higher-order
		code to traverse all the predicates defined in the current
		module and do something with each one.

	opt_debug.m:
		Utility routines for debugging the LLDS-to-LLDS optimizations.

-----------------------------------------------------------------------------

CURRENTLY USELESS

The following modules do not serve any function at the moment.
Some of them are obsolete; others are work-in-progress.
(For some of them it's hard to say which!)

	mercury_to_goedel.m:
		This converts from item_list to Goedel source code.
		It works for simple programs, but doesn't handle
		various Mercury constructs such as lambda expressions,
		higher-order predicates, and functor overloading.

	mercury_to_c.m:
		The very incomplete beginnings of an alternate
		code generator.  When finished, it will convert HLDS
		to high-level C code (without going via LLDS).

	shapes.m, garbage_out.m:
		These two modules generate information for the
		native garbage collector.

-----------------------------------------------------------------------------