mercury/compiler/mlds.m
Zoltan Somogyi 46a67b0b48 When the typechecker finds highly ambiguous overloading, print what symbols
Estimated hours taken: 16
Branches: main

When the typechecker finds highly ambiguous overloading, print what symbols
were overloaded, and where they occurred. Without this information, it is
very hard to fix the error if the predicate body is at all large.

Fix some software engineering problems encountered during this process.
Modify some predicates in error_util in order to simplify their typical usage.
Change the type_ctor type to be not simply a sym_name - int pair but a type
with its own identifying type constructor. Change several other types that
were also sym_name - int pairs (mode_id, inst_id, item_name, module_qual.id
and the related simple_call_id) to have their own function symbols too.

compiler/typecheck_info.m:
	Add a field to the typecheck_info structure that records the overloaded
	symbols encountered.

compiler/typecheck.m:
	When processing ambiguous predicate and function symbols, record this
	fact in the typecheck_info.

	Add a field to the cons_type_info structure to make this possible.

compiler/typecheck_errors.m:
	When printing the message about highly ambiguous overloading,
	also print what the overloaded symbols were and where they occurred.

compiler/error_util.m:
	Make error_msg_specs usable with plain in and out modes by separating
	out the capability requiring special modes (storing a higher order
	value in a function symbol) into its own, rarely used type.

	Make component_list_to_line_pieces a bit more flexible.

compiler/prog_data.m:
compiler/module_qual.m:
compiler/recompilation.m:
	Change the types listed above from being equivalence types (pairs)
	to being proper discriminated union types.

compiler/*.m:
	Conform to the changes above.

	In some cases, simplify the code's use of error_util.

tests/warnings/ambiguous_overloading.{m,exp}:
	Greatly extend this test case to test the new functionality.

tests/recompilation/*.err_exp.2:
	Reflect the fact that the expected messages now use the standard
	error_util way of quoting sym_name/arity pairs.
2006-04-20 05:37:13 +00:00

%-----------------------------------------------------------------------------%
% vim: ft=mercury ts=4 sw=4 et
%-----------------------------------------------------------------------------%
% Copyright (C) 1999-2006 The University of Melbourne.
% This file may only be copied under the terms of the GNU General
% Public License - see the file COPYING in the Mercury distribution.
%-----------------------------------------------------------------------------%
% MLDS - The Medium-Level Data Structure.
% Main author: fjh.
% This module defines the MLDS data structure itself.
% The MLDS is an intermediate data structure used in compilation;
% we compile Mercury source -> parse tree -> HLDS -> MLDS -> target (e.g. C).
% See notes/compiler_design.html for more information about the MLDS & LLDS.
%
% The MLDS is intended to be suitable for generating target code in
% languages such as Java, Java bytecode, high-level C, C++, or C--, etc.
% This is as opposed to the LLDS, which is better suited for generating
% assembler or the very low-level C or GNU C (lower level than C--) that
% the original Mercury compiler backend generates. In the LLDS, we are
% doing all our own register allocation and stack manipulation, but with
% MLDS, those tasks are the responsibility of the target.
% The one really important simplification that we do make relative to the
% HLDS is that MLDS does not have any support for non-determinism, so the
% HLDS->MLDS compiler must compile non-deterministic code into code that
% uses continuation passing.
% The MLDS data structure is quite full-featured, including for example
% support for arbitrary nesting, multiple return values, and tagged pointers.
% However, many of the intended target languages don't support all of those
% features.
% Therefore the HLDS->MLDS compiler must ensure that the final MLDS code that
% it eventually generates does not use features which the target does not
% support. This will (presumably) be accomplished by having handle_options.m
% set various flags according to the selected target, and having the
% HLDS->MLDS compiler take account of those flags and either generate simpler
% MLDS code in the first place or run some extra simplification passes over
% the MLDS code before invoking the MLDS->target compiler.
%
%-----------------------------------------------------------------------------%
%
% Mapping Mercury names to MLDS names
%
% 1. Modules
%
% Mercury module names map directly to MLDS package names, except that
% modules in the Mercury standard library get a `mercury' prefix,
% e.g. `mercury.builtin', `mercury.io', `mercury.univ', etc.
% [Rationale: omitting the `mercury' prefix would lead to namespace
% pollution in the generated target language code.]
%
% 2. Procedure names
%
% HLDS procedure names map directly to MLDS function names.
% MLDS function names are structured terms that include sufficient
% information (arity, pred_or_func indicator, mode number, etc.)
% to avoid any ambiguities.
%
% [Rationale: the reason for keeping structured names rather than flattened
% names at the MLDS level is that we don't want to do name mangling at
% the HLDS -> MLDS stage, since we don't know what restrictions the target
% language will impose. For example, some target languages (e.g. C++, Java)
% will support overloading, while others (e.g. C) will not.]
%
% 3. Procedure signatures
%
% MLDS function signatures are determined by the HLDS procedure's
% argument types, modes, and determinism.
% Procedure arguments with type `io.state' or `store(_)' are not passed.
% Procedure arguments with top_unused modes are not passed.
% Procedure arguments with top_in modes are passed as input.
% Procedure arguments with top_out modes are normally passed by reference.
% However, several alternative approaches are also supported (see below).
%
% Procedures with determinism model_det need no special handling.
% Procedures with determinism model_semi must return a boolean.
% Procedures with determinism model_non get passed a continuation;
% if the procedure succeeds, it must call the continuation, and if
% it fails, it must return.
%
% With the `--copy-out' option, arguments with top_out modes will be returned
% by value. This requires the target language to support multiple return
% values. The MLDS->target code generator can of course handle that by mapping
% functions with multiple return values into functions that return a struct,
% array, or tuple.
% With the `--nondet-copy-out' option, arguments for nondet procedures with
% top_out modes will be passed as arguments to the continuation.
%
% [Rationale: this mapping is designed to be as simple as possible,
% and to map as naturally to the target language as possible.
% The complication with the `--copy-out' and `--nondet-copy-out'
% options is needed to cater to different target languages.
% Some target languages (e.g. C) fully support pass-by-reference,
% and then it is most consistent to use it everywhere.
% Some (e.g. IL) support pass-by-reference, but don't allow references
% to be stored in the environment structs needed for continuations,
% and so can't use pass-by-reference for nondet procedures.
% Others (e.g. Java) don't support pass-by-reference at all.]
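%
% As an illustrative sketch of the rules above (the C-like notation and
% all names below are hypothetical, not actual compiler output), consider:
%
%   :- pred p(int::in, int::out) is det.
%   :- pred q(int::in, int::out) is semidet.
%   :- pred r(int::in, int::out) is nondet.
%
% By default these might map to signatures along the lines of
%
%   void p(int a, int *b);
%   bool q(int a, int *b);          /* result says whether q succeeded */
%   void r(int a, int *b, cont k);  /* calls k once per solution */
%
% With `--copy-out', p would instead return b by value, and q would
% return both the success indicator and b. With `--nondet-copy-out',
% r would pass b as an argument to k rather than storing it through
% a pointer.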
%
% 4. Variables
%
% MLDS variable names are determined by the HLDS variable name and
% (in some cases, to avoid ambiguity) variable number. The MLDS
% variable name is a structured term that keeps the original variable
% name separate from the distinguishing variable number.
% It is up to each individual backend to mangle the variable name
% and number to avoid ambiguity where necessary.
% All references to MLDS variables must however be fully qualified
% with the name of the enclosing entity that defines them.
%
% [Rationale: the reason for keeping structured names rather than
% mangled names at the MLDS level is that in some cases the mangling is
% undesirable, as the original HLDS variable names are what is required
% (for instance, when interfacing with foreign code which includes
% references to the original HLDS variable names).]
%
% 5. Global data
%
% MLDS names for global data are structured; they hold some
% information about the kind of global data (see the mlds_data_name type).
%
% It's not clear whether this is actually useful.
% And we're not completely consistent about applying this approach.
% Many of the passes which create global data items do not actually
% generate structured names, but instead just generate var/1 names
% where the variable name already contains some mangling to ensure uniqueness.
% Examples of this include "string_table" and "next_slots_table" (for
% string switches), "float_*" (boxed floating point constants),
% and "obj_*" (reserved objects, for representing constants in
% discriminated union types).
%
% 6. Types
%
% If there is an MLDS type corresponding to a Mercury type, then
% the Mercury type name maps directly to the MLDS type name,
% suitably module-qualified of course.
% The MLDS type name includes the type arity (arity overloading is allowed).
% However, if a low-level data representation scheme is used,
% then some Mercury types may not have corresponding MLDS type names
% (that is, the corresponding MLDS type may be just `MR_Word' or its
% equivalent).
%
% 7. Data constructors.
%
% With --high-level-data, Mercury data constructors normally get mapped to
% MLDS types nested within the MLDS type for the Mercury data type to which
% the constructors belong. There are some exceptions; see ml_type_gen.m
% for full details. The MLDS type name includes the constructor arity.
% [Rationale: Mercury allows data constructors to be overloaded on
% their result type within a single module, and also allows them
% to be overloaded on arity within a single type. Nesting resolves
% the first problem, and keeping the arity in the MLDS type name
% resolves the second.]
% XXX There is a problem in the Java back-end with Mercury types for which
% the constructor name is the same as the type name.
%
% XXX Also need to think about equivalence types and no_tag types.
%
% XXX There are problems with the RTTI definitions for the Java back-end.
% Under the current system, the definitions are output as static variables
% with static initializers, ordered so that subdefinitions always appear before
% the definition which uses them. This is necessary because in Java, static
% initializers are performed at runtime in textual order, and if a definition
% relies on another static variable for its constructor but said variable has
% not been initialized, then it is treated as `null' by the JVM with no
% warning.
% The problem with this approach is that it won't work for cyclic definitions.
% For example:
%
%   :- type foo ---> f(bar) ; g.
%   :- type bar ---> f2(foo) ; g2.
%
% At some point this should be changed so that initialization is performed in
% two phases: first allocate all of the objects, then fill in the fields.
%
% 8. Insts and modes
%
% Inst and mode definitions do not get translated into MLDS.
%
% [Inst and mode definitions do however affect the signatures of the
% MLDS functions used to implement each procedure.]
%
% 9. Type classes.
%
% Currently type classes are handled early and at a fairly low level.
% It's not yet clear how easy it will be to convert this to MLDS.
% It may depend to some degree on the target language.
% But if it is possible, then when it is done the target language
% generator will be able to ignore the issue of type classes.
%
% For language interoperability, however, it might be nice if the
% translation were done at a higher level.
%
% Mercury type classes should map directly to MLDS interfaces.
%
% Mercury instance definitions should map to classes which implement the
% corresponding interface. Note that if there is an instance declaration
% `:- instance foo(bar)', then the MLDS type for `bar' will *not* implement
% the MLDS interface for `foo' -- instead, there will be a new MLDS type
% for `instance foo(bar)' which implements that interface.
%
%-----------------------------------------------------------------------------%
%
% Mapping MLDS names to the target language
%
% Ultimately the exact choice of how MLDS names are mapped to
% the target language is up to the target language generator.
% So the remarks here are just guidelines, not strict rules.
%
% 1. Names.
%
% If the target language has standard conventions about certain categories
% of names beginning with uppercase or lowercase letters, then the target
% language generator should generate names that respect those conventions.
%
% An MLDS name may contain arbitrary characters.
% If the target language has restrictions on what names can be used
% in identifiers, then it is the responsibility of the target language
% generator to mangle MLDS names accordingly to ensure that they abide
% by those restrictions.
%
% 2. Packages.
%
% MLDS packages should be mapped directly to the corresponding notion
% in the target language, if possible.
% If the target doesn't have a notion of packages, then they should be
% mapped to names of the form "foo.bar.baz" or if dots are not allowed
% then to "foo__bar__baz".
%
% 3. Overloading.
%
% If the target does not support overloading, then any Mercury names which
% are overloaded within a single Mercury module must be qualified to avoid
% ambiguity. However, Mercury names which are not overloaded should not
% be qualified. If a name is overloaded but occurs only once in the module's
% interface then the version in the interface should not be qualified.
% The arity-zero version of a type should not be arity-qualified
% unless this would cause ambiguity with an unqualified name generated by
% the previous rule. Likewise, the arity-zero version of a function
% (i.e. a constant) should not be function-qualified or arity-qualified
% unless this would cause ambiguity with an unqualified name generated
% by the aforementioned rule.
% The first mode of a predicate should not be mode-qualified.
%
% [Rationale: name mangling should be avoided where possible, because
% this makes it easier for the user to interoperate with other languages
% and to use tools which do not properly demangle Mercury names.]
%
% 4. Procedures.
%
% If a procedure name needs to be qualified, the qualification should be
% done by appending "_f" for functions or "_p" for predicates,
% optionally followed by the arity,
% optionally followed by an underscore and then the mode number,
% optionally followed by "_i" and then the MLDS function sequence number
% (for internal MLDS functions, used e.g. to implement backtracking).
%
% [Rationale: any qualifiers should go at the end of a name, so that
% command-line completion works even for mangled names (and hopefully
% even shows all the possible alternatives...).]
%
% To avoid ambiguity as to whether a name is qualified or not,
% any procedures whose unqualified name matches the pattern for qualified
% names, i.e. the regular expression `.*_[fp][0-9]*(_[0-9]+)?(_i[0-9]+)?',
% should always be qualified (even if not overloaded).
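%
% As a worked illustration of this scheme (all names are hypothetical):
%
%   foo_p           predicate `foo', qualified by pred-or-func only
%   foo_f2          function `foo' with arity 2
%   foo_p3_1        predicate `foo' with arity 3, mode number 1
%   foo_p3_1_i2     internal MLDS function with sequence number 2
%                   generated for that procedure
%
% A predicate whose unqualified name is `parse_p' already matches the
% pattern above, so it would always be emitted in qualified form,
% e.g. as `parse_p_p'.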
%
% 5. Types.
%
% If a type name needs to be qualified, the qualification should be
% done by appending an underscore followed by the arity.
%
% To avoid ambiguity as to whether a name is qualified or not,
% any types whose unqualified name matches the pattern for qualified
% names, i.e. the regular expression `.*_[0-9]+',
% should always be qualified (even if not overloaded).
%
%-----------------------------------------------------------------------------%
%
% Notes on garbage collection and liveness.
%
% "Liveness-accurate GC" is GC in which the collector does not trace local
% variables which are definitely not live according to a straight-forward
% static analysis of definite-deadness/possible-liveness. Liveness-accurate
% GC is desirable, in general, since tracing through variables which are
% no longer live may lead to excessive memory retention for some programs.
% However these programs are relatively rare, so liveness-accurate GC
% is not always worth the extra complication.
%
% The MLDS is therefore designed to optionally support liveness-accurate
% GC, if the target language supports it. If liveness-accurate GC is
% supported and enabled, then it is the responsibility of the target
% language implementation to do whatever is needed to avoid having the GC
% trace variables which have gone out of scope.
%
% That means that to support liveness-accurate GC, the HLDS->MLDS code
% generator just needs to cover the cases where a straight-forward liveness
% calculation on the generated MLDS does not match up with the desired
% result of a straight-forward liveness calculation on the HLDS. That is,
% the HLDS->MLDS code generator must generate code to clobber variables
% which are no longer live according to a straight-forward liveness
% calculation on the HLDS but which have not gone out of scope in
% the generated MLDS. For example, with our current HLDS->MLDS code
% generation scheme, this is the case for variables in the `else' of a
% nondet if-then-else once the `then' has been entered.
% (XXX Currently ml_code_gen.m does _not_ clobber those variables, though.)
%
% The rationale for leaving most of the responsibility for liveness-accurate
% GC up to the MLDS back-end is as follows: at the very least we need the
% target language implementation's _cooperation_, since if the MLDS code
% generator inserts statements to clobber variables that are no longer live,
% then an uncooperative target language implementation could just optimize
% them away, since they are assignments to dead variables. Given this need
% for the MLDS target back-end's cooperation, it makes sense to assign as
% much of the responsibility for this task as is possible to the MLDS target
% back-end, to keep the front-end simple and to keep the responsibility
% for this task in one place.
%
% But in cases such as nondet if-then-else, where HLDS-liveness does not
% match MLDS-liveness, we can't just leave it to the MLDS target back-end,
% because that would require assuming an unreasonably smart liveness
% calculation in the MLDS target back-end, so in such cases we do need
% to handle it in the HLDS->MLDS code generator.
%-----------------------------------------------------------------------------%
:- module ml_backend.mlds.
:- interface.
:- import_module backend_libs.builtin_ops.
:- import_module backend_libs.foreign.
:- import_module backend_libs.rtti.
:- import_module hlds.code_model.
:- import_module hlds.hlds_module.
:- import_module hlds.hlds_pred.
:- import_module libs.globals.
:- import_module mdbcomp.prim_data.
:- import_module parse_tree.prog_data.
:- import_module parse_tree.prog_type.
:- import_module parse_tree.prog_foreign.
:- import_module bool.
:- import_module list.
:- import_module map.
:- import_module maybe.
:- import_module pair.
%-----------------------------------------------------------------------------%
:- type mercury_module_name == prim_data.module_name.
%
% The type `mlds' is the actual MLDS.
% XXX we ought to make this type abstract
%
:- type mlds
    --->    mlds(
                % The original Mercury module name.
                name                :: mercury_module_name,

                % Code defined in some other language, e.g. for
                % `pragma c_header_code', etc.
                foreign_code        :: map(foreign_language,
                                        mlds_foreign_code),

                % The MLDS code itself.

                % Packages/classes to import
                toplevel_imports    :: mlds_imports,

                % Definitions of code and data
                defns               :: mlds_defns,

                % The names of init and final preds.
                % XXX These only work for the C backend, because
                % pragma export doesn't work for the other backends.
                init_preds          :: list(string),
                final_preds         :: list(string)
            ).
:- func mlds_get_module_name(mlds) = mercury_module_name.
:- type mlds_imports == list(mlds_import).
% For the C backend, we generate a `.mh' file containing the
% prototypes for C functions generated for `:- pragma export'
% declarations, and a `.mih' file containing the prototypes
% for the C functions generated by the compiler which should
% not be visible to the user.
%
% For languages which don't support multiple interfaces, it's
% probably OK to put everything in the compiler-visible interface.
:- type mercury_mlds_import_type
---> user_visible_interface
; compiler_visible_interface.
% An mlds_import specifies FIXME
% Currently an import just gives the name of the package to be imported.
% This might perhaps need to be expanded to cater to different kinds of
% imports, e.g. imports with wild-cards ("import java.lang.*").
:- type mlds_import
---> mercury_import(
mercury_mlds_import_type :: mercury_mlds_import_type,
import_name :: mlds_module_name
)
; foreign_import(
foreign_import_name :: foreign_import_name
).
:- type foreign_import_name
---> il_assembly_name(mlds_module_name).
% An mlds_module_name specifies the name of an mlds package or class.
:- type mlds_module_name.
% An mlds_package_name specifies the name of an mlds package.
:- type mlds_package_name == mlds_module_name.
% Given the name of a Mercury module, return the name of the corresponding
% MLDS package in which this module is defined.
%
:- func mercury_module_name_to_mlds(mercury_module_name) = mlds_module_name.
% Given the name of a Mercury module, and a package name, return the
% name of the corresponding MLDS module name in which this module is
% defined.
% In this case, the package is specified as the first parameter (cf.
% mercury_module_name_to_mlds above).
%
:- func mercury_module_and_package_name_to_mlds(mercury_module_name,
mercury_module_name) = mlds_module_name.
% Given the name of a Mercury module return the fully qualified module
% name, e.g. for the name System.Object which is defined in the source
% package mscorlib, it will return System.Object.
%
:- func mlds_module_name_to_sym_name(mlds_package_name) = sym_name.
% Given the name of a Mercury module, return the name of the corresponding
% MLDS package.
%
:- func mlds_module_name_to_package_name(mlds_module_name) = sym_name.
% Is the current module a member of the std library,
% and if so which module is it?
%
:- pred is_std_lib_module(mlds_module_name::in, string::out) is semidet.
% Given an MLDS module name (e.g. `foo.bar'), append another class
% qualifier (e.g. for a class `baz'), and return the result (e.g.
% `foo.bar.baz'). The `arity' argument specifies the arity of the class.
% The qual_kind argument specifies the qualifier kind of the module_name
% argument.
%
:- func mlds_append_class_qualifier(mlds_module_name, mlds_qual_kind,
globals, mlds_class_name, arity) = mlds_module_name.
% Append a wrapper class qualifier to the module name and leave the
% package name unchanged.
%
:- func mlds_append_wrapper_class(mlds_module_name) = mlds_module_name.
% Append an arbitrary qualifier to the module name and leave the package
% name unchanged.
%
:- func mlds_append_name(mlds_module_name, string) = mlds_module_name.
% When targeting languages such as IL, C#, and Java, which don't support
% global methods or global variables, we need to wrap all the generated
% global functions and global data inside a wrapper class. This function
% returns the name to use for the wrapper class.
%
:- func wrapper_class_name = string.
:- type mlds_defns == list(mlds_defn).
:- type mlds_defn
    --->    mlds_defn(
                % The name of the entity being declared.
                mlds_entity_name    :: mlds_entity_name,

                % The source location.
                mlds_context        :: mlds_context,

                % these contain the following:
                %   access,             % public/private/protected
                %   member_type,        % static/per_instance
                %   virtuality,         % virtual/non_virtual
                %   finality,           % final/overridable (funcs only)
                %   constness,          % const/modifiable  (data only)
                %   is_abstract,        % abstract/concrete
                % etc.
                mlds_decl_flags     :: mlds_decl_flags,

                mlds_entity_defn    :: mlds_entity_defn
            ).
% An mlds name may contain arbitrary characters.
% If the target language has restrictions on what names can be used
% in identifiers, then it is the responsibility of the target language
% generator to mangle these names accordingly.
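%
% For example (an illustrative sketch only; the module and type names
% are hypothetical, and sym_name's `unqualified/1' constructor is assumed),
% a type `tree' of arity 1 defined in the Mercury module `bar' could be
% referred to by the term
%
%   qual(mercury_module_name_to_mlds(unqualified("bar")), module_qual,
%       type("tree", 1))
%
% which has type mlds_qualified_entity_name.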
:- type mlds_qualified_entity_name
== mlds_fully_qualified_name(mlds_entity_name).
:- type mlds_fully_qualified_name(T)
---> qual(mlds_module_name, mlds_qual_kind, T).
% For the Java back-end, we need to distinguish between module qualifiers
% and type qualifiers, because type names get the case of their initial
% letter inverted (i.e. lowercase => uppercase).
:- type mlds_qual_kind
---> module_qual
; type_qual.
:- type mlds_entity_name
    --->    type(mlds_class_name, arity)    % Name, arity.
    ;       data(mlds_data_name)
    ;       function(
                % Identifies the source code predicate or function.
                mlds_pred_label,

                proc_id,

                % A sequence number used to distinguish different MLDS
                % functions when compiling a single HLDS predicate into
                % multiple MLDS functions (e.g. to handle backtracking).
                maybe(mlds_func_sequence_num),

                % Specifies the HLDS pred_id.
                % This should generally not be needed much, since all the
                % necessary information should be in the mlds_pred_label
                % and/or in the mlds_entity_defn. However, the target
                % generator may want to refer to the HLDS for additional
                % information.
                pred_id
            )
    ;       export(
                % A pragma export name.
                string
            ).
:- type mlds_func_sequence_num == int.
% This specifies information about some entity being defined.
% The entity may be any of the following:
%   constant or variable
%   function
%   class, including
%       package (class with only static (one_copy) members)
%       interface (abstract class, no data members)
%       struct (value class)
%       enum
:- type mlds_entity_defn
    --->    mlds_data(
                % Represents a constant or variable.

                mlds_type,
                mlds_initializer,

                % If accurate GC is enabled, we associate with each variable
                % the code needed to trace that variable when doing GC.
                mlds_maybe_gc_trace_code
            )
    ;       mlds_function(
                % Represents functions.

                maybe(pred_proc_id),    % Identifies the original
                                        % Mercury procedure, if any.
                mlds_func_params,       % The arguments & return types.
                mlds_function_body,     % The function body.
                list(mlds_attribute)    % Attributes.
            )
    ;       mlds_class(
                % Represents packages, classes, interfaces, structs, enums.

                mlds_class_defn
            ).
% If accurate GC is enabled, we associate with each variable
% (including function parameters) the code needed to trace that
% variable when doing GC.
% `no' here indicates that no GC tracing code is needed,
% e.g. because accurate GC isn't enabled, or because the
% variable can never contain pointers to objects on the heap.
:- type mlds_maybe_gc_trace_code == maybe(statement).
% It is possible for the function to be defined externally
% (i.e. the original Mercury procedure was declared `:- external').
% (If you want to generate an abstract body consider adding another
% alternative here).
:- type mlds_function_body
---> defined_here(statement)
; external.
% Note that `one_copy' variables *must* have an initializer
% (the GCC back-end relies on this).
% XXX Currently we only record the type for structs.
% We should do the same for objects and arrays.
:- type mlds_initializer
---> init_obj(mlds_rval)
; init_struct(mlds_type, list(mlds_initializer))
; init_array(list(mlds_initializer))
; no_initializer.
:- type mlds_func_params
---> mlds_func_params(
mlds_arguments, % names and types of arguments (inputs)
mlds_return_types % types of return values (outputs)
).
:- type mlds_arguments == list(mlds_argument).
:- type mlds_argument
---> mlds_argument(
mlds_entity_name, % argument name
mlds_type, % argument type
mlds_maybe_gc_trace_code % GC tracing code for this argument,
% if needed
).
:- type mlds_arg_types == list(mlds_type).
:- type mlds_return_types == list(mlds_type).
:- func mlds_get_arg_types(mlds_arguments) = list(mlds_type).
% An mlds_func_signature is like an mlds_func_params
% except that it only includes the function's type, not
% the parameter names.
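%
% A sketch of the intended relationship (illustrative only; the variable
% names are hypothetical, and this states an expected property rather
% than quoting the implementation):
%
%   Params = mlds_func_params(Args, RetTypes),
%   mlds_get_func_signature(Params) =
%       mlds_func_signature(mlds_get_arg_types(Args), RetTypes)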
:- type mlds_func_signature
---> mlds_func_signature(
mlds_arg_types, % argument types
mlds_return_types % return types
).
:- func mlds_get_func_signature(mlds_func_params) = mlds_func_signature.
:- type mlds_class_kind
    --->    mlds_class      % A generic class: can inherit other classes and
                            % interfaces (but most targets will only support
                            % single inheritance, so usually there will be
                            % at most one class).
    ;       mlds_package    % A class with only static members (can only
                            % inherit other packages). Unlike other kinds
                            % of classes, packages should not be used as types.
    ;       mlds_interface  % A class with no variable data members (can only
                            % inherit other interfaces).
    ;       mlds_struct     % A value class (can only inherit other structs).
    ;       mlds_enum.      % A class with one integer member and a bunch
                            % of static consts (cannot inherit anything).
:- type mlds_class_name == string.
:- type mlds_class == mlds_fully_qualified_name(mlds_class_name).
% Note that standard C doesn't support empty structs,
% so when targeting C, it is the MLDS code generator's
% responsibility to ensure that each generated MLDS class
% has at least one base class or non-static data member.
:- type mlds_class_defn
    --->    mlds_class_defn(
                kind        :: mlds_class_kind,
                imports     :: mlds_imports,
                            % Imports these classes (or modules, packages, ...).
                inherits    :: list(mlds_class_id),
                            % Inherits these base classes.
                implements  :: list(mlds_interface_id),
                            % Implements these interfaces.
                ctors       :: mlds_defns,
                            % Has these constructors.
                members     :: mlds_defns
                            % Contains these members.
            ).
% Note: the definition of the `mlds_type' type is subject to change.
% In particular, we might add new alternatives here, so try to avoid
% switching on this type.
:- type mlds_type
---> mercury_type(
% Mercury data types
mer_type, % The exact Mercury type.
type_category, % What kind of type it is: enum, float, ...
exported_type % A representation of the type which can be
% used to determine the foreign language
% representation of the type.
)
; mlds_mercury_array_type(mlds_type)
% The Mercury array type is treated specially: some backends
% will treat it like any other mercury_type, whereas others may
% use a special representation for it.
% Arrays are type constructors in some backends, and so it is
% easier to represent it here as a special type constructor.
% (if we used the mercury_type representation above, we would
% only classify the topmost level of the type, whereas we
% really want to classify the element type for arrays, so
% we can generate int[] for array(int)).
% Note that mlds_mercury_array_type/1 is used for representing
% Mercury arrays, whereas mlds_array_type/1 (see below)
% is used for representing the target language's native arrays.
; mlds_cont_type(mlds_return_types)
% The type for the continuation functions used to handle
% nondeterminism
; mlds_commit_type
% mlds_commit_type is used for storing information about a
% commit. This is an abstract type; the exact definition
% will depend on the back-end. The only operations on this ADT
% are `try_commit' and `do_commit'. This type holds
% information about the `try_commit' stack frame that is
% needed to unwind the stack when a `do_commit' is executed.
%
% For the C back-end, if we're implementing do_commit/try_commit
% using setjmp/longjmp, then mlds_commit_type will be jmp_buf.
% If we're implementing them using GNU C nested functions, then
% it will be `__label__'; in this case, the local variable
% of this "type" is actually a label, and doing a goto to that
% label will unwind the stack.
%
% If the back-end implements commits using the target
% language's try/catch-style exception handling, as in
% Java/C++/etc., then the target language implementation's
% exception handling support will keep track of the information
% needed to unwind the stack, and so variables of this type
% don't need to be declared at all.
%
% See also the comments in ml_code_gen.m which show how commits
% can be implemented for different target languages.
; mlds_native_bool_type
; mlds_native_int_type
; mlds_native_float_type
; mlds_native_char_type
% MLDS native builtin types.
% These are the builtin types of the MLDS target language,
% whatever that may be.
% Currently we don't actually use many of these.
; mlds_foreign_type(
% This is a type of the target language.
foreign_language_type
)
; mlds_class_type(
% MLDS types defined using mlds_class_defn.
mlds_class, % name
arity,
mlds_class_kind
)
; mlds_array_type(mlds_type)
% MLDS array types.
% These are single-dimensional, and can be indexed
% using the `field' lval with an `offset' mlds_field_id;
% indices start at zero.
%
% Note that mlds_array_type/1 is used for representing
% the target language's native arrays, whereas
% mlds_mercury_array_type/1 (see above) is used for
% representing Mercury arrays.
%
% Currently MLDS array types are used for
% (a) static constants that would otherwise be allocated
% with a `new_object', if we're using the low-level
% data representation (--no-high-level-data)
% (b) for static constants of certain Mercury types which are
% always represented using the low-level data
% representation, regardless of --high-level-data,
% in particular closures and type_infos.
% (c) for any other arrays generated internally by the
% MLDS code generator, e.g. the arrays used for
% string switches.
; mlds_ptr_type(mlds_type)
% Pointer types.
% Currently these are used for handling output arguments.
; mlds_func_type(mlds_func_params)
% Function types.
% For the C back-end, these are mapped to function pointer types,
% since C doesn't have first-class function types.
; mlds_generic_type
% A generic type (e.g. `Word') that can hold any Mercury value.
% This is used for implementing polymorphism.
; mlds_generic_env_ptr_type
% A generic pointer type (e.g. `void *' in C) that can be used
% to point to the environment (set of local variables) of the
% containing function. This is used for handling nondeterminism,
% if the target language doesn't support nested functions, and
% also for handling closures for higher-order code.
; mlds_type_info_type
; mlds_pseudo_type_info_type
; mlds_rtti_type(rtti_id_maybe_element)
; mlds_unknown_type.
% A type used by the ML code generator for references to variables
% that have yet to be declared. This occurs once in ml_code_util.m
% where references to env_ptr's are generated but the declaration
% of these env_ptr's does not occur until the ml_elim_nested pass.
:- func mercury_type_to_mlds_type(module_info, mer_type) = mlds_type.
% Hmm... this is tentative.
:- type mlds_class_id == mlds_type.
:- type mlds_interface_id == mlds_type.
%-----------------------------------------------------------------------------%
%
% Declaration flags
%
:- type mlds_decl_flags.
:- type access
% The following are used for class members (this includes globals,
% which are actually members of the top-level package)
---> public % Accessible to anyone.
; protected % Only accessible to the class and to derived classes.
; private % Only accessible to the class.
; default % Java "default" access or .NET assembly access:
% accessible to anything defined in the same package.
% The following is used for entities defined in a block/2 statement,
% i.e. local variables and nested functions.
; local. % Only accessible within the block in which the entity
% (variable or nested function) is defined.
:- type per_instance
---> one_copy % I.e. "static" storage duration (but not necessarily
% static linkage) or static member function. Note that
% `one_copy' variables *must* have an initializer
% (the GCC back-end relies on this.)
; per_instance.
% I.e. "auto" local variable in function, or non-static
% member of class.
:- type virtuality
---> non_virtual
; virtual.
:- type finality
---> overridable
; final.
:- type constness
---> modifiable
; const.
:- type abstractness
---> concrete
; abstract.
:- func access(mlds_decl_flags) = access.
:- func per_instance(mlds_decl_flags) = per_instance.
:- func virtuality(mlds_decl_flags) = virtuality.
:- func finality(mlds_decl_flags) = finality.
:- func constness(mlds_decl_flags) = constness.
:- func abstractness(mlds_decl_flags) = abstractness.
:- func set_access(mlds_decl_flags, access) = mlds_decl_flags.
:- func set_per_instance(mlds_decl_flags, per_instance) = mlds_decl_flags.
:- func set_virtuality(mlds_decl_flags, virtuality) = mlds_decl_flags.
:- func set_finality(mlds_decl_flags, finality) = mlds_decl_flags.
:- func set_constness(mlds_decl_flags, constness) = mlds_decl_flags.
:- func set_abstractness(mlds_decl_flags, abstractness) = mlds_decl_flags.
:- func init_decl_flags(access, per_instance, virtuality, finality,
constness, abstractness) = mlds_decl_flags.
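% For illustration only, a sketch of how these functions fit together
% (the variable name Flags is hypothetical): the flags for a private,
% static, non-overridable constant could be built with
%
%   Flags = init_decl_flags(private, one_copy, non_virtual, final,
%       const, concrete)
%
% after which access(Flags) would be expected to return private, and
% set_access(Flags, public) to return the same flags with public access.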
%-----------------------------------------------------------------------------%
%
% Foreign code interfacing
% Foreign code required for the foreign language interface.
% When compiling to a language other than the foreign language,
% this part still needs to be generated as C (or whatever) code
% and compiled with a C (or whatever) compiler.
:- type mlds_foreign_code
---> mlds_foreign_code(
foreign_decl_info,
foreign_import_module_info,
list(user_foreign_code),
list(mlds_pragma_export)
).
% Information required to generate code for each `pragma export'.
:- type mlds_pragma_export
---> ml_pragma_export(
string, % Exported name
mlds_qualified_entity_name, % MLDS name for exported entity
mlds_func_params, % MLDS function parameters
mlds_context
).
%-----------------------------------------------------------------------------%
%
% Attributes
:- type mlds_attribute
---> custom(
mlds_type
).
%-----------------------------------------------------------------------------%
%
% Contexts (i.e. source code locations)
% mlds_context is probably == prog_context,
% but might also contain goal_path or other information.
:- type mlds_context.
:- func mlds_make_context(prog_context) = mlds_context.
:- func mlds_get_prog_context(mlds_context) = prog_context.
%-----------------------------------------------------------------------------%
%
% Statements
%
:- type statements == list(statement).
:- type statement
---> statement(
mlds_stmt,
mlds_context
).
:- type mlds_stmt
% Sequence.
---> block(mlds_defns, list(statement))
% Iteration.
; while(mlds_rval, statement, bool)
% The `bool' is true iff the loop is guaranteed to iterate at
% least once -- in that case, the compiler can generate a
% `do...while' loop rather than a `while...' loop.
% Selection (see also computed_goto).
; if_then_else(mlds_rval, statement, maybe(statement))
; switch(
% This representation for switches is very general:
% it allows switching on any type, and cases can match
% on ranges as well as on values.
% Many target languages only allow switches on ints or enums.
% Some (e.g. C#) also allow switches on strings.
% Most target languages only allow matching on values;
% only some (e.g. GNU C) allow matching on ranges.
% The MLDS code generator should only generate switches
% that the target will support.
%
% Note that unlike C, MLDS cases do NOT fall through; if you
% want to achieve that effect, you need to use an explicit
% goto.
% The value to switch on
mlds_type,
mlds_rval,
% The range of possible values which the value might take
% (if known).
mlds_switch_range,
% The different cases.
mlds_switch_cases,
% What to do if none of the cases match.
mlds_switch_default
)
% Transfer of control.
; label(mlds_label)
% Defines a label that can be used as the target of calls,
% gotos, etc.
; goto(mlds_goto_target)
% goto(Target): Branch to the specified address.
; computed_goto(mlds_rval, list(mlds_label))
% Evaluate rval, which should be an integer, and jump to the
% (rval+1)th label in the list. e.g. computed_goto(2, [A, B, C, D])
% will branch to label C.
% Function call/return.
; call(
mlds_func_signature, % Signature of the function.
mlds_rval, % The function to call.
maybe(mlds_rval), % For method calls, this field
% specifies the `this' object.
list(mlds_rval), % Ordinary function arguments.
list(mlds_lval), % Places to store the function return
% value(s).
call_kind % Indicates whether this call is a
% tail call.
)
; return(list(mlds_rval)) % Some targets will not support
% returning more than one value.
% Commits (a specialized form of exception handling).
; try_commit(mlds_lval, statement, statement)
; do_commit(mlds_rval)
% try_commit(Ref, GoalToTry, CommitHandlerGoal):
% Execute GoalToTry. If GoalToTry exits via a
% `do_commit(Ref)' instruction, then execute
% CommitHandlerGoal.
%
% do_commit(Ref):
% Unwind the stack to the corresponding `try_commit'
% statement for Ref, and branch to the CommitHandlerGoal
% that was specified in that try_commit instruction.
%
% For both try_commit and do_commit instructions, Ref should be
% the name of a local variable of type mlds_commit_type.
% (This variable can be used by the back-end's implementation
% of do_commit and try_commit to store information needed to unwind
% the stack.) There should be exactly one try_commit instruction
% for each Ref. do_commit(Ref) instructions should only be used
% in goals called from the GoalToTry goal in the try_commit
% instruction with the same Ref.
%
% The C and GCC back-ends require each try_commit to be put in
% its own nested function, to avoid problems with setjmp() and
% local vars not declared volatile. They also require each
% do_commit to be put in its own function -- this is needed when
% using __builtin_setjmp()/__builtin_longjmp() to ensure that
% the call to __builtin_longjmp() is not in the same function
% as the call to __builtin_setjmp().
% Exception handling.
%
% We use C++-style exceptions.
% For C, the back-end can simulate them using setjmp/longjmp.
%
% XXX This is tentative -- the current definition may be
% a bit too specific to C++-style exceptions.
% It might not be a good choice for different target languages.
% XXX Full exception handling support is not yet implemented.
% ; throw(mlds_type, mlds_rval)
% % Throw the specified exception.
% ; rethrow
% % Rethrow the current exception
% % (only valid inside an exception handler).
% ; try_catch(
% statement,
% list(mlds_exception_handler)
% )
% % Execute the specified statement, and if it throws an exception,
% % and the exception matches any of the exception handlers,
% % then execute the first matching exception handler.
% Atomic statements.
; atomic(mlds_atomic_statement).
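% For illustration only, a sketch of how these constructors nest
% (Cond, Lval, Rval and Context are hypothetical variables): an
% if-then-else with no else branch, whose then-part is a single
% assignment, could be represented as
%
%   Assign = statement(atomic(assign(Lval, Rval)), Context),
%   Stmt = statement(if_then_else(Cond, Assign, no), Context)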
%-----------------------------------------------------------------------------%
%
% Extra info for switches
%
% The range of possible values which the
% switch variable might take (if known)
:- type mlds_switch_range
---> range_unknown
; range(range_min::int, range_max::int).
% From range_min to range_max, inclusive.
% Each switch case consists of the conditions to match against,
% and the statement to execute if the match succeeds.
% Unlike C, cases do NOT fall through; if you want to achieve that
% effect, you need to use an explicit goto.
:- type mlds_switch_cases == list(mlds_switch_case).
:- type mlds_switch_case == pair(mlds_case_match_conds, statement).
% Case_match_conds should be a _non-empty_ list of conditions;
% if _any_ of the conditions match, this case will be selected.
:- type mlds_case_match_conds == list(mlds_case_match_cond).
% A case_match_cond specifies when a switch case will be selected
:- type mlds_case_match_cond
---> match_value(mlds_rval)
% match_value(Val) matches if the switch value is equal to
% the specified Val.
; match_range(mlds_rval, mlds_rval).
% match_range(Min, Max) matches if the switch value is between
% Min and Max, inclusive. Note that this should only be used
% if the target supports it; currently the C back-end supports
% this only if you're using the GNU C compiler.
% The switch_default specifies what to do if none of the switch
% conditions match.
:- type mlds_switch_default
---> default_is_unreachable
% The switch is exhaustive, so the default case should
% never be reached.
; default_do_nothing
% The default action is to just fall through to the statement
% after the switch.
; default_case(statement).
% The default is to execute the specified statement.
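% For illustration only (S1, S2 and S3 are hypothetical statements):
% a switch whose first case matches the value 1, whose second case
% matches 2 through 5 inclusive, and which otherwise executes S3,
% could use
%
%   Cases = [
%       [match_value(const(int_const(1)))] - S1,
%       [match_range(const(int_const(2)), const(int_const(5)))] - S2
%   ],
%   Default = default_case(S3)
%
% Note that the match_range case requires a target that supports ranges.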
%-----------------------------------------------------------------------------%
%
% Extra info for labels
:- type mlds_label == string.
:- type mlds_goto_target
---> label(mlds_label) % Branch to the specified label.
; break % Branch to just after the end of the
% immediately enclosing loop or switch,
% just like a C/C++/Java `break' statement.
% Not supported by all target languages.
; continue. % Branch to the end of the loop body for the
% immediately enclosing loop, just like a
% C/C++/Java/C# `continue' statement.
% Not supported by all target languages.
%-----------------------------------------------------------------------------%
%
% Extra info for calls
% The `call_kind' type indicates whether a call is a tail call
% and whether the call is known to never return.
%
% Marking a call as a tail_call is intended as a hint to
% the target language that this call can be implemented
% by removing the caller's stack frame and jumping to the
% destination, rather than a normal call.
% However, the target code generator need not obey this hint;
% it is permitted for the target code generator to ignore the
% hint and generate code which does not remove the caller's
% stack frame and/or which falls through to the following
% statement.
:- type call_kind
---> no_return_call % A call that never returns
% (this is a special case of a tail call)
; tail_call % A tail call.
; ordinary_call. % Just an ordinary call.
%-----------------------------------------------------------------------------%
%
% Extra info for exception handling
% XXX This is tentative -- the current definition may be
% a bit too specific to C++-style exceptions.
% It might not be a good choice for different target languages.
:- type mlds_exception_handler
---> handler(
maybe(mlds_type),
% If `yes(T)', specifies the type of exceptions to catch.
% If `no', it means catch all exceptions.
maybe(string)
% If `yes(Name)', gives the variable name to use for the
% exception value.
% If `no', then exception value will not be used.
).
%-----------------------------------------------------------------------------%
%
% atomic statements
%
:- type mlds_atomic_statement
---> comment(string)
% Insert a comment into the output code.
; assign(mlds_lval, mlds_rval)
% assign(Location, Value):
% Assign the value specified by rval to the location
% specified by lval.
% Heap management.
; delete_object(mlds_lval)
% Compile time garbage collect (i.e. explicitly
% deallocate) the memory used by the lval.
; new_object(
% new_object(Target, Tag, Type,
% Size, CtorName, Args, ArgTypes):
% Allocate a memory block of the given size,
% initialize it with a new object of the given
% type by calling the constructor with the specified
% arguments, and put its address in the given lval,
% possibly after tagging the address with a given tag.
% (Some targets might not support tags.)
% The target to assign the new object's address to.
mlds_lval,
% A (primary) tag to tag the address with before assigning
% the result to the target.
maybe(mlds_tag),
% Indicates whether or not there is a secondary tag. If so,
% it will be stored as the first argument in the argument list
% (see below).
bool,
% The type of the object being allocated.
mlds_type,
% The amount of memory that needs to be allocated for the new
% object, measured in words (NOT bytes!).
maybe(mlds_rval),
% The name of the constructor to invoke.
maybe(ctor_name),
% The arguments to the constructor.
list(mlds_rval),
% The types of the arguments to the constructor.
%
% Note that for --low-level-data, we box all fields of objects
% created with new_object, i.e. they are represented with type
% mlds_generic_type. We also do that for some fields even
% for --high-level-data (e.g. floating point fields for the
% MLDS->C and MLDS->asm back-ends). In such cases, the type
% here should be mlds_generic_type; it is the responsibility
% of the HLDS->MLDS code generator to insert code to box/unbox
% the arguments.
list(mlds_type)
)
; gc_check
% Check to see if we need to do a garbage collection,
% and if so, do it.
% This is used for accurate garbage collection
% with the MLDS->C back-end.
; mark_hp(mlds_lval)
% Tell the heap sub-system to store a marker (for later use in
% restore_hp/1 instructions) in the specified lval.
%
% It's OK for the target to treat this as a no-op, and probably
% that is what most targets will do.
; restore_hp(mlds_rval)
% The rval must be a marker as returned by mark_hp/1.
% The effect is to deallocate all the memory which
% was allocated since that call to mark_hp.
%
% It's OK for the target to treat this as a no-op,
% and probably that is what most targets will do.
% Trail management.
; trail_op(trail_op)
% Foreign language interfacing.
; inline_target_code(target_lang, list(target_code_component))
% Do whatever is specified by the target_code_components, which
% can be any piece of code in the specified target language (C,
% assembler, or whatever) that does not have any non-local flow
% of control. This is implemented by embedding the target code
% in the output stream of instructions or statements.
; outline_foreign_proc(
% Do whatever is specified by the string, which can be any
% piece of code in the specified foreign language (C#,
% managed C++, or whatever). This is implemented by calling
% an externally defined function, for which the backend must
% generate the definition (in some other file perhaps).
% The lvals are used to generate the appropriate
% forwarding code.
% XXX We should also store the list of mlds_rvals where
% the input values come from.
% The foreign language this code is written in.
foreign_language,
list(outline_arg),
% Where to store return value(s).
list(mlds_lval),
% The user's foreign language code fragment.
string
).
% Stores information about each argument to an outline_foreign_proc.
:- type outline_arg
---> in(
mlds_type, % The type of the argument.
string, % The name of the argument in the foreign code.
mlds_rval % The rval which holds the value of this
% argument.
)
; out(
mlds_type, % The type of the argument.
string, % The name of the argument in the foreign code.
mlds_lval % The lval where we are to place the result
% calculated by the foreign code into.
)
; unused.
% This is just a random selection of possible languages
% that we might want to target...
:- type target_lang
---> lang_C
; lang_GNU_C
; lang_C_minus_minus
; lang_asm
; lang_il
; lang_java
; lang_java_asm
; lang_java_bytecode.
:- type target_code_component
---> user_target_code(
% User_target_code holds C code from the user's
% `pragma foreign_proc' declaration.
string,
maybe(prog_context),
target_code_attributes
)
; raw_target_code(
% Raw_target_code holds C code that the compiler has generated.
% To ensure that following `#line' directives work OK, either
% the string in a raw_target_code must end in `\n' (or `\n'
% followed by whitespace), or the following
% target_code_component must be a `name(Name)' component,
% for which we do not output #line directives.
string,
target_code_attributes
)
; target_code_input(mlds_rval)
; target_code_output(mlds_lval)
; name(mlds_qualified_entity_name).
:- type target_code_attributes == list(target_code_attribute).
:- type target_code_attribute
---> max_stack_size(int).
% max_stack_size(Size):
% This attribute declares the maximum stack usage of a
% particular piece of code. The unit that `Size' is measured
% in depends upon the foreign language being used. Currently this
% attribute is only used (and is in fact required) by the
% `IL' foreign language interface, and is measured in units
% of stack items.
% Constructor id.
:- type ctor_name == mlds_qualified_ctor_id.
:- type mlds_ctor_id ---> ctor_id(mlds_class_name, arity).
:- type mlds_qualified_ctor_id == mlds_fully_qualified_name(mlds_ctor_id).
% Trail management.
% For documentation, see the corresponding LLDS instructions in llds.m.
%
:- type trail_op
---> store_ticket(mlds_lval)
; reset_ticket(mlds_rval, mlds_reset_trail_reason)
; discard_ticket
; prune_ticket
; mark_ticket_stack(mlds_lval)
; prune_tickets_to(mlds_rval).
% ; discard_tickets_to(mlds_rval). % used only by the library
%-----------------------------------------------------------------------------%
% An mlds_field_id represents some data within an object.
:- type mlds_field_id
---> offset(mlds_rval)
% offset(N) represents the field at offset N Words.
; named_field(mlds_fully_qualified_name(mlds_field_name), mlds_type).
% named_field(Name, CtorType) represents the field with the
% specified name. The CtorType gives the MLDS type for this
% particular constructor. The type of the object is given by
% the PtrType in the field(..) lval; CtorType may either be
% the same as PtrType, or it may be a pointer to a derived class.
% In the latter case, the MLDS->target code back-end is responsible
% for inserting a downcast from PtrType to CtorType before
% accessing the field.
:- type mlds_field_name == string.
% An mlds_var represents a variable or constant.
:- type mlds_var == mlds_fully_qualified_name(mlds_var_name).
:- type mlds_var_name
---> mlds_var_name(string, maybe(int)).
% Var name and perhaps a unique number to be added as a
% suffix where necessary.
% An lval represents a data location or variable that can be used
% as the target of an assignment.
:- type mlds_lval
% Values on the heap or fields of a structure.
---> field(
% field(Tag, Address, FieldId, FieldType, PtrType):
% Selects a field of a compound term.
% Address is a tagged pointer to a cell on the heap.
% The position in the cell, FieldId, is represented either
% as a field name or a number of words offset. If Tag is yes,
% the arg gives the value of the tag; if it is no, the tag bits
% will have to be masked off. The value of the tag should be
% given if it is known, since this will lead to faster code.
% The FieldType is the type of the field. The PtrType is the
% type of the pointer from which we are fetching the field.
%
% Note that for --low-level-data, we box all fields of objects
% created with new_object, i.e. they are represented with type
% mlds_generic_type. We also do that for some fields even
% for --high-level-data (e.g. floating point fields for the
% MLDS->C and MLDS->asm back-ends). In such cases, the type
% here should be mlds_generic_type, not the actual type of
% the field. If the actual type is different, then it is the
% HLDS->MLDS code generator's responsibility to insert the
% necessary code to handle boxing/unboxing.
field_tag :: maybe(mlds_tag),
field_addr :: mlds_rval,
field_field_id :: mlds_field_id,
field_type :: mlds_type,
field_ptr_type :: mlds_type
)
% Values somewhere in memory.
% This is the dereference operator (e.g. unary `*' in C).
; mem_ref(
% The rval should have originally come from a mem_addr rval.
% The type is the type of the value being dereferenced.
mlds_rval,
mlds_type
)
% Variables.
% These may be local or they may come from some enclosing scope;
% the variable name should be fully qualified.
; var(
mlds_var,
mlds_type
).
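% For illustration only (Var, VarType and PtrType are hypothetical
% variables): the word at offset 1 of the heap cell pointed to by Var,
% where the pointer is known to carry tag 0, could be written as
%
%   field(yes(0), lval(var(Var, VarType)), offset(const(int_const(1))),
%       mlds_generic_type, PtrType)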
%-----------------------------------------------------------------------------%
%
% Expressions
% An rval is an expression that represents a value.
:- type mlds_rval
---> lval(mlds_lval)
% The value of an `lval' rval is just the value stored in
% the specified lval.
; mkword(mlds_tag, mlds_rval)
% Given a pointer and a tag, mkword returns a tagged pointer.
%
% (XXX It might be more consistent to make this a binary_op,
% with the tag argument just being an rval, rather than
% having `mkword' be a separate kind of rval.)
; const(mlds_rval_const)
; unop(mlds_unary_op, mlds_rval)
; binop(binary_op, mlds_rval, mlds_rval)
; mem_addr(mlds_lval)
% The address of a variable, etc.
; self(mlds_type).
% The equivalent of the `this' pointer in C++ with the type of the
% object. Note that this rval is valid iff we are targeting an
% object oriented backend and we are in an instance method
% (procedures which have the per_instance flag set).
:- type mlds_unary_op
---> box(mlds_type)
% box(MLDSType): convert from MLDSType to mlds_generic_type,
% by boxing if necessary, or just casting if not.
; unbox(mlds_type)
% unbox(MLDSType): convert from mlds_generic_type to MLDSType,
% applying the inverse transformation to box/1, i.e. unboxing
% if boxing was necessary, and just casting otherwise.
; cast(mlds_type)
% cast(MLDSType): Coerce the type of the rval to be MLDSType.
% XXX It might be worthwhile adding the type that we cast from.
; std_unop(builtin_ops.unary_op).
:- type mlds_rval_const
---> true
; false
; int_const(int)
; float_const(float)
; string_const(string)
; multi_string_const(int, string)
% A multi_string_const is a string containing embedded NULs,
% whose real length is given by the integer, and not the location
% of the first null character.
; code_addr_const(mlds_code_addr)
; data_addr_const(mlds_data_addr)
; null(mlds_type).
% A null value, of the given type. Usually the type will be a
% pointer (mlds_ptr_type) but it could also be string or a
% func_type. (Null is not a valid value of type string or
% func_type, but null values of those types may be useful as
% placeholders in cases where the value will never be used.)
:- type mlds_code_addr
---> proc(
mlds_qualified_proc_label,
mlds_func_signature
)
; internal(
mlds_qualified_proc_label,
mlds_func_sequence_num,
mlds_func_signature
).
:- type mlds_data_addr
---> data_addr(
mlds_module_name, % The name of the module.
mlds_data_name % The id of the variable inside the module.
).
:- type mlds_data == mlds_fully_qualified_name(mlds_data_name).
:- type mlds_data_name
---> var(mlds_var_name)
% Ordinary variables.
; common(int)
% Compiler-introduced constants representing global constants.
% These are called "common" because they may be common
% subexpressions.
% Stuff for handling polymorphism/RTTI and type classes.
; rtti(rtti_id)
% Stuff for handling debugging and accurate garbage collection.
% (Those features are not yet implemented for the MLDS back-end,
% so these data_names are not yet used.)
; module_layout
% Layout information for the current module.
; proc_layout(mlds_proc_label)
% Layout structure for the given procedure.
; internal_layout(mlds_proc_label, mlds_func_sequence_num)
% Layout structure for the given internal MLDS func.
% Stuff for tabling
; tabling_pointer(mlds_proc_label).
% A variable that contains a pointer that points to the table
% used to implement memoization, loopcheck or minimal model
% semantics for the given procedure.
%-----------------------------------------------------------------------------%
% Note: the types `tag' and `reset_trail_reason' here are both
% defined exactly the same as the ones in llds.m. The definitions are
% duplicated here because we don't want mlds.m to depend on llds.m.
% (Alternatively, we could move both these definitions into a new module
% imported by both mlds.m and llds.m, but these definitions are small enough
% and simple enough that I don't think it is worth creating a new module
% just for them.)
% A tag should be a small non-negative integer.
:- type mlds_tag == int.
% see runtime/mercury_trail.h
:- type mlds_reset_trail_reason
---> undo
; commit
; solve
; exception
; gc.
%-----------------------------------------------------------------------------%
:- type mlds_qualified_proc_label
== mlds_fully_qualified_name(mlds_proc_label).
:- type mlds_proc_label
== pair(mlds_pred_label, proc_id).
:- type mlds_qualified_pred_label
== mlds_fully_qualified_name(mlds_pred_label).
% An mlds_pred_label is a structure containing information that
% uniquely identifies a HLDS predicate within a given module.
%
% Note that a predicate can have both a declaring and a defining module.
% The defining module is the module that provides the code for the
% predicate, the declaring module contains the `:- pred' declaration.
% When these are different, as for specialised versions of predicates
% from `.opt' files, the defining module's name is added as a
% qualifier to the pred name.
:- type mlds_pred_label
---> pred(
pred_or_func, % predicate/function
maybe(mercury_module_name),
% The declaring module,
% if different to the defining module
string, % Name.
arity, % Arity.
code_model, % Code model.
bool % Function without return value
% (i.e. non-default mode).
)
; special_pred(
string, % pred name
maybe(mercury_module_name),
% The module declaring the type,
% if this is different to module defining
% the special_pred.
string, % The type name.
arity % The type arity.
).
%-----------------------------------------------------------------------------%
% Invert the case of the first letter of the string.
% This is used for the Java back-end.
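% For example, going by the description above, flip_initial_case("foo")
% should give "Foo", and flip_initial_case("Foo") should give "foo".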
%
:- func flip_initial_case(string) = string.
% Invert the case of the first letter of the last component of
% a (possibly) qualified name. This is used for the Java back-end.
%
:- func flip_initial_case_of_final_part(sym_name) = sym_name.
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
:- implementation.
:- import_module backend_libs.foreign.
:- import_module check_hlds.type_util.
:- import_module hlds.hlds_data.
:- import_module libs.compiler_util.
:- import_module libs.globals.
:- import_module parse_tree.modules.
:- import_module parse_tree.prog_type.
:- import_module char.
:- import_module int.
:- import_module library.
:- import_module string.
:- import_module term.
%-----------------------------------------------------------------------------%
mlds_get_module_name(mlds(ModuleName, _, _, _, _, _)) = ModuleName.
%-----------------------------------------------------------------------------%
% Currently mlds_contexts just contain a prog_context.
:- type mlds_context
---> mlds_context(prog_context).
mlds_make_context(Context) = mlds_context(Context).
mlds_get_prog_context(mlds_context(Context)) = Context.
%-----------------------------------------------------------------------------%
% There is some special-case handling for arrays, foreign types and some
% other types here, but apart from that, currently we return mlds_types
% that are just the same as Mercury types, except that we also store the type
% category, so that we can tell if the type is an enumeration or not, without
% needing to refer to the HLDS type_table.
% XXX It might be a better idea to get rid of the mercury_type/2 MLDS type
% and instead fully convert all Mercury types to MLDS types.
mercury_type_to_mlds_type(ModuleInfo, Type) = MLDSType :-
    (
        type_to_ctor_and_args(Type, TypeCtor, [ElemType]),
        TypeCtor = type_ctor(qualified(unqualified("array"), "array"), 1)
    ->
        MLDSElemType = mercury_type_to_mlds_type(ModuleInfo, ElemType),
        MLDSType = mlds_mercury_array_type(MLDSElemType)
    ;
        type_to_ctor_and_args(Type, TypeCtor, [RefType]),
        TypeCtor = type_ctor(qualified(mercury_private_builtin_module,
            "store_by_ref_type"), 1)
    ->
        MLDSRefType = mercury_type_to_mlds_type(ModuleInfo, RefType),
        MLDSType = mlds_ptr_type(MLDSRefType)
    ;
        type_to_ctor_and_args(Type, TypeCtor, _),
        module_info_get_type_table(ModuleInfo, Types),
        map.search(Types, TypeCtor, TypeDefn),
        hlds_data.get_type_defn_body(TypeDefn, Body),
        Body = foreign_type(foreign_type_body(MaybeIL, MaybeC, MaybeJava))
    ->
        module_info_get_globals(ModuleInfo, Globals),
        globals.get_target(Globals, Target),
        (
            Target = c,
            (
                MaybeC = yes(Data),
                Data = foreign_type_lang_data(CForeignType, _, _),
                ForeignType = c(CForeignType)
            ;
                MaybeC = no,
                % This is checked by check_foreign_type in make_hlds.
                unexpected(this_file,
                    "mercury_type_to_mlds_type: No C foreign type")
            )
        ;
            Target = il,
            (
                MaybeIL = yes(Data),
                Data = foreign_type_lang_data(ILForeignType, _, _),
                ForeignType = il(ILForeignType)
            ;
                MaybeIL = no,
                % This is checked by check_foreign_type in make_hlds.
                unexpected(this_file,
                    "mercury_type_to_mlds_type: No IL foreign type")
            )
        ;
            Target = java,
            (
                MaybeJava = yes(Data),
                Data = foreign_type_lang_data(JavaForeignType, _, _),
                ForeignType = java(JavaForeignType)
            ;
                MaybeJava = no,
                % This is checked by check_foreign_type in make_hlds.
                unexpected(this_file,
                    "mercury_type_to_mlds_type: no Java foreign type")
            )
        ;
            Target = asm,
            (
                MaybeC = yes(Data),
                Data = foreign_type_lang_data(CForeignType, _, _),
                ForeignType = c(CForeignType)
            ;
                MaybeC = no,
                % XXX This ought to be checked by the front-end, e.g.
                % check_foreign_type in make_hlds.
                sorry(this_file,
                    "mercury_type_to_mlds_type: No C foreign type")
            )
        ),
        MLDSType = mlds_foreign_type(ForeignType)
    ;
        classify_type(ModuleInfo, Type) = Category,
        ExportedType = to_exported_type(ModuleInfo, Type),
        MLDSType = mercury_type(Type, Category, ExportedType)
    ).
%-----------------------------------------------------------------------------%
mlds_get_func_signature(mlds_func_params(Parameters, RetTypes)) =
        mlds_func_signature(ParamTypes, RetTypes) :-
    ParamTypes = mlds_get_arg_types(Parameters).

mlds_get_arg_types(Parameters) = ArgTypes :-
    GetArgType = (func(mlds_argument(_, Type, _)) = Type),
    ArgTypes = list.map(GetArgType, Parameters).
%-----------------------------------------------------------------------------%
% An MLDS module name consists of two parts. One part is the package
% which the module name is defined in, and the other part is the actual
% module name. For example, the module name System.XML could be defined
% in the package XML.
%
% Note that modules in the Mercury standard library get a `mercury'
% prefix e.g. `mercury.builtin', `mercury.io', `mercury.univ', etc.,
% when mapped to MLDS package names.
% :- type mlds_module_name == prim_data.module_name.
:- type mlds_module_name
    --->    name(
                package_name    :: prim_data.module_name,
                module_name     :: prim_data.module_name
            ).

mercury_module_and_package_name_to_mlds(MLDS_Package, MercuryModule)
    = name(MLDS_Package, MercuryModule).

mercury_module_name_to_mlds(MercuryModule)
        = name(MLDS_Package, MLDS_Package) :-
    (
        MercuryModule = unqualified(ModuleName),
        mercury_std_library_module(ModuleName)
    ->
        MLDS_Package = qualified(unqualified("mercury"), ModuleName)
    ;
        MLDS_Package = MercuryModule
    ).
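
% A worked example of mercury_module_name_to_mlds, as a sketch. It assumes
% that mercury_std_library_module/1 accepts "io"; "my_app" is a hypothetical
% user module used only for illustration.
%
%   mercury_module_name_to_mlds(unqualified("io")) =
%       name(qualified(unqualified("mercury"), "io"),
%           qualified(unqualified("mercury"), "io"))
%
%   mercury_module_name_to_mlds(unqualified("my_app")) =
%       name(unqualified("my_app"), unqualified("my_app"))
%
% In both cases the package name and the module name are the same term;
% only standard library modules gain the `mercury' qualifier.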
is_std_lib_module(Module, UnqualifiedName) :-
    Name = Module ^ module_name,
    ( Name = unqualified(UnqualifiedName)
    ; Name = qualified(unqualified("mercury"), UnqualifiedName)
    ),
    mercury_std_library_module(UnqualifiedName).

mlds_module_name_to_sym_name(Module) = Module ^ module_name.

mlds_module_name_to_package_name(Module) = Module ^ package_name.
mlds_append_class_qualifier(name(Package, Module), QualKind, Globals,
        ClassName, ClassArity) =
        name(Package, qualified(AdjustedModule, ClassQualifier)) :-
    % For the Java back-end, we flip the initial case of any type qualifiers,
    % in order to match the usual Java conventions.
    (
        globals.get_target(Globals, CompilationTarget),
        CompilationTarget = java,
        QualKind = type_qual
    ->
        AdjustedModule = flip_initial_case_of_final_part(Module)
    ;
        AdjustedModule = Module
    ),
    string.format("%s_%d", [s(ClassName), i(ClassArity)], ClassQualifier).
mlds_append_wrapper_class(Name) = mlds_append_name(Name, wrapper_class_name).

mlds_append_name(name(Package, Module), Name)
    = name(Package, qualified(Module, Name)).

wrapper_class_name = "mercury_code".

flip_initial_case_of_final_part(unqualified(Name)) =
    unqualified(flip_initial_case(Name)).
flip_initial_case_of_final_part(qualified(Qual, Name)) =
    qualified(Qual, flip_initial_case(Name)).
% Invert the case of the first letter of the string.
flip_initial_case(S0) = S :-
    ( string.first_char(S0, First0, Rest) ->
        ( char.is_upper(First0) ->
            First = char.to_lower(First0)
        ; char.is_lower(First0) ->
            First = char.to_upper(First0)
        ;
            First = First0
        ),
        string.first_char(S, First, Rest)
    ;
        S = S0
    ).
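
% A few sample values traced by hand through flip_initial_case, as a
% sketch. The input strings are made up for illustration.
%
%   flip_initial_case("foo") = "Foo"
%   flip_initial_case("Foo") = "foo"
%   flip_initial_case("_foo") = "_foo"  (first character is not a letter)
%   flip_initial_case("") = ""          (string.first_char fails)
%
% flip_initial_case_of_final_part applies the same flip to the last
% component of a sym_name only, so
%
%   flip_initial_case_of_final_part(qualified(unqualified("mercury"), "io"))
%       = qualified(unqualified("mercury"), "Io")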
%-----------------------------------------------------------------------------%
% We represent the set of declaration flags as a bunch of bit-fields packed
% into a single int.
:- type mlds_decl_flags == int.
% Here we define which bits are used to store each bitfield.
%
% It would be nicer to use a language builtin, e.g. index/2, for these.
% But currently builtin.index/2 does not work in the reverse mode,
% and you can't use construct.construct/4 since that numbers the
% alternatives in a different order than builtin.index/2.
%
% It would also be nice to use a typeclass:
%
%   :- typeclass bitfield(T) where [
%       func bits(T) = int,
%       func mask(T::unused) = (int::out) is det
%   ].
%
% But currently that is too cumbersome, since you can't define class
% methods inline.
%
% On the other hand, doing it manually may be more efficient than either
% of those two approaches.
:- func access_bits(access) = int.
:- mode access_bits(in) = out is det.
:- mode access_bits(out) = in is semidet.
access_bits(public) = 0x00.
access_bits(private) = 0x01.
access_bits(protected) = 0x02.
access_bits(default) = 0x03.
access_bits(local) = 0x04.
% 0x5 - 0x7 reserved
:- func access_mask = int.
access_mask = 0x07.
:- func per_instance_bits(per_instance) = int.
:- mode per_instance_bits(in) = out is det.
:- mode per_instance_bits(out) = in is semidet.
per_instance_bits(one_copy) = 0x00.
per_instance_bits(per_instance) = 0x08.
:- func per_instance_mask = int.
per_instance_mask = per_instance_bits(per_instance).
:- func virtuality_bits(virtuality) = int.
:- mode virtuality_bits(in) = out is det.
:- mode virtuality_bits(out) = in is semidet.
virtuality_bits(non_virtual) = 0x00.
virtuality_bits(virtual) = 0x10.
:- func virtuality_mask = int.
virtuality_mask = virtuality_bits(virtual).
:- func finality_bits(finality) = int.
:- mode finality_bits(in) = out is det.
:- mode finality_bits(out) = in is semidet.
finality_bits(overridable) = 0x00.
finality_bits(final) = 0x20.
:- func finality_mask = int.
finality_mask = finality_bits(final).
:- func constness_bits(constness) = int.
:- mode constness_bits(in) = out is det.
:- mode constness_bits(out) = in is semidet.
constness_bits(modifiable) = 0x00.
constness_bits(const) = 0x40.
:- func constness_mask = int.
constness_mask = constness_bits(const).
:- func abstractness_bits(abstractness) = int.
:- mode abstractness_bits(in) = out is det.
:- mode abstractness_bits(out) = in is semidet.
abstractness_bits(abstract) = 0x00.
abstractness_bits(concrete) = 0x80.
:- func abstractness_mask = int.
abstractness_mask = abstractness_bits(concrete).
%
% Here we define the functions to look up a member of the set.
%
access(Flags) = promise_det(pred(Access::out) is semidet :-
    Flags /\ access_mask = access_bits(Access)).

per_instance(Flags) = promise_det(pred(PerInstance::out) is semidet :-
    Flags /\ per_instance_mask = per_instance_bits(PerInstance)).

virtuality(Flags) = promise_det(pred(Virtuality::out) is semidet :-
    Flags /\ virtuality_mask = virtuality_bits(Virtuality)).

finality(Flags) = promise_det(pred(Finality::out) is semidet :-
    Flags /\ finality_mask = finality_bits(Finality)).

constness(Flags) = promise_det(pred(Constness::out) is semidet :-
    Flags /\ constness_mask = constness_bits(Constness)).

abstractness(Flags) = promise_det(pred(Abstractness::out) is semidet :-
    Flags /\ abstractness_mask = abstractness_bits(Abstractness)).

:- func promise_det(pred(T)) = T.
:- mode promise_det(pred(out) is semidet) = out is det.

promise_det(Pred) = X :-
    ( if Pred(X0)
    then X = X0
    else unexpected(this_file, "promise_det failed")
    ).
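
% A worked example of the lookup functions, as a sketch. The flag word
% 0xF9 is a made-up value chosen only for illustration.
%
%   access(0xF9):       0xF9 /\ 0x07 = 0x01, which is access_bits(private)
%   per_instance(0xF9): 0xF9 /\ 0x08 = 0x08, which is
%                       per_instance_bits(per_instance)
%   constness(0xF9):    0xF9 /\ 0x40 = 0x40, which is constness_bits(const)
%
% If the access field held one of the reserved values 0x5 - 0x7, the
% reverse mode of access_bits would fail, and promise_det would call
% unexpected/2 instead of returning a value.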
%
% Here we define the functions to set a member of the set.
%
set_access(Flags, Access) =
    Flags /\ \access_mask \/ access_bits(Access).

set_per_instance(Flags, PerInstance) =
    Flags /\ \per_instance_mask \/ per_instance_bits(PerInstance).

set_virtuality(Flags, Virtuality) =
    Flags /\ \virtuality_mask \/ virtuality_bits(Virtuality).

set_finality(Flags, Finality) =
    Flags /\ \finality_mask \/ finality_bits(Finality).

set_constness(Flags, Constness) =
    Flags /\ \constness_mask \/ constness_bits(Constness).

set_abstractness(Flags, Abstractness) =
    Flags /\ \abstractness_mask \/ abstractness_bits(Abstractness).

init_decl_flags(Access, PerInstance, Virtuality, Finality, Constness,
        Abstractness) =
    access_bits(Access) \/
    per_instance_bits(PerInstance) \/
    virtuality_bits(Virtuality) \/
    finality_bits(Finality) \/
    constness_bits(Constness) \/
    abstractness_bits(Abstractness).
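
% A worked example of packing and updating the flags, as a sketch. The
% argument values are chosen only for illustration.
%
%   init_decl_flags(private, per_instance, virtual, final, const, concrete)
%       = 0x01 \/ 0x08 \/ 0x10 \/ 0x20 \/ 0x40 \/ 0x80
%       = 0xF9
%
%   set_access(0xF9, public)
%       = (0xF9 /\ \0x07) \/ 0x00
%       = 0xF8
%
% Each setter first clears its own field with the complemented mask and
% then ors in the new bits, so the other fields are left unchanged.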
%-----------------------------------------------------------------------------%
:- func this_file = string.
this_file = "mlds.m".
%-----------------------------------------------------------------------------%