\input texinfo
@setfilename mercury_reference_manual.info
@settitle The Mercury Language Reference Manual
@c vim: ts=4 sw=4 expandtab ft=texinfo

@dircategory The Mercury Programming Language
@direntry
* Mercury Language: (mercury_reference_manual).  The Mercury Language Reference Manual.
@end direntry

@c @smallbook
@c @cropmarks
@finalout
@setchapternewpage on
@ifnottex
This file documents the Mercury programming language, version <VERSION>.

Copyright @copyright{} 1995--2012 The University of Melbourne.@*
Copyright @copyright{} 2013--2026 The Mercury team.

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

@ignore
Permission is granted to process this file through Tex and print the
results, provided the printed document carries copying permission
notice identical to this one except for the removal of this paragraph
(this paragraph not being relevant to the printed manual).

@end ignore
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions.
@end ifnottex

@titlepage
@title The Mercury Language Reference Manual
@subtitle Version <VERSION>
@author Fergus Henderson
@author Thomas Conway
@author Zoltan Somogyi
@author David Jeffery
@author Peter Schachte
@author Simon Taylor
@author Chris Speirs
@author Tyson Dowd
@author Ralph Becket
@author Mark Brown
@author Peter Wang
@page
@vskip 0pt plus 1filll
Copyright @copyright{} 1995--2012 The University of Melbourne.@*
Copyright @copyright{} 2013--2026 The Mercury team.

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions.
@end titlepage
@contents
@page
@c ---------------------------------------------------------------------------

@ifnottex
@node Top,,, (mercury)
@top The Mercury Language Reference Manual, version <VERSION>
@end ifnottex
@c XXX Move to after Determinism
@c * Assertions::        Assertion declarations allow you to declare laws
@c                       that hold.
@menu
* Introduction::      A brief introduction to Mercury.
* Syntax::            A description of Mercury's syntax.
* Clauses::           A description of Mercury's clauses.
* Types::             Mercury has a strong parametric polymorphic type system.
* Modes::             Modes allow you to specify the direction of data flow.
* Unique modes::      Unique modes allow you to specify when there is only one
                      reference to a particular value, so the compiler can
                      safely use destructive update to modify that value.
* Determinism::       Determinism declarations let you specify that a predicate
                      should never fail or should never succeed more than once.
* User-defined equality and comparison::
                      User-defined types can have user-defined equality and
                      comparison predicates.
* Higher-order::      Mercury supports higher-order predicates and functions,
                      with closures, lambda expressions, and currying.
* Modules::           Modules allow you to divide a program into smaller parts.
* Type classes::      Constrained polymorphism.
* Existential types:: Support for data abstraction and heterogeneous
                      collections.
* Type conversions::  Converting between subtypes and supertypes.
* Exception handling:: Catching exceptions to recover from exceptional
                      situations.
* Formal semantics::  Declarative and operational semantics of Mercury
                      programs.
* Foreign language interface:: Calling code written in other programming
                      languages from Mercury code.
* Impurity::          Users can write impure Mercury code.
* Solver types::      Support for constraint logic programming.
* Trace goals::       Trace goals allow programmers to add debugging and
                      logging code to their programs.
* Pragmas::           Various compiler directives, used for example to
                      control optimization.
* Implementation-dependent extensions::
                      The Melbourne Mercury implementation supports
                      several extensions to the Mercury language.
* Bibliography::      References for further reading.
@end menu

@node Introduction
@chapter Introduction

Mercury is a general-purpose programming language,
originally designed and implemented by a small group of researchers
at the University of Melbourne, Australia.
Mercury is based on the paradigm of purely declarative programming,
and was designed to be useful
for the development of large and robust ``real-world'' applications.
It improves on existing logic programming languages
by providing increased productivity, reliability and efficiency,
and by avoiding the need for non-logical program constructs.
Mercury provides the traditional logic programming syntax,
but also allows the syntactic convenience of user-defined functions,
smoothly integrating logic and functional programming into a single paradigm.

Mercury requires programmers to supply
type, mode and determinism declarations for the predicates
and functions they write.
The compiler checks these declarations,
and rejects the program if it cannot prove
that every predicate or function satisfies its declarations.
This improves reliability,
since many kinds of errors simply cannot happen
in successfully compiled Mercury programs.
It also improves productivity,
since the compiler pinpoints many errors
that would otherwise require manual debugging to locate.
The fact that declarations are checked by the compiler
makes them much more useful than comments
to anyone who has to maintain the program.
The compiler also exploits the guaranteed correctness of the declarations
to significantly improve the efficiency of the code it generates.

To facilitate programming-in-the-large, to allow separate compilation,
and to support encapsulation, Mercury has a simple module system.
Mercury's standard library has a variety of pre-defined modules
for common programming tasks --- see the Mercury Library Reference Manual.

@node Syntax
@chapter Syntax

@menu
* Syntax overview::
* Character set::
* Whitespace::
* Comments::
* Line number directives::
* Variables::
* Names::
* Literals::
* Punctuation symbols::
* Operators::
* Terms::
* Items::
@end menu

@node Syntax overview
@section Syntax overview

A Mercury program consists of a set of source files,
each of which contains a module.
A module consists of a sequence of tokens,
each of which is a variable, name, literal, or punctuation symbol.
Tokens may be separated by any amount of
whitespace, comments, and line number directives.
These separators are mostly ignored by the parser,
but in some cases whitespace may be required to separate tokens
that would otherwise be ambiguous.
In other cases whitespace is not allowed,
e.g., before the @var{open-ct} token,
or after a @samp{.} operator
that would otherwise be interpreted as an @var{end} token.
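
As a rough illustration of these rules
(the names @samp{f} and @samp{list.length} are arbitrary examples,
and the comments describe how each fragment is expected to be tokenized):

@example
X = f(A)        % `(' directly follows `f': an open-ct token
X = f (A)       % whitespace before `(': an open token instead
X=-1            % `=-' is read as one graphic name; write X = -1
list.length     % no whitespace after `.': the `.' operator
list. length    % `.' followed by whitespace: an end token
@end example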

@node Character set
@section Character set

Mercury program source files must be written
using the UTF-8 encoding of the Unicode character set.
In the rest of this chapter,
``letters'', ``digits'', ``underscore'' and other kinds of punctuation
refer to characters in the Basic Latin code block.

@node Whitespace
@section Whitespace

Whitespace is defined to be the following characters:

@multitable {CHARACTER_TABULATION} {Unicode_code_point} {Horizontal-tab}
@headitem  Unicode name @tab Unicode code point @tab Notes
    @item @sc{space}                @tab U+0020             @tab
    @item @sc{character tabulation} @tab U+0009             @tab Horizontal-tab
    @item @sc{line feed}            @tab U+000A             @tab
    @item @sc{line tabulation}      @tab U+000B             @tab Vertical-tab
    @item @sc{form feed}            @tab U+000C             @tab
    @item @sc{carriage return}      @tab U+000D             @tab
@end multitable

@node Comments
@section Comments

The @samp{%} character starts a comment that continues to the end of the line.
The @samp{/*} character sequence starts a comment that continues until
the next occurrence of @samp{*/}.
For example:

@example
% Calculate the answer.
Result = 42     % This is the answer!

/*
omit this declaration for now
:- mode append(out, in, in) is semidet.
*/
@end example

@node Line number directives
@section Line number directives

A line number directive consists of the character @samp{#},
a positive integer specifying the line number, and then a newline.
Line number directives set the current line number;
they are used in conjunction with the @samp{pragma source_file} declaration
(@pxref{Source file name})
to indicate that errors in the Mercury code following the directive
should be reported relative to the line number set by the directive.
This is useful if the code in question was generated by another tool,
in which case the line number can be set to the corresponding location
in the original source file from which the Mercury code was derived.
The Mercury compiler can thereby issue more informative error messages
using locations in the original source file.
A @samp{#@var{line}} directive specifies
the line number for the immediately following line.
Line numbers for lines after that are incremented as usual,
so the second line after a @samp{#100} directive
would be considered to be line number 101.
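
For instance, a tool that generates Mercury code from a hypothetical file
@file{grammar.in} might emit something like the following sketch.
The file name and the predicate are illustrative assumptions only.

@example
:- pragma source_file("grammar.in").
#100
:- pred answer(int::out) is det.    % reported as line 100 of grammar.in
answer(42).                         % reported as line 101
@end example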

@node Variables
@section Variables

A variable is an uppercase letter or underscore
followed by zero or more letters, underscores, and digits.
For example,
@samp{Sum}, @samp{_NotNeeded}, @samp{_a}, and @samp{_123} are variables,
whereas @samp{x} and @samp{_#} are not.
A variable token consisting of a single underscore is treated specially:
each instance of @samp{_} denotes a distinct variable.

Variables starting with an uppercase letter
are expected to occur more than once;
the compiler will issue a warning if this is not the case,
as it often indicates a simple error.
Variables starting with an underscore
are presumed to be ``don't-care'' variables;
the compiler will issue a warning
if a variable starting with an underscore,
excluding single underscore variables,
occurs more than once in the same scope.
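
The following sketch shows three alternative ways of writing the same clause
for a hypothetical predicate @samp{head},
followed by a hypothetical predicate @samp{any_two}.
The comments paraphrase the compiler's behaviour;
the exact text of any warning may differ.

@example
:- pred head(list(T)::in, T::out) is semidet.

head([X | Tail], X).     % warning: `Tail' occurs only once
head([X | _Tail], X).    % no warning: `_Tail' is a don't-care variable
head([X | _], X).        % no warning: `_' is an anonymous variable

any_two(_, _).           % the two `_' are distinct variables, so the
                         % arguments are not required to be equal
@end example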

@node Names
@section Names

@c In these paragraphs, we use @t instead of @samp or @code
@c to wrap examples as @t will not put quotes around its argument,
@c and we cannot expect readers to figure out which quotes
@c are subject matter and which are decoration.
A name token is either an unquoted name, a quoted name, a graphic name,
or a single semicolon character.
An unquoted name is a lowercase letter followed by zero or more letters,
underscores, and digits.
A quoted name is any sequence of zero or more characters
enclosed in single quotes (@t{'}).
Within a quoted name,
two adjacent single quotes stand for a single single quote.
Quoted names can also contain
backslash escapes of the same form as for strings.
A graphic name is a sequence of one or more of the following characters
@example
! & * + - : < = > ? @@ ^ ~ \ # $ . /
@end example

@noindent
where the first character is not @samp{#}.

As a special case, the character sequences @samp{<<u} and @samp{>>u}
are also graphic names.
(They are intended to denote left and right shifts
by unsigned amounts respectively.)

An unquoted name, graphic name, or semicolon
is treated as equivalent to a quoted name containing
the same sequence of characters.
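
For example, each of the first three lines below
shows two spellings of the same name,
while the remaining lines show names that can only be written in quoted form
(all of the names are arbitrary illustrations):

@example
foo  'foo'              % unquoted and quoted form of the same name
+  '+'                  % graphic and quoted form of the same name
;  ';'                  % semicolon and quoted form of the same name
'hello world'           % contains a space, so it must be quoted
'don''t'                % the name don't, with the quote doubled
'Foo'                   % a name, not a variable
@end example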

@node Literals
@section Literals

The different literals in Mercury are as follows.

@table @var
@item string
A string is a sequence of characters enclosed in double quotes (@code{"}).

Within a string, two adjacent double quotes stand for a single double quote.
For example, the string @samp{ """" } is a string of length one,
containing a single double quote:
the outermost pair of double quotes encloses the string,
and the innermost pair stand for a single double quote.

Strings may also contain backslash escapes.
@samp{\a} stands for ``alert'' (a beep character),
@samp{\b} for backspace,
@samp{\e} for escape,
@samp{\f} for form-feed,
@samp{\n} for newline,
@samp{\r} for carriage-return,
@samp{\t} for tab,
and @samp{\v} for vertical-tab.
An escaped backslash, single-quote, or double-quote stands for itself.

The sequence @samp{\x} introduces a hexadecimal escape;
it must be followed by a sequence of hexadecimal digits
and then a closing backslash.
It is replaced with the character
whose character code is identified by the hexadecimal number.
Similarly, a backslash followed by an octal digit
is the beginning of an octal escape;
as with hexadecimal escapes,
the sequence of octal digits must be terminated with a closing backslash.

The sequences @samp{\u} and @samp{\U} begin a Unicode escape.
@samp{\u} must be followed by the Unicode character code
expressed as four hexadecimal digits.
@samp{\U} must be followed by the Unicode character code
expressed as eight hexadecimal digits.
The highest allowed value is @samp{\U0010FFFF}.

A backslash followed immediately by a newline is deleted;
thus an escaped newline can be used to continue a string
over more than one source line.
(String literals may also contain embedded newlines.)

@item integer
An integer is either a decimal, binary, octal, hexadecimal,
or character-code literal.
A decimal literal is any sequence of decimal digits.
A binary literal is @samp{0b} followed by any sequence of binary digits.
An octal literal is @samp{0o} followed by any sequence of octal digits.
A hexadecimal literal is @samp{0x}
followed by any sequence of hexadecimal digits.
A character-code literal is @samp{0'} followed by any single character.

Decimal, binary, octal and hexadecimal literals
may be optionally terminated by a suffix
that indicates whether the literal represents a signed or unsigned integer
and what the size of that integer is.
These suffixes are:
@multitable {i_or_no_suffix} {Unsigned} {Implementation_defined}
@headitem Suffix @tab Signedness @tab Size
@item @code{i} or no suffix   @tab Signed   @tab Implementation-defined
@item @code{i8}               @tab Signed   @tab 8-bit
@item @code{i16}              @tab Signed   @tab 16-bit
@item @code{i32}              @tab Signed   @tab 32-bit
@item @code{i64}              @tab Signed   @tab 64-bit
@item @code{u}                @tab Unsigned @tab Implementation-defined
@item @code{u8}               @tab Unsigned @tab 8-bit
@item @code{u16}              @tab Unsigned @tab 16-bit
@item @code{u32}              @tab Unsigned @tab 32-bit
@item @code{u64}              @tab Unsigned @tab 64-bit
@end multitable

For decimal, binary, octal and hexadecimal literals,
an arbitrary number of underscores (@samp{_})
may be inserted between the digits.
An arbitrary number of underscores may also be inserted
between the radix prefix (i.e.@: @samp{0b}, @samp{0o} and @samp{0x})
and the initial digit.
Similarly, an arbitrary number of underscores
may be inserted between the final digit and the signedness suffix.
The purpose of the underscores is to improve readability;
they do not affect the numeric value of the literal.

@c TODO: we should support hexadecimal float literals too.
@item float
A floating point literal consists of a sequence of decimal digits,
a decimal point (@samp{.}) and a sequence of digits (the fraction part),
and the letter @samp{E} (or @samp{e}),
an optional sign (@samp{+} or @samp{-}),
and then another sequence of decimal digits (the exponent).
The fraction part or the exponent (but not both) may be omitted.

An arbitrary number of underscores (@samp{_})
may be inserted between the digits in a floating point literal.
Underscores may @emph{not} occur adjacent to any non-digit characters
(i.e.@: @samp{.}, @samp{e}, @samp{E}, @samp{+} or @samp{-})
in a floating point literal,
with one exception: underscores may occur between a digit
and an @samp{e} or @samp{E} that introduces the exponent part of the number.
The purpose of the underscores is to improve readability;
they do not affect the numeric value of the literal.

@item implementation-defined-literal
An implementation-defined literal
consists of a dollar sign (@samp{$}) followed by an unquoted name.

@end table
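
The following fragments sketch some of the literal forms described above.
They are illustrative only; the comments give the intended reading
of each literal, and @samp{$pred} is assumed to be one of the
implementation-defined literals that the implementation recognizes.

@example
"say ""hi"""            % the string: say "hi"
"a\tb\n"                % a, tab, b, newline
"\x41\BC"               % ABC (hexadecimal escape for `A')
"\101\BC"               % ABC (octal escape for `A')
"\u0041\U00000042C"     % ABC (Unicode escapes for `A' and `B')

42  0b101010  0o52  0x2a            % four spellings of forty-two
0'a                                 % the character code of `a'
255u8  127i8  1_000_000  0x_ff_u8   % suffixes and underscores

1.5  1e10  1.5e-3       % fraction only, exponent only, both
1_000.5  1.5_e3         % underscores between digits, and before `e'
1_.5  1._5              % NOT valid: underscore next to `.'

$pred                   % an implementation-defined literal
@end example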

@node Punctuation symbols
@section Punctuation symbols

The following punctuation symbols are used in Mercury's syntax.

@table @var
@item open-ct
A left parenthesis, @samp{(}, that is not preceded by whitespace.

@item open
A left parenthesis, @samp{(}, that is preceded by whitespace.

@item close
A right parenthesis, @samp{)}.

@item open-list
A left square bracket, @samp{[}.

@item close-list
A right square bracket, @samp{]}.

@item open-curly
A left curly bracket, @samp{@{}.

@item close-curly
A right curly bracket, @samp{@}}.

@item backquote
A backquote character, @samp{`}.

@item ht-sep
A ``head-tail separator'', i.e.@: a vertical bar, @samp{|}.

@item comma
A comma, @samp{,}.

@item end
A full stop (period), @samp{.}.

@end table
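
As a small sketch of how these symbols appear in practice,
the comments below list the tokens that make up each fragment
(the names @samp{f} and @samp{op} are arbitrary):

@example
f(X)            % name, open-ct, variable, close
[H | T]         % open-list, variable, ht-sep, variable, close-list
@{A, B@}          % open-curly, variable, comma, variable, close-curly
X `op` Y.       % variable, backquote, name, backquote, variable, end
@end example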

@node Operators
@section Operators

An operator is either a builtin operator or a user-defined operator.
A user-defined operator is a name,
module qualified name (@pxref{The module system}), or variable,
enclosed in backquotes (grave accents).
User-defined operators are left-associative infix operators
that bind more strongly than most other operators (see below).

The builtin operators, with the exception of comma, are all names,
and as such they can be used without arguments supplied.
For example, @samp{f(+)} is syntactically valid.
In some cases parentheses may be required
to limit the scope of an operator without arguments,
e.g.@: if it appears as an argument to another operator.
The comma operator is not a name and therefore requires single quotes
in order to be used without arguments.
An operator in single quotes is still an operator,
so any requirement for parentheses will remain unchanged.
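
The fragments below sketch these forms.
The names @samp{plus}, @samp{f} and @samp{g} are hypothetical,
and the reading of the backquoted forms as ordinary compound terms
anticipates the term normalization rules given later in this chapter.

@example
X `plus` Y              % the same term as plus(X, Y)
X `int.max` Y           % a module qualified name in backquotes
A `f` B `g` C           % left-associative: (A `f` B) `g` C
X * A `f` B             % X * (A `f` B): `f` binds more tightly than *
f(+)                    % the name + used as an argument
f(',')                  % comma must be quoted to be used as a name
@end example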

Operators are a syntactic concept.
The @samp{+} infix operator, for example, is only a symbol;
it does not mean addition, unless you write or import code
that defines it as addition.
Modules in the Mercury standard library,
such as @code{int}, @code{uint} and @code{float},
provide such arithmetic definitions.
Other, non-arithmetic definitions can also be provided.
For example,
the @samp{-} infix operator is defined as subtraction by those modules,
but as a pair constructor by the @code{pair} module.
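
In the following sketch the same @samp{-} operator
builds two terms of the same shape.
Which definition applies depends on the types involved,
assuming that the @code{int} and @code{pair} modules have both been imported
(the variable names are illustrative):

@example
Diff = 5 - 3            % if Diff is an int: subtraction
Pair = "five" - 3       % a pair of a string and an int
@end example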

The following table lists all of Mercury's builtin operators,
as well as user-defined operators of the form @code{`@var{op}`}.
Operators with a higher priority bind more tightly
than those with a lower priority.
(This is a recent change;
previously, Mercury followed the Prolog tradition
in using higher priorities to denote operators that bind @emph{less} tightly.)
For example, given that
@code{+} has priority 1000 and @code{*} has priority 1100,
the term @code{2 * X + Y} is parsed as @code{(2 * X) + Y}.
Note that the module qualification operator, @samp{.},
binds more tightly than any other operator.
Therefore, operator terms using builtin operators
need to be parenthesized in order to be module qualified.
For example,
integer subtraction can be written as @samp{int.(A - B)},
whereas pair construction can be written as @samp{pair.(A - B)}.
@xref{The module system}.

The ``Specifier'' field indicates
what structure terms constructed with an operator are allowed to take.
``f'' represents the operator and ``x'' and ``y'' represent arguments.
``x'' represents an argument
whose priority must be strictly higher than that of the operator.
``y'' represents an argument
whose priority is higher than or equal to that of the operator.
For example, ``yfx'' indicates a left-associative infix operator,
while ``xfy'' indicates a right-associative infix operator.

@example

Operator                        Specifier         Priority

.                               yfx               1490
!                               fx                1460
!.                              fx                1460
!:                              fx                1460
@@                               xfx               1410
^                               xfy               1401
^                               fx                1400
event                           fx                1400
:                               yfx               1380
`@var{op}`                            yfx               1380
**                              xfy               1300
-                               fx                1300
\                               fx                1300
*                               yfx               1100
/                               yfx               1100
//                              yfx               1100
<<                              yfx               1100
<<u                             yfx               1100
>>                              yfx               1100
>>u                             yfx               1100
div                             yfx               1100
mod                             xfx               1100
rem                             xfx               1100
for                             xfx               1000
+                               fx                1000
+                               yfx               1000
++                              xfy               1000
-                               yfx               1000
--                              yfx               1000
/\                              yfx               1000
\/                              yfx               1000
..                              xfx               950
:=                              xfx               850
=^                              xfx               850
<                               xfx               800
=                               xfx               800
=..                             xfx               800
=:=                             xfx               800
=<                              xfx               800
==                              xfx               800
=\=                             xfx               800
>                               xfx               800
>=                              xfx               800
@@<                              xfx               800
@@=<                             xfx               800
@@>                              xfx               800
@@>=                             xfx               800
\=                              xfx               800
\==                             xfx               800
~=                              xfx               800
is                              xfx               799
and                             xfy               780
or                              xfy               760
func                            fx                700
impure                          fy                700
pred                            fx                700
semipure                        fy                700
\+                              fy                600
not                             fy                600
when                            xfx               600
~                               fy                600
<=                              xfy               580
<=>                             xfy               580
=>                              xfy               580
all                             fxy               550
arbitrary                       fxy               550
atomic                          fxy               550
disable_warning                 fxy               550
disable_warnings                fxy               550
promise_equivalent_solutions    fxy               550
promise_equivalent_solution_sets fxy              550
promise_exclusive               fy                550
promise_exclusive_exhaustive    fy                550
promise_exhaustive              fy                550
promise_impure                  fx                550
promise_pure                    fx                550
promise_semipure                fx                550
require_complete_switch         fxy               550
require_switch_arms_det         fxy               550
require_switch_arms_semidet     fxy               550
require_switch_arms_multi       fxy               550
require_switch_arms_nondet      fxy               550
require_switch_arms_cc_multi    fxy               550
require_switch_arms_cc_nondet   fxy               550
require_switch_arms_erroneous   fxy               550
require_switch_arms_failure     fxy               550
require_det                     fx                550
require_semidet                 fx                550
require_multi                   fx                550
require_nondet                  fx                550
require_cc_multi                fx                550
require_cc_nondet               fx                550
require_erroneous               fx                550
require_failure                 fx                550
trace                           fxy               550
try                             fxy               550
some                            fxy               550
,                               xfy               500
&                               xfy               475
->                              xfy               450
;                               xfy               400
or_else                         xfy               400
then                            xfx               350
if                              fx                340
else                            xfy               330
::                              xfx               325
==>                             xfx               325
where                           xfx               325
--->                            xfy               321
catch                           xfy               320
type                            fx                320
solver                          fy                319
catch_any                       xfy               310
end_module                      fx                301
import_module                   fx                301
include_module                  fx                301
initialise                      fx                301
initialize                      fx                301
finalise                        fx                301
finalize                        fx                301
inst                            fx                301
instance                        fx                301
mode                            fx                301
module                          fx                301
pragma                          fx                301
promise                         fx                301
rule                            fx                301
typeclass                       fx                301
use_module                      fx                301
-->                             xfx               300
:-                              fx                300
:-                              xfx               300
?-                              fx                300

@end example
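
As a rough guide to reading the table,
the fragments below show how some terms group
according to the listed priorities and specifiers.
The variable names are arbitrary,
and the right-hand column is a sketch of the implied parenthesization.

@example
2 * X + Y               % (2 * X) + Y        1100 binds tighter than 1000
X = A + B               % X = (A + B)        1000 binds tighter than 800
A - B - C               % (A - B) - C        yfx: left-associative
A ++ B ++ C             % A ++ (B ++ C)      xfy: right-associative
A = B = C               % syntax error       xfx: non-associative
not not P               % not (not P)        fy: argument may be `not'
H :- A, B ; C           % H :- ((A, B) ; C)  500, then 400, then 300
C -> T ; E              % (C -> T) ; E       450 binds tighter than 400
@end example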

@node Terms
@section Terms

Terms are the basic construct used in Mercury syntax.
The term syntax is summarized by the following rules.
(All of this information can be found in the descriptions below the rules.)

@display
@var{term} = @var{core-term} | @var{special-term}

@var{core-term} = @var{variable} | @var{literal} | @var{functor-term}

@var{literal} = @var{string} | @var{integer} | @var{float} | @var{implementation-defined-literal}

@var{functor-term} = @var{name} | @var{name} @var{open-ct} @var{functor-args} @var{close}

@var{functor-args} = @var{functor-arg} | @var{functor-arg} @samp{,} @var{functor-args}

@c The following definition of functor arg is more restrictive than our
@c implementation, which allows operators with priority < 500 other
@c than the '::' which is explicitly allowed here. In future we may
@c raise the priority of '::' and implement the restriction; for this
@c reason, we impose the restriction now.
@c
@c A similar situation applies to list elements and the arguments to
@c apply terms. For tuples we already implement the restriction.
@c
@var{functor-arg} = @var{arg} | @var{arg} @samp{::} @var{arg}

@var{args} = @var{arg} | @var{arg} @samp{,} @var{args}

@var{arg} = @var{term}, where the term is not an operator term with priority <= 500

@var{special-term} = @var{operator-term} | @var{list-term} | @var{tuple-term} | @var{apply-term} | @var{paren-term}

@var{operator-term} = @var{term} @var{operator} @var{term} | @var{operator} @var{term} | @var{operator} @var{term} @var{term},
    where the term is constructed according to the requirements of the operator
    (@pxref{Operators})

@var{list-term} = @samp{[} @var{list-body}? @samp{]}

@var{list-body} = @var{arg} | @var{arg} @samp{,} @var{list-body} | @var{arg} @var{ht-sep} @var{term}

@var{tuple-term} = @samp{@{} @var{args}? @samp{@}}

@var{apply-term} = @var{term} @var{open-ct} @var{args} @var{close},
    where the term is not a name or operator term

@var{paren-term} = @samp{(} @var{term} @samp{)}
@end display

@noindent
Terms can be described in the following way.

@table @var
@item term
A term is either a @var{core-term} or a @var{special-term}.
A term normalization procedure, given below,
translates terms that may contain special terms
into terms that are only constructed from core terms;
two terms are considered syntactically equivalent
if they translate to the same term.
Syntactically equivalent terms can be used interchangeably
anywhere in a module
(e.g.@: operator syntax can be used in declarations and clauses,
in particular those that define an operator).

Note that there can be further equivalences in some contexts,
e.g.@: an if-then-else can be written in either of two equivalent forms.
Such equivalences will be covered in the relevant chapters.

@item core-term
A core term is a @var{variable}, a @var{literal}, or a @var{functor-term}.

@item literal
A literal is
a @var{string},
an @var{integer},
a @var{float},
or an @var{implementation-defined-literal}.

@item functor-term
A functor term is either a name or a compound term.
A compound term is a name followed without any intervening whitespace
by an open parenthesis (i.e.@: an @var{open-ct} token),
then followed by a functor argument list and a close parenthesis.
E.g., @samp{foo(X,Y)} is a compound term,
whereas @samp{foo (X,Y)} and @samp{foo()} are not
(the first because the space after @samp{foo} is not allowed,
the second because the parentheses must be omitted if there are no arguments).

The @dfn{principal functor} of a functor term
is the name and arity of the term, separated by a slash,
where the arity is the number of arguments
(or zero if there are no arguments).
For example, the principal functor of @samp{foo(bar,baz)} is @samp{foo/2},
while the principal functor of @samp{foo} is @samp{foo/0}.
The principal functor of a special term is determined
@emph{after} term normalization.
For module qualified terms,
the principal functor is defined slightly differently
(@pxref{The module system}).

Note that the word ``functor'' has a number of definitions,
but in Mercury it just means
a symbol to which arguments can be applied,
and which has no intrinsic meaning of its own.
It is a syntactic concept that applies to all functor terms.
In specific contexts functors may also be referred to as
type constructors, data constructors (or just constructors),
predicates, functions, etc.
The principal functor may also be referred to
as the ``top-level constructor''.

@item functor-args
A functor argument list is a sequence of one or more functor arguments,
separated by commas.

@item functor-arg
A functor argument is either a single argument
or two arguments separated by a @samp{::} operator
(the latter form is for mode qualifiers; @pxref{Different clauses for different modes}).

@item args
An argument list is a sequence of one or more arguments,
separated by commas.

@item arg
An argument is any term, except operator terms where
the operator does not bind more tightly than comma
(i.e.@: where the priority is greater than or equal to 1000).
In such a situation parentheses can be used,
e.g.@: @samp{f((A,B))} is a compound term
with one argument that is a parenthesized operator term,
whereas @samp{f(A,B)} is a compound term
with two arguments (and no operators).

@item special-term
A special term is an operator term, a list term, a tuple term,
an apply term, or a parenthesized term.
The term normalization procedure, below, defines how these terms
are represented internally as core terms.

@item operator-term
An operator term is a term constructed using an operator,
which complies with the rules for constructing terms using that operator
(@pxref{Operators}).
Operator terms can be infix, such as @samp{A + B},
unary-prefix, such as @samp{not P},
or binary-prefix, such as @samp{some Vars Goal}.

@item list-term
A list term is an open square bracket (an @var{open-list} token),
followed by an optional list body,
followed by a close square bracket (a @var{close-list} token).
If the list body is omitted it is the empty list.
If present, the list body is an argument list,
optionally followed by a vertical bar (a @var{ht-sep} token)
followed by a term.
E.g., @samp{[]}, @samp{[X]}, and @samp{[1, 2 | Tail]}
are all list terms.
The argument list gives the elements appearing at the front of the list.
The term following the vertical bar, if present,
gives the @dfn{tail} of the list (i.e.@: the remaining elements),
otherwise the tail is the empty list.
Note that technically the tail does not have to be a list
for this to be syntactically valid,
although generally it would need to be in order to be type correct.

@item tuple-term
A tuple term is an open curly bracket (an @var{open-curly} token),
followed by an optional argument list,
followed by a close curly bracket (a @var{close-curly} token).
If the argument list is omitted it is the empty tuple,
otherwise the arguments give the components of the tuple.
E.g., @code{@{@}} and @code{@{1,'2',"three"@}} are tuple terms.

@item apply-term
An apply-term is a ``closure'' term,
which can be any term other than a name or an operator term,
followed without any intervening whitespace
by an open parenthesis (an @var{open-ct} token),
an argument list, and a close parenthesis (a @var{close} token).
E.g., @samp{A(B,C)} is an apply-term.
An apply-term represents the closure (i.e.@: a higher-order value)
applied to the arguments.

Note that although the closure term cannot be an operator term,
it @emph{can} be a parenthesized term.
Thus @samp{(Var ^ foo)(Arg1, Arg2)} is a valid apply-term,
whereas @samp{Var ^ foo(Arg1, Arg2)} is not
(it is an operator term whose second argument is a compound term).

@item paren-term
A parenthesized term is just a term enclosed in parentheses.
E.g., @code{(X-Y)} is a parenthesized term.

@end table

@noindent
The term normalization procedure works by rewriting special terms
that occur anywhere within a term
(i.e.@: at the top level or as some descendant)
according to a set of rewriting rules,
and repeating until no rules can be further applied.
The rules are as follows.

@example
@var{term1} `@var{name}` @var{term2} @expansion{} @var{name}(@var{term1}, @var{term2})
@var{term1} `@var{var}` @var{term2} @expansion{} @var{var}(@var{term1}, @var{term2})
@var{term1} @var{operator} @var{term2} @expansion{} '@var{operator}'(@var{term1}, @var{term2})
@var{operator} @var{term} @expansion{} '@var{operator}'(@var{term})
@var{operator} @var{term1} @var{term2} @expansion{} '@var{operator}'(@var{term1}, @var{term2})

[ ] @expansion{} '[]'
[ @var{arg} ] @expansion{} '[|]'(@var{arg}, '[]')
[ @var{arg} , @var{list-body} ] @expansion{} '[|]'(@var{arg}, [@var{list-body}])
[ @var{arg} | @var{term} ] @expansion{} '[|]'(@var{arg}, @var{term})

@{ @} @expansion{} '@{@}'
@{ @var{args} @} @expansion{} '@{@}'(@var{args})

@var{term}(@var{args}) @expansion{} ''(@var{term}, @var{args})

( @var{term} ) @expansion{} @var{term}
@end example

@noindent
For example, the following terms are all syntactically equivalent
(i.e.@: they are equal after term normalization).
The last is constructed from core terms;
the others all normalize to this term.
It also shows that
the principal functor of all of them is @t{'[|]'/2}.
@example
[1, 2, 3]
[1, 2, 3 | []]
[1, 2 | [3]]
[1 | [2, 3]]
'[|]'(1, '[|]'(2, '[|]'(3, '[]')))
@end example

@noindent
Similarly, the following terms are all syntactically equivalent.
The principal functor in this case is @t{'+'/2}.
@example
A * B + C
(A * B) + C
'+'('*'(A, B), C)
@end example
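
@noindent
As a further illustration
(these particular terms are our own,
chosen only to exercise the remaining rules),
each line below shows a special term
followed by the core term that the rules above rewrite it to.
@example
Xs `append` Ys      @expansion{} append(Xs, Ys)
not P               @expansion{} 'not'(P)
@{1, "two"@}          @expansion{} '@{@}'(1, "two")
F(X, Y)             @expansion{} ''(F, X, Y)
(Var ^ foo)(Arg)    @expansion{} ''('^'(Var, foo), Arg)
@end example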

@node Items
@section Items

Mercury modules are parsed as a sequence of @dfn{items}.
Each item is a term followed by an @var{end} token (a period).
If the principal functor of the term is @samp{:-/1},
it is a declaration item and the argument is the @dfn{declaration}.
Otherwise it is a clause item
and the term is the @dfn{clause}.
Note that we often use ``declaration'' and ``clause'' informally
to refer to the items themselves
(i.e.@: including the @var{end} token).

Declarations are used in relation to a number of features.
Details of their syntax are covered in the relevant chapters.

A clause provides part of the definition of a function or predicate,
and takes one of the following forms.
The first form is a DCG-rule and is not discussed further here
(@pxref{Definite clause grammars}).

@example
@var{DCG_Head} --> @var{DCG_Body}.

@var{Head} :- @var{Body}.
@var{Head}.
@end example

@noindent
@var{Head} is the @dfn{head} of the clause
and @var{Body}, if present, is the @dfn{body} of the clause.
If the principal functor is @samp{:-/2},
the clause is a @dfn{rule} and the body is a goal
(@pxref{Goals}).
If the principal functor is not
@samp{:-/1}, @samp{:-/2}, or @samp{-->/2},
the clause is a @dfn{fact}.
A fact is equivalent to a rule that has the same head
and a body of @samp{true}.

A clause head takes one of the following forms.

@example
@var{FunctorTerm} = @var{Result}
@var{FunctorTerm}
@end example

@noindent
@code{@var{FunctorTerm}} is a functor term
whose arguments are expressions
(@pxref{Expressions}),
optionally annotated with mode qualifiers
(@pxref{Different clauses for different modes}).
If the principal functor of the head is @samp{=/2},
then the clause is a function rule or a function fact,
and @code{@var{Result}} is an expression,
optionally annotated with a mode qualifier.
Otherwise,
the clause is a predicate rule or a predicate fact.
The principal functor of @code{@var{FunctorTerm}}
determines which function or predicate is being defined.

@noindent
For example, the following three items are clauses.
The first is a function fact that defines a function named @samp{loop/1},
a not particularly useful function.
The second is a predicate fact and the third is a predicate rule,
which between them define a predicate named @samp{append/3}.

@example
loop(X) = 1 + loop(X).

append([], Bs, Bs).
append([X | As], Bs, [X | Cs]) :-
    append(As, Bs, Cs).
@end example

@noindent
The following example contains a number of declaration and clause items,
and forms a syntactically valid module.
(The semantics of the clauses will be covered in the next chapter.
Note that the @code{length/1} function in the standard library
is implemented more efficiently.)

@example
:- module slow_length.
:- interface.
:- import_module list.

:- func length(list(T)) = int.

:- implementation.
:- import_module int.       % for '+'

length([]) = 0.
length([_ | Xs]) = 1 + length(Xs).

:- end_module slow_length.
@end example

@node Clauses
@chapter Clauses

This chapter covers the semantics of Mercury clauses,
and the goals and expressions they contain.
The first section gives an informal overview of the semantics;
if you are already familiar with logic programming
you may wish to skip it and start with @ref{Goals}.

Full details of the language semantics
can be found in @ref{Formal semantics}.

@menu
* Overview of Mercury semantics::
* Goals::
* Expressions::
* State variables::
* Variable scoping::
* Implicit quantification::
* Elimination of double negation::
* Definite clause grammars::
@end menu

@node Overview of Mercury semantics
@section Overview of Mercury semantics

There is no agreed-upon definition of ``declarative programming''.
One notable characteristic of Mercury as a declarative language, however,
is that it has both a @dfn{declarative} and an @dfn{operational} semantics.
The declarative semantics is conceptually the simpler of the two:
it is only concerned with the relationship between inputs and outputs,
and not the steps taken to execute a program.
The operational semantics is additionally concerned with these steps.
This is often expressed by saying that
the declarative semantics is about ``what''
whereas the operational semantics is about ``how''.

In the remainder of this section we introduce
each of these semantics.

@subheading Declarative semantics

The declarative semantics is concerned with ``truth''.
For example, it is true that 1 plus 1 is 2,
and that the length of the list [1, 2, 3] is 3.
Statements that are either true or false like this
are known as @dfn{propositions},
e.g., 1 + 1 = 2 and 1 + 2 = 5 are both propositions;
if + is interpreted as integer addition
then the first proposition is true and the second is false.

Mercury clauses state things that are true about
the function or predicate being defined.
To illustrate we will use an example from the previous chapter.
(Note that, here and below,
some declarations would need to be added to make this compile.)

@example
length([]) = 0.
length([_ | Xs]) = 1 + length(Xs).
@end example

@noindent
Both of these clauses are facts about the function @code{length/1}.
The first simply states that the length of an empty list is zero.
The second states that no matter what expressions we substitute for
the variables @samp{Xs} and @samp{_},
the length of @samp{[_ | Xs]} will be one greater than
the length of @samp{Xs}.
In other words, the length of a non-empty list
is one greater than the length of its tail.

These two statements are true according to our intuitive idea of length.
Furthermore, we can see that the clauses cover every possible list,
since every list is either empty or non-empty,
and every non-empty list has a tail that is also a list.
Perhaps surprisingly,
this is enough to conclude that
our implementation of list length is correct,
at least as far as its arguments and return values are concerned.

As another example,
the following clauses define a predicate named @code{append/3},
which is intended to be true if
the third argument is the list that results from appending
the first and second arguments.
(Equivalently, we could say that it is intended to be true if
it is possible to split the list in the third argument
to produce the first and second arguments.)

@example
append([], Bs, Bs).
append([X | As], Bs, [X | Cs]) :-
    append(As, Bs, Cs).
@end example

@noindent
The first clause is a fact stating that
if you append the empty list and any other list,
the result will be the same as that other list.
The second clause is a rule;
these are taken as logical implications
in which the body implies the head
(i.e.@: @samp{:-} is interpreted as reverse implication).
So this is stating that, for any substitution,
if @samp{Cs} is the result of appending @samp{As} and @samp{Bs},
then @samp{[X | Cs]} is the result of appending @samp{[X | As]} and @samp{Bs}.

Again, both clauses are true according to
the @dfn{intended interpretation},
which is defined as all of the propositions about
the functions and predicates in the program
that the programmer intends to be true.
And the definition is @dfn{complete},
meaning that for every proposition that is intended to be true
there is either a fact that covers it,
or a rule whose head covers it and (under the same substitution)
whose body is intended to be true.
Thus we can conclude in a similar way to above that our code is correct.

The declarative semantics of a Mercury program is defined as
all of the propositions that can be inferred to be true
from the clauses of the program
(with some additional axioms).
If the program is producing incorrect output,
this implies that there is a difference between
the declarative semantics and the intended interpretation.
From the above discussion,
there must be some clause that is false in the intended interpretation,
or some definition that is incomplete.

This is the reason for having a declarative semantics.
Despite it being relatively simple---you only need
to know about which propositions are true and which are false,
and not how the program actually executes---it is
still effective in reasoning about your program,
even so far as to be able to localize a bug observed in the output
down to individual clauses or definitions.

@subheading Operational semantics

The declarative semantics does not tell us
whether our program will terminate, for example,
or what its computational complexity is.
For that we need the operational semantics,
which tells us how the program will be executed.

Execution in Mercury starts with a @dfn{goal},
which is a proposition that may contain some variables.
The aim of execution is to find a substitution
for which the proposition is true.
If execution finds such a substitution, we refer to this as @dfn{success},
and we refer to the substitution that was found as a @dfn{solution}.
If execution determines that there are no such substitutions,
we refer to this as @dfn{failure}.

Say, for example,
we start with a goal of @samp{N = length([1, 2])}.
Function evaluation is strict,
depth-first, and left-to-right,
so we want to call the @samp{length/1} function first.
To do this, we match the argument with
the heads of the clauses that define the function
to find the clause that is applicable.
In this case the second clause matches,
with the substitution of @code{Xs @expansion{} [2]}
(the substitution for @samp{_} is irrelevant,
since any other occurrence of @samp{_}
is considered a distinct variable).
Applying this substitution to the body
then replacing it in the goal gives us a new goal,
namely @samp{N = 1 + length([2])}.

Repeating this process a second time
gives us the goal @samp{N = 1 + 1 + length([])}.
When we call the function the third time
it will match the @emph{first} clause,
and the new goal will be @samp{N = 1 + 1 + 0}.
At this point we can evaluate the @samp{+/2} calls
and get a result of @samp{N = 2}.
It is trivial to find a substitution that makes this proposition true:
just map @code{N} to the literal @code{2}.

Now consider the goal @samp{append(As, Bs, [1])}.
In this case the first two arguments are @dfn{free},
meaning that they are variables that
are not mapped to anything in the current substitution,
and the third argument is @dfn{ground},
meaning that it does not contain any variables
after applying the current substitution.
In this case when we try to match (or @dfn{unify})
the goal with a clause,
both clauses match.
We arbitrarily pick the first one,
but we also push a @dfn{choice point} onto a stack,
which will allow us to return to this point later on
and try the other clause if we need to.
Matching with the first clause gives us
the substitution @code{As @expansion{} [], Bs @expansion{} [1]};
since this clause is a fact,
we succeed with this substitution as our solution.

If a later goal fails,
we pop the previous choice point off the stack
in order to search for a different solution.
This time we want to try unifying our goal with
the head of the second clause,
that is, we want to find a substitution such that

@example
append(As, Bs, [1]) = append([X1 | As1], Bs1, [X1 | Cs1])
@end example

@noindent
(the variables from the clause have been given a numerical suffix,
which is to indicate that they came from a different scope
and are not the same variables as those in the goal).
The substitution we use is
@code{As @expansion{} [1 | As1], Bs @expansion{} Bs1, X1 @expansion{} 1, Cs1 @expansion{} []};
you can check that this does indeed unify the two terms.
Note that information is effectively flowing in both directions:
variables from both the goal and the clause
(i.e.@: the caller and callee) are bound by this substitution.
This is different from pattern matching in many other languages,
in which only variables in the pattern are bound.

Applying this substitution to the body of the selected clause
gives us our new goal,
@samp{append(As1, Bs1, [])}.
Only the first clause matches,
with the substitution of @code{As1 @expansion{} [], Bs1 @expansion{} []},
and the clause is a fact,
so this is a solution to @emph{this} call to append.
To find the solution to the parent goal we compose this substitution
with the one from before,
giving @code{As @expansion{} [1], Bs @expansion{} []}.
We have now found two solutions for our goal:
one with @code{As} being the empty list and @code{Bs} being @code{[1]},
and the other with @code{As} being @code{[1]}
and @code{Bs} being the empty list.
These are all of the possible solutions;
if we wanted to search for another
we would find the choice point stack to be empty,
hence we would fail.
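
@noindent
As a sketch of how this search can be observed from a program,
the module below collects every solution of the goal discussed above.
It is illustrative only:
the module name @samp{split_demo} is our own,
and it relies on the standard library's @code{list.append/3}
(rather than the clauses shown earlier)
and on @code{solutions/2} from the @code{solutions} module.

@example
:- module split_demo.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module list.
:- import_module pair.
:- import_module solutions.

main(!IO) :-
    % Each solution of append(As, Bs, [1]) is returned as a pair As - Bs.
    solutions(
        ( pred(Split::out) is multi :-
            list.append(As, Bs, [1]),
            Split = As - Bs
        ),
        Splits),
    % Splits should hold the two solutions derived above:
    % As = [] with Bs = [1], and As = [1] with Bs = [].
    io.write_line(Splits, !IO).
@end example

@noindent
Here @code{solutions/2} drives the backtracking:
it keeps asking for another solution
until the choice point stack is exhausted.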

@node Goals
@section Goals

A goal is a term that takes one of the following forms.

@table @code
@item @var{Call}
Any goal which is a functor term
that does not match any of the other forms below
is a first-order predicate call.
The principal functor determines the predicate called,
which must be visible (@pxref{Modules}).
The arguments, if present, are expressions.

@item call(Closure)
@itemx call(Closure1, Arg1)
@itemx call(Closure2, Arg1, Arg2)
@itemx call(Closure3, Arg1, Arg2, Arg3)
@itemx @dots{}
@itemx Closure
@itemx Closure1(Arg1)
@itemx Closure2(Arg1, Arg2)
@itemx Closure3(Arg1, Arg2, Arg3)
@itemx @dots{}
A higher-order predicate call.
The closure and arguments are expressions.
@samp{call(Closure)} and @samp{Closure} just call the specified closure.
The other forms append the specified arguments
onto the argument list of the closure before calling it.
A higher-order predicate call written using
an apply term with @code{N} arguments
is equivalent to the form using @code{call/N+1}.
@xref{Higher-order}.
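
As a small sketch
(the predicate name @samp{apply_twice} is hypothetical),
the two calls in the clause below use the two syntaxes
for the same kind of higher-order call.

@example
:- pred apply_twice(pred(int, int), int, int).
:- mode apply_twice(pred(in, out) is det, in, out) is det.

apply_twice(P, X0, X) :-
    call(P, X0, X1),    % written using call/3
    P(X1, X).           % written as an apply term with two arguments
@end example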

@item @var{Expr1} = @var{Expr2}
A unification.
@var{Expr1} and @var{Expr2} are expressions.

@item !@var{Var} ^ @var{field_list} := @var{Expr}
A state variable field update.
@xref{State variables}.

@item @var{Goal1}, @var{Goal2}
A conjunction.
@var{Goal1} and @var{Goal2} are goals.

@item @var{Goal1} & @var{Goal2}
A parallel conjunction.
@var{Goal1} and @var{Goal2} are goals.
This has the same declarative semantics as normal conjunction.
Operationally,
implementations may execute @samp{@var{Goal1} & @var{Goal2}} in parallel.
The order in which parallel conjuncts begin execution is not fixed.
It is an error for @var{Goal1} or @var{Goal2}
to have a determinism other than @code{det} or @code{cc_multi}.
@xref{Determinism categories}.
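
As a sketch
(the names @samp{fib} and @samp{fib_pair} are hypothetical,
and the @code{int} module is assumed to be imported),
the two conjuncts below are both @code{det}
and neither depends on a variable bound by the other,
so an implementation may evaluate them in parallel.

@example
:- func fib(int) = int.

fib(N) = ( if N < 2 then N else fib(N - 1) + fib(N - 2) ).

:- pred fib_pair(int::in, int::in, int::out, int::out) is det.

fib_pair(M, N, FibM, FibN) :-
    ( FibM = fib(M)
    & FibN = fib(N)
    ).
@end example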

@item @var{Goal1} ; @var{Goal2}
A disjunction,
where @var{Goal1} is not of the form
@samp{@var{Goal1a} -> @var{Goal1b}}
(since that would make it an if-then-else, below).
@var{Goal1} and @var{Goal2} are goals.

@item true
The empty conjunction.
Always succeeds exactly once.

@item fail
The empty disjunction.
Always fails.

@item if @var{CondGoal} then @var{ThenGoal} else @var{ElseGoal}
@itemx @var{CondGoal} -> @var{ThenGoal} ; @var{ElseGoal}
An if-then-else.
The two different syntaxes have identical semantics.
@var{CondGoal}, @var{ThenGoal}, and @var{ElseGoal} are goals.
Note that the ``else'' part is @emph{not} optional.

The declarative semantics of an if-then-else is given by
@code{( @var{CondGoal}, @var{ThenGoal} ; not(@var{CondGoal}), @var{ElseGoal} )},
but the operational semantics is different,
and it is treated differently for the purposes of determinism inference
(@pxref{Determinism}).
Operationally, it executes the @var{CondGoal},
and if that succeeds, then execution continues with the @var{ThenGoal};
otherwise, i.e.@: if @var{CondGoal} fails without producing any solutions,
it executes the @var{ElseGoal}.
Note that @var{CondGoal} can be nondeterministic---if the @var{CondGoal}
succeeds more than once then
the @var{ThenGoal} is executed once for each of the solutions.

If @var{CondGoal} is an explicit existential quantification,
@code{some @var{Vars} @var{QuantifiedCondGoal}}, then the variables @var{Vars}
are existentially quantified over the conjunction of the goals
@var{QuantifiedCondGoal} and @var{ThenGoal}
(see existential quantifications, below).
Explicit existential quantifications that occur as subgoals of @var{CondGoal}
do @emph{not} affect the scope of variables in the ``then'' part.
For example, in
@example
   ( if some [V] @var{C} then @var{T} else @var{E} )
@end example
@noindent
the variable @var{V} is quantified
over the conjunction of the goals @var{C} and @var{T}
because the top-level goal of the condition
is an explicit existential quantification,
but in
@example
   ( if true, some [V] @var{C} then @var{T} else @var{E} )
@end example
@noindent
the variable @var{V} is only quantified over @var{C}
because the top-level goal of the condition
is not an explicit existential quantification.
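
As a sketch of the two syntaxes
(the function names are hypothetical,
and the @code{int} module is assumed to be imported),
the following two definitions have identical semantics.

@example
:- func abs_diff(int, int) = int.

abs_diff(X, Y) = Z :-
    ( if X >= Y then
        Z = X - Y
    else
        Z = Y - X
    ).

:- func abs_diff_alt(int, int) = int.

abs_diff_alt(X, Y) = Z :-
    ( X >= Y ->
        Z = X - Y
    ;
        Z = Y - X
    ).
@end example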

@item not @var{Goal}
@itemx \+ @var{Goal}
A negation.
@var{Goal} is a goal.
The two different syntaxes have identical semantics:
both forms are operationally equivalent to
@samp{if @var{Goal} then fail else true}.

@item @var{Expr1} \= @var{Expr2}
An inequality.
@var{Expr1} and @var{Expr2} are expressions.
This is an abbreviation for @samp{not (@var{Expr1} = @var{Expr2})}.

@item try @var{Params} @var{Goal} @dots{} catch @var{Expr} -> @var{CGoal} @dots{}
A try goal.
Exceptions thrown during the execution of @var{Goal} may be caught and handled.
A summary of the try goal syntax is:

@example
@group
    try @var{Params} @var{Goal}
    then @var{ThenGoal}
    else @var{ElseGoal}
    catch @var{Expr} -> @var{CatchGoal}
    @dots{}
    catch_any @var{CatchAnyVar} -> @var{CatchAnyGoal}
@end group
@end example

See @ref{Exception handling} for the full details.

@item some @var{Vars} @var{Goal}
An existential quantification.
@var{Goal} is a goal
and @var{Vars} is a list
whose elements are either variables or state variables
(a single list may contain both).
The case where there are state variables
is described in @ref{State variables};
here we discuss the case where they are all plain variables.

Each existential quantification introduces a new scope.
The variables in @var{Vars} are local to the goal @var{Goal}:
for each variable named in @var{Vars},
any occurrences of variables with that name in @var{Goal}
are considered to name a different variable
than any variables with the same name
that occur outside of the existential quantification.

Operationally, existential quantification has no effect,
so apart from its effect on variable scoping,
@samp{some @var{Vars} @var{Goal}} is the same as @samp{@var{Goal}}.

Mercury's rules for implicit quantification (@pxref{Implicit quantification})
mean that variables are often implicitly existentially quantified.
There is usually no need to write existential quantifiers explicitly.

@item all @var{Vars} @var{Goal}
A universal quantification.
@var{Goal} is a goal
and @var{Vars} is a list of variables
(they may @emph{not} be state variables).
This goal is an abbreviation for @samp{not (some @var{Vars} not @var{Goal})}.

@item @var{Goal1} => @var{Goal2}
An implication.
@var{Goal1} and @var{Goal2} are goals.
This is an abbreviation for @samp{not (@var{Goal1}, not @var{Goal2})}.
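
Implication is commonly combined with universal quantification.
As a sketch
(the predicate name @samp{all_positive} is hypothetical,
and the @code{int} and @code{list} modules are assumed to be imported),
the clause below succeeds if every member of @samp{Xs} is greater than zero.

@example
:- pred all_positive(list(int)::in) is semidet.

all_positive(Xs) :-
    % Read as: for all X, if X is a member of Xs then X > 0.
    all [X] ( list.member(X, Xs) => X > 0 ).
@end example

@noindent
By the abbreviations given here and for @code{all},
the body stands for
@samp{not (some [X] not (not (list.member(X, Xs), not X > 0)))}.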

@item @var{Goal1} <= @var{Goal2}
A reverse implication.
@var{Goal1} and @var{Goal2} are goals.
This is an abbreviation for @samp{not (@var{Goal2}, not @var{Goal1})}.

@item @var{Goal1} <=> @var{Goal2}
A logical equivalence.
@var{Goal1} and @var{Goal2} are goals.
This is an abbreviation for
@samp{(@var{Goal1} => @var{Goal2}), (@var{Goal1} <= @var{Goal2})}.

@item promise_pure @var{Goal}
A purity cast.
@var{Goal} is a goal.
This goal promises that @var{Goal} implements a pure interface,
even though it may include impure and semipure components.

@item promise_semipure @var{Goal}
A purity cast.
@var{Goal} is a goal.
This goal promises that @var{Goal} implements a semipure interface,
even though it may include impure components.

@item promise_impure @var{Goal}
A purity cast.
@var{Goal} is a goal.
This goal instructs the compiler to treat @var{Goal} as though it were impure,
regardless of its actual purity.

@item promise_equivalent_solutions @var{Vars} @var{Goal}
A determinism cast.
@var{Vars} is a list of variables
and @var{Goal} is a goal.
This goal promises that @var{Vars}
is the set of variables bound by @var{Goal},
and that while @var{Goal} may have more than one solution,
all of these solutions are equivalent
with respect to the equality theories of the variables in @var{Vars}.
It is an error for @var{Vars} to include a variable not bound by @var{Goal}
or for @var{Goal} to bind a non-local variable
that is not listed in @var{Vars}
(non-local variables with inst @code{any} are assumed to be further constrained
by @var{Goal} and must also be included in @var{Vars}).
If @var{Goal} has determinism @code{multi} or @code{cc_multi} then
@code{promise_equivalent_solutions @var{Vars} @var{Goal}}
has determinism @code{det}.
If @var{Goal} has determinism @code{nondet} or @code{cc_nondet} then
@code{promise_equivalent_solutions @var{Vars} @var{Goal}}
has determinism @code{semidet}.

@item promise_equivalent_solution_sets @var{Vars} @var{Goal}
A determinism cast,
of the kind performed by @code{promise_equivalent_solutions},
on any goals of the form
@code{arbitrary @var{ArbVars} @var{ArbGoal}} inside @var{Goal},
of which there should be at least one.
@c XXX "should" or "must"?
@var{Vars} and @var{ArbVars} must be lists of variables,
and @var{Goal} and @var{ArbGoal} must be goals.
@var{Vars} must be the set of variables bound by @var{Goal},
and @var{ArbVars} must be the set of variables bound by @var{ArbGoal}.
It is an error for @var{Vars} to include a variable not bound by @var{Goal}
or for @var{Goal} to bind a non-local variable
that is not listed in @var{Vars},
and similarly for @var{ArbVars} and @var{ArbGoal}.
The intersection of @var{Vars} and the @var{ArbVars} list
of any @code{arbitrary @var{ArbVars} @var{ArbGoal}} goal
included inside @var{Goal} must be empty.

The overall @code{promise_equivalent_solution_sets} goal promises that
the set of solutions computed for @var{Vars} by @var{Goal}
is not influenced by which of the possible solutions
for @var{ArbVars} is computed by each @var{ArbGoal};
while different choices of solutions for some of the @var{ArbGoal}s
may lead to syntactically different solutions for @var{Vars} for @var{Goal},
all of these solutions are equivalent
with respect to the equality theories of the variables in @var{Vars}.
If an @var{ArbGoal} has determinism @code{multi} or @code{cc_multi} then
@code{arbitrary @var{ArbVars} @var{ArbGoal}} has determinism @code{det}.
If @var{ArbGoal} has determinism @code{nondet} or @code{cc_nondet} then
@code{arbitrary @var{ArbVars} @var{ArbGoal}} has determinism @code{semidet}.
@var{Goal} itself may have any determinism.

There is no requirement that given one of the @var{ArbGoal}s,
all its solutions must be equivalent with respect to the equality theories
of the corresponding @var{ArbVars};
in fact, in typical usage, this won't be the case.
The different solutions of the nested @code{arbitrary} goals
are not required to be equivalent in any context
except the @code{promise_equivalent_solution_sets} goal they are nested inside.

@item arbitrary @var{ArbVars} @var{ArbGoal}
Goals of this form are only allowed to occur inside
@code{promise_equivalent_solution_sets @var{Vars} @var{Goal}} goals.
See the preceding description for details.

@item require_det @var{Goal}
@itemx require_semidet @var{Goal}
@itemx require_multi @var{Goal}
@itemx require_nondet @var{Goal}
@itemx require_cc_multi @var{Goal}
@itemx require_cc_nondet @var{Goal}
@itemx require_erroneous @var{Goal}
@itemx require_failure @var{Goal}
A determinism check, typically used to enhance the robustness of code.
@var{Goal} is a goal.
If @var{Goal} is det, then
@code{require_det @var{Goal}} is equivalent to just @var{Goal}.
If @var{Goal} is not det,
then the compiler is required to generate an error message.

The @code{require_det} keyword may be replaced with
@code{require_semidet},
@code{require_multi},
@code{require_nondet},
@code{require_cc_multi},
@code{require_cc_nondet},
@code{require_erroneous} or
@code{require_failure},
each of which requires @var{Goal} to have the named determinism.

@item require_complete_switch [@var{Var}] @var{Goal}
A switch completeness check, typically used to enhance the robustness of code.
@var{Var} is a variable and @var{Goal} is a goal.
If @var{Goal} is a switch on @var{Var}
and the switch is @emph{complete},
i.e.@: the switch has an arm for every function symbol that @var{Var}
could be bound to at this point in the code,
then @code{require_complete_switch [@var{Var}] @var{Goal}}
is equivalent to @var{Goal}.
If @var{Goal} is a switch on @var{Var} but is @emph{not} complete,
or @var{Goal} is not a switch on @var{Var} at all,
then the compiler is required to generate an error message.
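
As a sketch
(the type @samp{colour} and the function @samp{colour_name}
are hypothetical),
the scope below documents that the disjunction
is intended to be a complete switch on @samp{C}.
If a constructor were later added to @samp{colour}
without a matching arm,
the compiler would be required to report an error here.

@example
:- type colour
    --->    red
    ;       green
    ;       blue.

:- func colour_name(colour) = string.

colour_name(C) = Name :-
    require_complete_switch [C]
    (
        C = red,
        Name = "red"
    ;
        C = green,
        Name = "green"
    ;
        C = blue,
        Name = "blue"
    ).
@end example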

@item require_switch_arms_det [@var{Var}] @var{Goal}
@itemx require_switch_arms_semidet [@var{Var}] @var{Goal}
@itemx require_switch_arms_multi [@var{Var}] @var{Goal}
@itemx require_switch_arms_nondet [@var{Var}] @var{Goal}
@itemx require_switch_arms_cc_multi [@var{Var}] @var{Goal}
@itemx require_switch_arms_cc_nondet [@var{Var}] @var{Goal}
@itemx require_switch_arms_erroneous [@var{Var}] @var{Goal}
@itemx require_switch_arms_failure [@var{Var}] @var{Goal}
@code{require_switch_arms_det} is a determinism check,
typically used to enhance the robustness of code.
@var{Var} is a variable and @var{Goal} is a goal.
If @var{Goal} is a switch on @var{Var},
and all arms of the switch would be allowable in a det context,
@code{require_switch_arms_det [@var{Var}] @var{Goal}}
is equivalent to @var{Goal}.
If @var{Goal} is not a switch on @var{Var},
or if it is a switch on @var{Var}
but some of its arms would @emph{not} be allowable in a det context,
then the compiler is required to generate an error message.

The @code{require_switch_arms_det} keyword may be replaced with
@code{require_switch_arms_semidet},
@code{require_switch_arms_multi},
@code{require_switch_arms_nondet},
@code{require_switch_arms_cc_multi},
@code{require_switch_arms_cc_nondet},
@code{require_switch_arms_erroneous} or
@code{require_switch_arms_failure},
each of which requires
the arms of the switch on @var{Var} to have a determinism
that is @emph{at least as tight} as the named determinism.
The determinism match need not be exact;
the requirement is that the arms' determinisms should make
all the promises about the minimum and maximum number of solutions
that the named determinism does.
For example, it is ok to have a det switch arm
in a @code{require_switch_arms_semidet} scope,
even though it would not be ok
to have a det goal in a @code{require_semidet} scope.

@item disable_warnings [@var{Warnings}] @var{Goal}
@itemx disable_warning [@var{Warnings}] @var{Goal}
@var{Goal} is a goal
and @var{Warnings} is a comma-separated list of names.
The Mercury compiler can generate warnings
about several kinds of constructs whose legal Mercury semantics
is likely to differ from the semantics intended by the programmer.
While such warnings are useful most of the time,
they are a distraction in cases where the programmer's intention
@emph{does} match the legal semantics.
Programmers can disable all warnings of a particular kind for an entire module
by compiling that module with the appropriate compiler option,
but in many cases this is not a good idea,
since some of the warnings it disables may @emph{not} have been mistaken.
This is what these goals are for.
The goal @code{disable_warnings [@var{Warnings}] @var{Goal}}
is equivalent to @code{@var{Goal}} in all respects, with one exception:
the Mercury compiler will not generate warnings of any of the categories
whose names appear in @code{[@var{Warnings}]}.

At the moment, the Mercury compiler supports the disabling of
the following warning categories:
@table @code
@item singleton_vars
Disable the generation of warnings
for variables that occur only once
despite their names not starting with an underscore.
@item repeated_singleton_vars
Disable the generation of warnings
for variables that occur more than once
despite their names starting with an underscore.
@item suspected_occurs_check_failure
Disable the generation of warnings about code that looks like
it unifies a variable with a term that contains that same variable.
@c @item @code{non_tail_recursive_calls}
@c Disable the generation of warnings about recursive calls
@c that are not @emph{tail}-recursive calls.
@item suspicious_recursion
Disable the generation of warnings about suspicious recursive calls.
@item no_solution_disjunct
Disable the generation of warnings about disjuncts that can have no solution.
This is usually done to silence such a warning in a multi-mode predicate
where the disjunct in question is a switch arm in another mode.
(The difference is that a disjunct that cannot succeed has no meaningful use,
but a switch arm that cannot succeed does have one:
a switch may need that arm to make it complete.)
@item unknown_format_calls
Disable the generation of warnings about calls to
@code{string.format}, @code{io.format} or @code{stream.string_writer.format}
for which the compiler cannot tell whether there are any mismatches
between the format string and the supplied values.
@end table

The keyword starting this scope may be written
either as @code{disable_warnings} or as @code{disable_warning}.
This is intended to make the code read more naturally
regardless of whether the list contains the name of
more than one warning category.
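
The following is a minimal sketch of such a scope;
the predicate @code{first_or_zero} is a hypothetical name
invented for this example.
The variable @code{Tail} occurs only once,
which would normally draw a @code{singleton_vars} warning;
the scope suppresses that warning for the enclosed goal only.

@example
:- pred first_or_zero(list(int)::in, int::out) is det.

first_or_zero(List, First) :-
    disable_warning [singleton_vars] (
        (
            List = [],
            First = 0
        ;
            % Tail is deliberately a singleton here.
            List = [First | Tail]
        )
    ).
@end example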

@item trace @var{Params} @var{Goal}
A trace goal, typically used for debugging or logging.
@var{Goal} is a goal
and @var{Params} is a list of trace parameters.
@c should we move the rest of the paragraph except the last sentence
@c into the Trace goals chapter?
Some trace parameters specify compile time or run time conditions;
if any of these conditions are false, @var{Goal} will not be executed.
Since in some program invocations
@var{Goal} may be replaced by @samp{true} in this way,
@var{Goal} may not bind or change the instantiation state
of any variables it shares with the surrounding context.
The things it may do are thus restricted to side effects;
good programming style requires these side effects
to not have any effect on the execution of the program itself,
but to be confined to the provision of extra information
for the user of the program.
See @ref{Trace goals} for the details.
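
For instance, a trace goal guarded by a compile time flag
might be sketched as follows.
The predicate @code{double} and the flag name @code{debug_double}
are hypothetical names invented for this example.

@example
:- pred double(int::in, int::out) is det.

double(X, Y) :-
    Y = 2 * X,
    % The trace goal binds no variables visible outside it;
    % its only effect is the output it prints.
    trace [compile_time(flag("debug_double")), io(!IO)] (
        io.format("double(%d) = %d\n", [i(X), i(Y)], !IO)
    ).
@end example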

@item event @var{Goal}
An event goal.
@var{Goal} is a predicate call.
Event goals are an extension used by the Melbourne Mercury implementation
to support user-defined events in the Mercury debugger, @samp{mdb}.
See the ``Debugging'' chapter of the Mercury User's Guide for further details.

@end table

@node Expressions
@section Expressions

Syntactically, an expression is just a term.
Semantically, an expression is
a variable,
a literal,
a functor expression,
or a special expression.
A special expression is
a conditional expression,
a unification expression,
a state variable,
an explicit type qualification,
a type conversion expression,
a lambda expression,
an apply expression,
or a field access expression.

A literal is a string, an integer, a float,
or an implementation-defined literal
(note that character literals are just single character names; see below).

Implementation-defined literals are symbolic names
whose value represents a property of the compilation environment
or the context in which it appears.
The implementation replaces these symbolic names
with actual literals during compilation.
Implementation-defined literals can only appear within clauses.
The following must be supported by all Mercury implementations:

@table @samp
@item $file
A string that gives the name of the file
that contains the module being compiled.
If the name of the file cannot be determined,
then it is replaced by an arbitrary string.

@item $line
The line number (integer) of the goal in which the literal appears,
or -1 if it cannot be determined.

@item $module
A string representation of the fully qualified module name.

@item $pred
A string containing the fully qualified predicate or function name and arity.

@end table

@noindent
The Melbourne Mercury implementation
additionally supports the following extension:

@table @samp
@item $grade
The grade (string) in which the module is compiled.

@end table
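
As an illustrative sketch
(the predicate @code{report_unexpected} is a hypothetical name
invented for this example),
these literals can be used
to tag a diagnostic message with its source location:

@example
:- pred report_unexpected(string::in, io::di, io::uo) is det.

report_unexpected(Msg, !IO) :-
    % $file, $pred and $module are strings; $line is an integer.
    io.format("%s:%d: in %s (module %s): %s\n",
        [s($file), i($line), s($pred), s($module), s(Msg)], !IO).
@end example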

A functor expression is a name or a compound expression.
A compound expression is a compound term
that does not match the form of a special expression,
and whose arguments are expressions.
If a functor expression is not a character literal,
its principal functor must be the name of
a visible function, predicate, or data constructor
(except for field specifiers,
for which the corresponding field access function must be visible;
see below).

Character literals in Mercury are single character names,
possibly quoted.
Since they sometimes require quotes
and sometimes require parentheses,
for code consistency we recommend
writing all character literals with quotes and
(except where used as arguments)
parentheses.
For example, @code{Char = ('+') ; Char = ('''')}.

Special expressions
(not including field access expressions,
which are covered below)
take one of the following forms.

@table @code
@item if @var{Goal} then @var{ThenExpr} else @var{ElseExpr}
@itemx @var{Goal} -> @var{ThenExpr} ; @var{ElseExpr}
A conditional expression.
@var{Goal} is a goal;
@var{ThenExpr} and @var{ElseExpr} are both expressions.
The two forms are equivalent.
The meaning of a conditional expression is that
if @var{Goal} is true it is equivalent to @var{ThenExpr},
otherwise it is equivalent to @var{ElseExpr}.

If @var{Goal} takes the form @code{some [X, Y, Z] @dots{}}
then the scope of @var{X}, @var{Y}, and @var{Z} includes @var{ThenExpr}.
See the related discussion regarding if-then-else goals.
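
For example, the following function
(@code{abs_diff} is a hypothetical name invented for this illustration)
uses a conditional expression as its result:

@example
:- func abs_diff(int, int) = int.

abs_diff(A, B) = ( if A >= B then A - B else B - A ).
@end example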

@item @var{X} @@ @var{Y}
A unification expression.
@var{X} and @var{Y} are both expressions.
The meaning of a unification expression is that the arguments are unified,
and the expression is equivalent to the unified value.

The strict sequential operational semantics (@pxref{Formal semantics})
of an expression @w{@code{@var{X} @@ @var{Y}}}
is that the expression is replaced by a fresh variable @code{Z},
and immediately after @code{Z} is evaluated,
the conjunction @w{@code{Z = @var{X}, Z = @var{Y}}} is evaluated.

For example

@example
p(X @@ f(_, _), X).
@end example

@noindent
is equivalent to

@example
p(Z, X) :-
    Z = X,
    Z = f(_, _).
@end example

Unification expressions are particularly useful when writing switches
(@pxref{Determinism checking and inference}),
as the arguments of a unification expression
are examined when checking for switches.
The arguments of an equivalent user-defined function would not be.

@item !.@var{S}
@itemx !:@var{S}
A state variable.
@var{S} is a variable.
@xref{State variables}.

@item @var{Expr} : @var{Type}
An explicit type qualification.
@var{Expr} is an expression,
and @var{Type} is a type (@pxref{Types}).
An explicit type qualification constrains
the specified expression to have the specified type;
these expressions are occasionally useful to resolve ambiguities
that can arise from polymorphic types or overloading.
Apart from that,
the explicit type qualification is equivalent to @var{Expr}.

Currently we also support
@code{@var{Expr} `with_type` @var{Type}}
as an alternative syntax for explicit type qualification.
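
As a small sketch of where this helps:
the empty list below could have any element type,
so the qualification is what fixes the type of @code{Empty}.

@example
:- pred main(io::di, io::uo) is det.

main(!IO) :-
    % Without the qualification, the element type would be unconstrained.
    Empty = ([] : list(int)),
    io.write_line(Empty, !IO).
@end example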

@item coerce(Expr)
A type conversion expression.
@var{Expr} is an expression.
@xref{Type conversions}.

@item pred(Arg1::Mode1, Arg2::Mode2, @dots{}) is Det :- Goal
@itemx pred(Arg1::Mode1, Arg2::Mode2, @dots{}, DCGMode0, DCGMode1) is Det --> DCGGoal
@itemx func(Arg1::Mode1, Arg2::Mode2, @dots{}) = (Result::Mode) is Det :- Goal
@itemx func(Arg1, Arg2, @dots{}) = (Result) is Det :- Goal
@itemx func(Arg1, Arg2, @dots{}) = Result :- Goal
A lambda expression.
@var{Arg1}, @var{Arg2}, @dots{} are zero or more expressions,
@var{Result} is an expression,
@var{Goal} is a goal,
@var{DCGGoal} is a DCG-goal,
@var{Mode1}, @var{Mode2}, @dots{}, @var{DCGMode0}, and @var{DCGMode1}
are modes (@pxref{Modes}),
and @var{Det} is a determinism category (@pxref{Determinism}).
The @w{@samp{:- Goal}} part is optional;
if it is not specified, then @samp{:- true} is assumed.

A lambda expression denotes a higher-order predicate or function term
whose value is the predicate or function of the specified arguments
determined by the specified goal.
@xref{Higher-order}.

A lambda expression introduces a new scope:
any variables occurring in the arguments @var{Arg1}, @var{Arg2}, @dots{}
are locally quantified, i.e.@:
they are distinct from other variables with the same name
that occur outside of the lambda expression.
For variables which occur in @var{Result} or @var{Goal},
but not in the arguments,
the usual Mercury rules for implicit quantification apply
(@pxref{Implicit quantification}).
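
The following sketch shows one function lambda and one predicate lambda;
@code{add_to_all} and @code{positives} are hypothetical names
invented for this example.
In the first lambda, @code{X} is local to the lambda expression
because it occurs in the argument list,
while @code{N} refers to the variable of the enclosing clause.

@example
:- func add_to_all(int, list(int)) = list(int).

add_to_all(N, Xs) = list.map((func(X) = X + N), Xs).

:- pred positives(list(int)::in, list(int)::out) is det.

positives(Xs, Ys) :-
    list.filter((pred(X::in) is semidet :- X > 0), Xs, Ys).
@end example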

The form of lambda expression using @samp{-->} as its top level functor
is a syntactic abbreviation.
It is equivalent to

@example
pred(Var1::Mode1, Var2::Mode2, @dots{},
    DCGVar0::DCGMode0, DCGVar1::DCGMode1) is Det :- Goal
@end example

@noindent
where @code{DCGVar0} and @code{DCGVar1} are fresh variables,
and @code{Goal} is @code{transform(DCGVar0, DCGVar1, DCGGoal)}
where @code{transform} is the function
specified in @ref{Definite clause grammars}.

@item apply(@var{Func}, @var{Arg1}, @var{Arg2}, @dots{}, @var{ArgN})
@itemx @var{Func}(@var{Arg1}, @var{Arg2}, @dots{}, @var{ArgN})
An apply expression (i.e.@: a higher-order function call).
@var{N} >= 0,
@var{Func} is an expression of type @samp{func(T1, T2, @dots{}, TN) = T},
and @var{Arg1}, @var{Arg2}, @dots{}, @var{ArgN}
are expressions of types @samp{T1}, @samp{T2}, @dots{}, @samp{TN}.
The type of the apply expression is @samp{T}.
It denotes the result of applying the specified function
to the specified arguments.
@xref{Higher-order}.
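
As an illustration, the two functions below
(hypothetical names invented for this example)
compute the same result,
one using the @code{@var{Func}(@dots{})} form
and the other using the @code{apply} form:

@example
:- func twice(func(T) = T, T) = T.

twice(F, X) = F(F(X)).

:- func twice_apply(func(T) = T, T) = T.

twice_apply(F, X) = apply(F, apply(F, X)).
@end example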

@end table

@anchor{Field access expressions}
@subheading Field access expressions

Field access expressions provide a convenient way
to select or update fields of data constructors,
independent of the definition of the constructor.
The compiler transforms field access expressions into
sequences of calls to field selection or update functions
(@pxref{Field access functions}).

A field specifier is a functor expression.
A field list is a sequence of field specifiers
separated by @code{^} (circumflex).
E.g., @samp{field}, @samp{field1 ^ field2}
and @samp{field1(A) ^ field2(B, C)}
are all field lists.

If the principal functor of a field specifier is @code{@var{field}/N},
there must be a visible selection function @w{@code{@var{field}/(N + 1)}}.
If the field specifier occurs in a field update expression,
there must also be a visible update function
named @w{@code{'@var{field} :='/(N + 2)}}.

Field access expressions have one of the following forms.
There are also DCG goals for field access (@pxref{Definite clause grammars}),
which provide similar functionality to field access expressions,
except that they act on the DCG arguments of a DCG clause.

@table @code
@item @var{Expr} ^ @var{field_list}

A field selection.
@var{Expr} is an expression
and @var{field_list} is a field list.
For each field specifier in @var{field_list},
apply the corresponding selection function in turn.

A field selection is transformed using the following rules:
@example
transform(Expr ^ @var{field}(Arg1, @dots{})) = @var{field}(Arg1, @dots{}, Expr).
transform(Expr ^ @var{field}(Arg1, @dots{}) ^ Rest) =
                transform(@var{field}(Arg1, @dots{}, Expr) ^ Rest).
@end example

@item @var{Expr} ^ @var{field_list} := @var{FieldExpr}

A field update.
@var{Expr} and @var{FieldExpr} are expressions
and @var{field_list} is a field list.
Returns a copy of @var{Expr}
with the value of the field specified by @var{field_list}
replaced with @var{FieldExpr}.

A field update is transformed using the following rules:
@example
transform(Expr ^ @var{field}(Arg1, @dots{}) := FieldExpr) =
                '@var{field} :='(Arg1, @dots{}, Expr, FieldExpr).

transform(Expr0 ^ @var{field}(Arg1, @dots{}) ^ Rest := FieldExpr) = Expr :-
        OldFieldValue = @var{field}(Arg1, @dots{}, Expr0),
        NewFieldValue = transform(OldFieldValue ^ Rest := FieldExpr),
        Expr = '@var{field} :='(Arg1, @dots{}, Expr0, NewFieldValue).
@end example

@end table

@noindent
Examples:

@example
transform(Expr ^ field) = field(Expr).

transform(Expr ^ field(Arg)) = field(Arg, Expr).

transform(Expr ^ field1(Arg1) ^ field2(Arg2, Arg3)) =
    field2(Arg2, Arg3, field1(Arg1, Expr)).

transform(Expr ^ field := FieldExpr) = 'field :='(Expr, FieldExpr).

transform(Expr ^ field(Arg) := FieldExpr) = 'field :='(Arg, Expr, FieldExpr).

transform(Expr0 ^ field1(Arg1) ^ field2(Arg2) := FieldExpr) = Expr :-
    OldField1 = field1(Arg1, Expr0),
    NewField1 = 'field2 :='(Arg2, OldField1, FieldExpr),
    Expr = 'field1 :='(Arg1, Expr0, NewField1).
@end example
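
A more concrete sketch, using record types with named fields:
the types @code{point} and @code{rect}
and the functions @code{left_edge} and @code{move_right}
are hypothetical names invented for this example.

@example
:- type point
    --->    point(x :: int, y :: int).

:- type rect
    --->    rect(top_left :: point, size :: point).

:- func left_edge(rect) = int.

% A field selection through two levels.
left_edge(R) = R ^ top_left ^ x.

:- func move_right(int, rect) = rect.

% A field update through two levels; R0 itself is not modified.
move_right(D, R0) = R :-
    R = (R0 ^ top_left ^ x := R0 ^ top_left ^ x + D).
@end example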

@node State variables
@section State variables

Clauses may use @dfn{state variables}
as a shorthand for naming intermediate values in a sequence.
For example, the following clause
@example
    main(IO0, IO) :-
        write_string("The answer is ", IO0, IO1),
        write_int(42, IO1, IO2),
        nl(IO2, IO).
@end example
@noindent
could be written equivalently using state variable syntax as
@example
    main(!IO) :-
        write_string("The answer is ", !IO),
        write_int(42, !IO),
        nl(!IO).
@end example
@noindent
One advantage of doing this is that
if in future more operations need to be added in the middle of the sequence,
the state variables will not need to be renumbered.

A state variable is written @samp{!.@var{X}} or @samp{!:@var{X}},
denoting the ``current'' or ``next'' value of the sequence labelled @var{X}.
A predicate argument @samp{!@var{X}} is shorthand
for two state variable arguments @samp{!.@var{X}, !:@var{X}};
that is,
@samp{p(@dots{}, !@var{X}, @dots{})}
is equivalent to
@samp{p(@dots{}, !.@var{X}, !:@var{X}, @dots{})}.
The variables @samp{!.@var{X}} and @samp{!:@var{X}}
are referred to as the current and next components of @samp{!@var{X}},
respectively.
Note that,
since predicate arguments of the form @samp{!@var{X}}
stand for two arguments,
the arity of a predicate may be greater than it appears.
E.g.@: @samp{p(!X)} is a call to the predicate @code{p/2}.

State variables obey special scope rules.
A state variable @var{X} must be explicitly introduced,
and this can happen in one of four ways:
@itemize
@item
As @samp{!@var{X}} in the head of a predicate clause.
In this case,
references to state variable @samp{!@var{X}} or to its components
may appear in the clause body.
@item
As either @samp{!.@var{X}} or @samp{!:@var{X}} or both
in the head of a predicate or function clause.
Again, in this case,
references to state variable @samp{!@var{X}} or to its components
may appear in the clause body.
@item
As either @samp{!.@var{X}} or @samp{!:@var{X}} or both
in the head of a lambda expression.
In this case,
references to state variable @samp{!@var{X}} or to its components
may appear in the lambda expression body.
(The reason that @samp{!@var{X}} may not appear
in the head of a lambda expression is that
there is no syntax for specifying the modes of the two implied parameters.)
@item
In an explicit quantification such as @samp{some [!@var{X}] @var{Goal}}.
In this case,
references to state variable @samp{!@var{X}} or to its components
may appear in @samp{@var{Goal}}.
@end itemize

Only the current component of
a state variable @var{X} in the enclosing scope
of a lambda or if-then-else expression
may be referred to
(unless the enclosing @var{X} is shadowed
by a more local state variable of the same name).

For instance, the following clause employs a lambda expression
and is illegal because
it implicitly refers to the next component, @samp{!:@var{S}},
inside the lambda expression.
@example
p(@var{A}, @var{B}, !@var{S}) :-
    P = ( pred(@var{C}::in, @var{D}::out) is det :-
            q(@var{C}, @var{D}, !@var{S})
        ),
    ( if P(@var{A}, @var{E}) then
        @var{B} = @var{E}
    else
        @var{B} = @var{A}
    ).
@end example
@noindent
However
@example
p(@var{A}, @var{B}, !@var{S}) :-
    P = ( pred(@var{C}::in, @var{D}::out, !.@var{S}::in, !:@var{S}::out) is det :-
            q(@var{C}, @var{D}, !@var{S})
        ),
    ( if P(@var{A}, @var{E}, !@var{S}) then
        @var{B} = @var{E}
    else
        @var{B} = @var{A}
    ).
@end example
@noindent
is acceptable because
the state variable @var{S} accessed inside the lambda expression
is locally scoped to the lambda expression
(shadowing the state variable of the same name outside the lambda expression),
and the lambda expression may refer to
the next component of a local state variable.

There are two restrictions concerning state variables in functions,
whether they are defined by clauses or lambda expressions.
@itemize
@item
@samp{!@var{X}} is not a legal function result,
because it stands for two arguments, rather than one.
@item
Neither @samp{!@var{X}} nor @samp{!:@var{X}}
may appear as an argument in a function application,
because this would not make sense
given the usual interpretation of state variables and functions.
@c XXX it appears the implementation does actually allow !:X
(The default mode of functions is that all arguments are input,
while in typical usage, @samp{!:@var{X}} is output.)
@end itemize

Within each clause, the compiler
replaces each occurrence of !@var{X} in an argument list
with two arguments: !.@var{X}, !:@var{X},
where !.@var{X} represents the current version of the state of !@var{X},
and !:@var{X} represents its next state.
It then replaces all occurrences of !.@var{X} and !:@var{X}
with ordinary variables in a way that (in the general case)
represents a sequence of updates to the state of @var{X}
from an initial state to a final state.

This replacement is done by code that is equivalent to
the @samp{transform_goal} and @samp{transform_clause} functions below.
The basic operation used by these functions is substitution:
@samp{substitute(@var{Goal},
[!.@var{X} -> @var{CurX}, !:@var{X} -> @var{NextX}])}
stands for a copy of @var{Goal} in which
every free occurrence of @samp{!.@var{X}} is replaced with @var{CurX}, and
every free occurrence of @samp{!:@var{X}} is replaced with @var{NextX}.
(A free occurrence is one not bound
by the head of a clause or lambda, or by an explicit quantification.)

The @samp{transform_goal(@var{Goal}, @var{X}, @var{CurX}, @var{NextX})}
function's inputs are
@itemize
@item the goal to transform @var{Goal},
@item the name of the state variable @var{X},
@item and the ordinary variables @var{CurX} and @var{NextX}
representing the current and next versions of that state variable.
@end itemize
It returns a transformed version of @var{Goal}.

@samp{transform_goal} has a case for each kind of Mercury goal.
These cases are as follows.

@table @code

@item Calls
Given a first order call such as
@samp{@var{predname}(@var{Arg1}, ..., @var{ArgN})}
or a higher-order call such as
@samp{@var{Expr}(@var{Arg1}, ..., @var{ArgN})},
if any of the arguments is !@var{X},
@samp{transform_goal} replaces that argument with two arguments:
!.@var{X} and !:@var{X}.
It then checks whether @samp{!:@var{X}} appears
in the updated call, which we denote @var{Call}.
@itemize
@item
If it does, then it replaces @var{Call} with
@example
substitute(@var{Call}, [!.@var{X} -> @var{CurX}, !:@var{X} -> @var{NextX}])
@end example
@item
If it does not, then it replaces @var{Call} with
@example
substitute(@var{Call}, [!.@var{X} -> @var{CurX}]),
@var{NextX} = @var{CurX}
@end example
@end itemize
Note that !.@var{X} can occur in @var{Call} on its own
(i.e.@: without !:@var{X}).
Likewise, !:@var{X} can occur in @var{Call} without !.@var{X},
but this does not need separate handling.

The expression @var{Expr} in a higher-order call
may not be of the form !@var{X}, !.@var{X} or !:@var{X}.
It may be parenthesised as (!.@var{X}), however.

@item Unifications
In a unification @samp{@var{ExprA} = @var{ExprB}},
each of @var{ExprA} and @var{ExprB}
is an expression that
may have one of the following four forms:
@itemize
@item
The expression may be !.@var{S} for some state variable @var{S}.
If @var{S} is @var{X},
then @samp{transform_goal} replaces the expression with @var{CurX}.
@item
The expression may be !:@var{S} for some state variable @var{S}.
If @var{S} is @var{X},
then @samp{transform_goal} replaces the expression with @var{NextX}.
@item
The expression may be a name, a constant,
or a variable that is not a state variable.
@samp{transform_goal} leaves such expressions unchanged.
@item
The expression may be a compound term, which means that
it must have the form
@samp{@var{f}(@var{ArgTerm1}, ..., @var{ArgTermN})}.
@samp{transform_goal} handles these the same way it handles
function applications.
@end itemize
Note that @var{ExprA} and @var{ExprB} may not have the form !@var{S}.

@item State variable field updates
A state variable field update goal has the form
@example
!@var{S} ^ @var{field_list} := @var{Expr}
@end example
where @var{field_list} is a valid field list
(@pxref{Field access expressions}).
This means that
@example
!@var{S} ^ @var{field1} := @var{Expr}
!@var{S} ^ @var{field1} ^ @var{field2} := @var{Expr}
!@var{S} ^ @var{field1} ^ @var{field2} ^ @var{field3} := @var{Expr}
@end example
are all valid field update goals.
If @var{S} is @var{X},
@samp{transform_goal} replaces such goals with
@example
@var{NextX} = @var{CurX} ^ @var{field_list} := @var{Expr}
@end example
Otherwise, it leaves the goal unchanged.
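
For example, assuming a hypothetical record type @code{counter}
(invented for this illustration),
the clause for @code{reset} below
is shorthand for the clause shown in the comment,
in which @code{C0} and @code{C} stand for
the variables that replace @samp{!.C} and @samp{!:C}.

@example
:- type counter
    --->    counter(count :: int, label :: string).

:- pred reset(counter::in, counter::out) is det.

reset(!C) :-
    !C ^ count := 0.

% This is equivalent to:
%
%   reset(C0, C) :-
%       C = C0 ^ count := 0.
@end example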

@item Conjunctions
Given a nonempty conjunction,
whether a sequential conjunction such as
@var{Goal1}, @var{Goal2}
or a parallel conjunction such as @var{Goal1} & @var{Goal2},
@samp{transform_goal}
@itemize
@item creates a fresh variable @var{MidX},
@item replaces @var{Goal1} with
@example
substitute(@var{Goal1}, [!.@var{X} -> @var{CurX}, !:@var{X} -> @var{MidX}])
@end example
@item replaces @var{Goal2} with
@example
substitute(@var{Goal2}, [!.@var{X} -> @var{MidX}, !:@var{X} -> @var{NextX}])
@end example
@end itemize
This implies that first @var{Goal1}
updates the state of @var{X} from @var{CurX} to @var{MidX},
and then @var{Goal2}
updates the state of @var{X} from @var{MidX} to @var{NextX}.

Given the empty conjunction, i.e.@: the goal @samp{true},
@samp{transform_goal} will replace it with
@example
@var{NextX} = @var{CurX}
@end example
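
As a worked sketch of the rules for calls and conjunctions,
consider the clause below.
The predicates @code{p}, @code{q} and @code{r}
are hypothetical names invented for this example,
and the variable names @code{S0}, @code{S1} and @code{S}
were chosen by hand to stand for the fresh variables.

@example
p(A, !S) :-
    q(A, !S),
    r(!.S, B).
@end example

@noindent
Applying the rules above by hand yields

@example
p(A, S0, S) :-
    q(A, S0, S1),
    r(S1, B),
    % The call to r does not mention !:S,
    % so the state is passed through unchanged.
    S = S1.
@end example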

@item Disjunctions
Given a disjunction such as @var{Goal1} ; @var{Goal2},
@samp{transform_goal}
@itemize
@item replaces @var{Goal1} with
@example
substitute(@var{Goal1}, [!.@var{X} -> @var{CurX}, !:@var{X} -> @var{NextX}])
@end example
@item replaces @var{Goal2} with
@example
substitute(@var{Goal2}, [!.@var{X} -> @var{CurX}, !:@var{X} -> @var{NextX}])
@end example
@end itemize
This shows that both disjuncts start with @var{CurX},
and both end with @var{NextX}.
If a disjunct has no update of !@var{X},
then the value of @var{NextX} in that disjunct will be set to @var{CurX}.

The empty disjunction, i.e.@: the goal @samp{fail}, cannot succeed,
so what the value of !:@var{X} would be if it @emph{did} succeed is moot.
Therefore @samp{transform_goal} returns empty disjunctions unchanged.

@item Negations
Given a negated goal of the form @samp{not @var{NegatedGoal}},
@samp{transform_goal}
@itemize
@item creates a fresh variable @var{DummyX}, and then
@item replaces @samp{not @var{NegatedGoal}} with
@example
not substitute(@var{NegatedGoal}, [!.@var{X} -> @var{CurX}, !:@var{X} -> @var{DummyX}]),
@var{NextX} = @var{CurX}
@end example
@end itemize
It does this because negated goals
may not generate any outputs visible from the rest of the code,
which means that any output they @emph{do} generate
must be local to the negated goal.

Negations that use @samp{\+ @var{NegatedGoal}} notation
are handled exactly the same way.

@item If-then-elses
Given an if-then-else, whether it uses
( if @var{Cond} then @var{Then} else @var{Else} ) syntax or
( @var{Cond} -> @var{Then} ; @var{Else} ) syntax,
@samp{transform_goal}
@itemize
@item creates a fresh variable @var{MidX},
@item replaces @var{Cond} with
@example
substitute(@var{Cond}, [!.@var{X} -> @var{CurX}, !:@var{X} -> @var{MidX}])
@end example
@item replaces @var{Then} with
@example
substitute(@var{Then}, [!.@var{X} -> @var{MidX}, !:@var{X} -> @var{NextX}])
@end example
@item replaces @var{Else} with
@example
substitute(@var{Else}, [!.@var{X} -> @var{CurX}, !:@var{X} -> @var{NextX}])
@end example
@end itemize
This effectively treats an if-then-else as being a disjunction,
with the first disjunct being
the conjunction of the @var{Cond} and @var{Then} goals,
and the second disjunct being the @var{Else} goal.
(The @var{Else} goal is implicitly conjoined inside the second disjunct
with the negation of the existential closure of @var{Cond},
since the else case is executed only if the condition has no solution.)
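
As a worked sketch,
consider the clause below.
The predicates @code{p}, @code{test} and @code{update}
are hypothetical names invented for this example,
and the variable names @code{S0}, @code{S1} and @code{S}
were chosen by hand to stand for the fresh variables.

@example
p(!S) :-
    ( if test(!.S) then
        update(!S)
    else
        true
    ).
@end example

@noindent
Applying the rules above by hand yields

@example
p(S0, S) :-
    ( if test(S0), S1 = S0 then
        % The then-part starts from the state left by the condition.
        update(S1, S)
    else
        % The else-part starts from the original state.
        S = S0
    ).
@end example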

@item Bidirectional implications
@samp{transform_goal} treats a bidirectional implication goal,
which has the form @var{GoalA} <=> @var{GoalB}, as if it were the
conjunction of its two constituent unidirectional implications:
@var{GoalA} => @var{GoalB}, @var{GoalA} <= @var{GoalB}.

@item Unidirectional implications
@samp{transform_goal} treats a unidirectional implication,
which has one of the two forms
@samp{@var{GoalA} => @var{GoalB}} and @samp{@var{GoalB} <= @var{GoalA}},
as if it were written as
@samp{not (@var{GoalA}, not @var{GoalB})}.

@item Universal quantifications
@samp{transform_goal} treats universal quantifications,
which have the form @samp{all @var{Vars} @var{SubGoal}},
as if they were written as
@samp{not (some @var{Vars} (not @var{SubGoal}))}.
Note that in universal quantifications,
@var{Vars} must be a list of ordinary variables.
@c XXX the state var transformation does not enforce this.

@item Existential quantifications
In existential quantifications,
which have the form @samp{some @var{Vars} @var{SubGoal}},
@var{Vars} must be a list, in which every element
must be either an ordinary variable (such as @var{A}),
or a state variable (such as !@var{B}).
(Note that @var{Vars} may not contain
any element whose form is !.@var{B} or !:@var{B}.)
@itemize
@item
If @var{Vars} does not contain !@var{X},
then @samp{transform_goal} will replace @var{SubGoal} with
@example
substitute(@var{SubGoal}, [!.@var{X} -> @var{CurX}, !:@var{X} -> @var{NextX}])
@end example
@item
If @var{Vars} does contain !@var{X},
then @samp{transform_goal} will leave @var{SubGoal} unchanged, because
any references to !.@var{X}, !:@var{X} and !@var{X} inside @var{SubGoal}
refer to the state variable @var{X} introduced by this scope,
not the one visible outside.
Effectively, this state variable @var{X}
@emph{shadows} the one visible outside.
@end itemize
Note that state variables in @var{Vars}
are handled by @samp{transform_clause} below.

@end table

The @samp{transform_clause} function's input is a clause,
which may be a non-DCG clause or a DCG clause, which have the forms
@example
@var{predname}(@var{ArgTerm1}, ..., @var{ArgTermN}) :- @var{BodyGoal}.
@end example
and
@example
@var{predname}(@var{ArgTerm1}, ..., @var{ArgTermN}) --> @var{BodyGoal}.
@end example
respectively.
@samp{transform_clause} handles both the same way.

@itemize
@item While any of the @var{ArgTerms}
has one of the forms !.@var{X}, !:@var{X} or !@var{X},
@itemize
@item @samp{transform_clause} will create two fresh variables,
@var{InitX} and @var{FinalX},
@item it will replace
any one of the @var{ArgTerms} that is !.@var{X} with @var{InitX},
any one of the @var{ArgTerms} that is !:@var{X} with @var{FinalX}, and
any one of the @var{ArgTerms} that is !@var{X} with
the argument pair @var{InitX}, @var{FinalX}, and
@item it will replace @var{BodyGoal} with the result of
@samp{transform_goal(@var{BodyGoal}, @var{X}, @var{InitX}, @var{FinalX})}.
@end itemize
@item While @var{BodyGoal} contains a lambda expression
whose argument list contains either !.@var{X} or !:@var{X} or both:
@itemize
@item @samp{transform_clause} will create two fresh variables,
@var{InitX} and @var{FinalX},
@item it will replace
any one of the arguments that is !.@var{X} with @var{InitX}, and
any one of the arguments that is !:@var{X} with @var{FinalX}
(there may not be any argument that is !@var{X}), and
@item it will replace the goal of the lambda expression,
@var{LambdaGoal}, with the result of
@samp{transform_goal(@var{LambdaGoal}, @var{X}, @var{InitX}, @var{FinalX})}.
@end itemize
@item While @var{BodyGoal} contains an existential quantification goal
@samp{some @var{Vars} @var{SubGoal}}
where @var{Vars} contains a state variable such as !@var{B},
@itemize
@item @samp{transform_clause} will create two fresh variables,
@var{InitB} and @var{FinalB},
@item it will replace @var{SubGoal} with the result of
@samp{transform_goal(@var{SubGoal}, @var{B}, @var{InitB}, @var{FinalB})},
and then
@item it will delete !@var{B} from @var{Vars}.
@end itemize
@end itemize

Actual application of this transformation would, in the general case,
result in the generation of many different versions of each state variable,
for which we need more names than just
@samp{@var{CurX}}, @samp{@var{MidX}} and @samp{@var{NextX}}.
The Mercury compiler therefore uses
@itemize
@item
@samp{@var{STATE_VARIABLE_X_0}} as the initial value of a state variable,
@item
@samp{@var{STATE_VARIABLE_X_N}},
where @samp{@var{N}} is a positive integer,
as its intermediate values, and
@item
@samp{@var{STATE_VARIABLE_X}} as its final value.
@end itemize
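
As a sketch of this naming scheme,
a clause that threads the I/O state through three calls
would be expanded along the following lines.
The exact numbers the compiler assigns to intermediate values
are an assumption here and may differ in practice.

@example
% Source:
%
%   main(!IO) :-
%       io.write_string("The answer is ", !IO),
%       io.write_int(42, !IO),
%       io.nl(!IO).

main(STATE_VARIABLE_IO_0, STATE_VARIABLE_IO) :-
    io.write_string("The answer is ",
        STATE_VARIABLE_IO_0, STATE_VARIABLE_IO_1),
    io.write_int(42, STATE_VARIABLE_IO_1, STATE_VARIABLE_IO_2),
    io.nl(STATE_VARIABLE_IO_2, STATE_VARIABLE_IO).
@end example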

This transformation can lead
to the introduction of chains of unifications for variables
that do not otherwise play a role in the definition,
such as
@samp{@var{STATE_VARIABLE_X_5} = @var{STATE_VARIABLE_X_6},
@var{STATE_VARIABLE_X_6} = @var{STATE_VARIABLE_X_7},
@var{STATE_VARIABLE_X_7} = @var{STATE_VARIABLE_X_8}}.
Where possible, the compiler automatically short-circuits such sequences
by removing any unneeded intermediate variables.
In the above case, this would yield
@samp{@var{STATE_VARIABLE_X_5} = @var{STATE_VARIABLE_X_8}}.

The following code fragments illustrate
some appropriate uses of state variable syntax.

@table @b
@item Threading the I/O state
@example
main(!IO) :-
    io.write_string("The 100th prime is ", !IO),
    X = prime(100),
    io.write_int(X, !IO),
    io.nl(!IO).
@end example

@item Handling accumulators (1)
@example
foldl2(_, [], !A, !B).
foldl2(P, [X | Xs], !A, !B) :-
    P(X, !A, !B),
    foldl2(P, Xs, !A, !B).
@end example

@item Handling accumulators (2)
@example
iterate_while2(Test, Update, !A, !B) :-
    ( if Test(!.A, !.B) then
        Update(!A, !B),
        iterate_while2(Test, Update, !A, !B)
    else
        true
    ).
@end example

@item Introducing state
@example
compute_out(InA, InB, InC, Out) :-
    some [!State]
    (
        init_state(!:State),
        update_state_a(InA, !State),
        update_state_b(InB, !State),
        list.foldl(update_state_c, InC, !State),
        compute_output(!.State, Out)
    ).
@end example
@end table
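
As a rough illustration of the naming scheme described above,
the first fragment corresponds, after the transformation,
to something like the following clause.
This is a sketch for exposition only;
the exact numbering chosen by the compiler may differ.

@example
main(STATE_VARIABLE_IO_0, STATE_VARIABLE_IO) :-
    io.write_string("The 100th prime is ",
        STATE_VARIABLE_IO_0, STATE_VARIABLE_IO_1),
    X = prime(100),
    io.write_int(X, STATE_VARIABLE_IO_1, STATE_VARIABLE_IO_2),
    io.nl(STATE_VARIABLE_IO_2, STATE_VARIABLE_IO).
@end example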

@node Variable scoping
@section Variable scoping

There are three sorts of variables in Mercury:
ordinary variables, type variables, and inst variables.

Variables occurring in types,
type classes, and instances are called type variables.
Variables occurring in insts or modes are called inst variables.
Variables that occur in expressions,
and that are not inst variables or type variables,
are called ordinary variables.

Note that type variables can occur in expressions
in the right-hand (@var{Type}) operand of an explicit type qualification.
Inst variables can occur in expressions
in the right-hand (@var{Mode}) operand of an explicit mode qualification.
Apart from that, all other variables in expressions are ordinary variables.

The three different variable sorts occupy different namespaces:
there is no semantic relationship between two variables of different sorts
(e.g.@: a type variable and an ordinary variable)
even if they happen to share the same name.
However, as a matter of programming style, it is generally a bad idea
to use the same name for variables of different sorts in the same clause.

The scope of ordinary variables
is the clause or declaration in which they occur,
unless they are quantified,
either explicitly (@pxref{Goals})
or implicitly (@pxref{Implicit quantification}).

The scope of type variables in a predicate or function's type declaration
extends over any explicit type qualifications
(@pxref{Expressions})
in the clauses for that predicate or function,
and over @samp{pragma type_spec} (@pxref{Type specialization}) declarations
for that predicate or function,
so that explicit type qualifications and @samp{pragma type_spec} declarations
can refer to those type variables.
The scope of any type variables in an explicit type qualification
which do not occur in the predicate or function's type declaration
is the clause in which they occur.
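
For instance, in the following sketch
(@code{empty_like} is a hypothetical function used only for illustration,
and the @code{list} module is assumed to be imported),
the @code{T} in the explicit type qualification
is the same type variable as the @code{T} in the function's type declaration.

@example
:- func empty_like(list(T)) = list(T).

% The qualification refers to the T of the declaration above.
empty_like(_) = ([] : list(T)).
@end example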

The scope of inst variables is the clause or declaration in which they occur.

@node Implicit quantification
@section Implicit quantification

The rule for implicit quantification in Mercury
is not the same as the usual one in mathematical logic.
In Mercury, variables that do not occur in the head term of a clause
are implicitly existentially quantified around their closest enclosing scope
(in a sense to be made precise in the following paragraphs).
This allows most existential quantifiers to be omitted,
and leads to more concise code.

An occurrence of a variable is @dfn{in a negated context}
if it is in a negation,
in a universal quantification,
in the condition of an if-then-else,
in an inequality,
or in a lambda expression.

Two goals are @dfn{parallel}
if they are different disjuncts of the same disjunction,
or if one is the ``else'' part of an if-then-else
and the other goal is either the ``then'' part or the condition
of the if-then-else,
or if they are the goals of disjoint (distinct and non-overlapping)
lambda expressions.

If a variable occurs in a negated context
and does not occur outside of that negated context
other than in parallel goals
(and in the case of a variable in the condition of an if-then-else,
other than in the ``then'' part of the if-then-else),
then that variable is implicitly existentially quantified
inside the negated context.
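
The following sketch illustrates the rule.
The predicates @code{parent_of} and @code{childless}
are hypothetical names introduced only for this example.

@example
:- pred parent_of(string, string).
:- mode parent_of(in, out) is nondet.

parent_of("alice", "bob").
parent_of("alice", "carol").

:- pred childless(string::in) is semidet.

childless(Person) :-
    % _Child occurs only inside the negation,
    % so it is implicitly quantified there, as if we had written
    %   not (some [_Child] parent_of(Person, _Child))
    not parent_of(Person, _Child).
@end example

@noindent
@code{Person} occurs in the head of the clause,
so it is not quantified inside the negation.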

@node Elimination of double negation
@section Elimination of double negation

The treatment of inequality, universal quantification,
implication, and logical equivalence as abbreviations
can cause the introduction of double negations
which could make otherwise well-formed code mode-incorrect.
To avoid this problem, the language specifies that
after syntax analysis and implicit quantification,
and before mode analysis is performed,
the implementation must delete any double negations
and must replace any negations of conjunctions of negations
with disjunctions.
(Both of these transformations preserve
the logical meaning and type-correctness of the code,
and they preserve or improve mode-correctness:
they never transform code fragments that would be well-moded
into ones that would be ill-moded. @xref{Modes}.)
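
As an illustration, consider the following sketch,
in which @code{all_positive} is a hypothetical predicate
and the @code{int} and @code{list} modules are assumed to be imported.

@example
:- pred all_positive(list(int)::in) is semidet.

all_positive(Xs) :-
    all [X] (list.member(X, Xs) => X > 0).

% Expanding the abbreviations for `all' and `=>' gives
%   not (some [X] not (not (list.member(X, Xs), not X > 0)))
% and deleting the double negation leaves
%   not (some [X] (list.member(X, Xs), not X > 0))
@end example

@noindent
Without the deletion step,
the call to @code{list.member} would have to bind @code{X}
inside a negation in which @code{X} is not quantified,
which mode analysis would reject.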

@node Definite clause grammars
@section Definite clause grammars

A definite clause grammar (DCG)
is a way of expressing parsing rules,
and is intended for writing parsers and sequence generators.
In the past it has also been used to thread an implicit state variable,
typically the I/O state, through code.
We now recommend that
DCGs only be used for writing parsers and sequence generators,
and that state variable syntax (@pxref{State variables}),
which performs a similar transformation but is more flexible,
be used for other purposes such as threading state variables.

A DCG-rule is a clause that takes the form

@example
DCG_Head --> DCG_Body.
@end example

@noindent
where @var{DCG_Head} is a predicate head term
and @var{DCG_Body} is a DCG-goal.
It is an abbreviation for the rule

@example
Head :- Body.
@end example

@noindent
where @var{Head} is @var{DCG_Head}
with two fresh variables,
@code{V_in} and @code{V_out},
appended to the argument list,
and @var{Body} is @code{transform(V_in, V_out, DCG_Body)},
where @code{transform} is the function defined below.

A DCG-goal is a term of one of the following forms:

@table @code
@item some @var{Vars} @var{DCG-goal}
A DCG existential quantification.
@var{Vars} is a list of variables
and @var{DCG-goal} is a DCG-goal.

@item all @var{Vars} @var{DCG-goal}
A DCG universal quantification.
@var{Vars} is a list of variables
and @var{DCG-goal} is a DCG-goal.

@item @var{DCG-goal1}, @var{DCG-goal2}
A DCG sequence.
@var{DCG-goal1} and @var{DCG-goal2} are DCG-goals.
Intuitively, this means ``parse @var{DCG-goal1} and then parse @var{DCG-goal2}''
or ``do @var{DCG-goal1} and then do @var{DCG-goal2}''.
(Note that the only way this construct actually forces the desired sequencing
is by the modes of the implicit DCG arguments.)

@item @var{DCG-goal1} ; @var{DCG-goal2}
A disjunction.  @var{DCG-goal1} and @var{DCG-goal2} are DCG-goals.
@var{DCG-goal1} must not be of the form @samp{DCG-goal1a -> DCG-goal1b}.
(If it is, then the goal is an if-then-else, not a disjunction.)

@item @{ @var{Goal} @}
A brace-enclosed ordinary goal.
@var{Goal} is a goal.

@item [@var{Expr}, @dots{}]
A DCG input match.
@var{Expr} is an expression.
Unifies the implicit DCG input variable @code{V_in},
which must have type @samp{list(_)},
with a list whose initial elements are the expressions specified
and whose tail is the implicit DCG output variable @code{V_out}.

@item []
The null DCG goal (an empty DCG input match).
Equivalent to @samp{@{ true @}}.

@item not @var{DCG-goal}
@itemx \+ @var{DCG-goal}
A DCG negation.
@var{DCG-goal} is a DCG-goal.
The two different syntaxes have identical semantics.

@item if @var{CondGoal} then @var{ThenGoal} else @var{ElseGoal}
@itemx @var{CondGoal} -> @var{ThenGoal} ; @var{ElseGoal}
A DCG if-then-else.
The two different syntaxes have identical semantics.
@var{CondGoal}, @var{ThenGoal}, and @var{ElseGoal} are DCG-goals.

@item =(@var{Expr})
A DCG unification.
@var{Expr} is an expression.
Unifies @var{Expr} with the implicit DCG argument.

@item :=(@var{Expr})
A DCG output unification.
@var{Expr} is an expression.
Unifies @var{Expr} with the implicit DCG output argument,
ignoring the input DCG argument.

@item @var{Expr} =^ @var{field_list}
A DCG field selection.
@var{Expr} is an expression
and @var{field_list} is a field list.
Unifies @var{Expr} with the result of
applying the field selection @var{field_list} to the implicit DCG argument.
@xref{Field access expressions}.

@item ^ @var{field_list} := @var{Expr}
A DCG field update.
@var{Expr} is an expression
and @var{field_list} is a field list.
Replaces a field in the implicit DCG argument.
@xref{Field access expressions}.

@item @var{DCG-call}
A term which does not match any of the above forms
is a DCG predicate call.
If @var{DCG-call} is a variable @var{Var},
it is treated as if it were @samp{call(@var{Var})}.
Then, the two implicit DCG arguments are appended to the specified arguments.

@end table

The semantics is given by the following function,
where each occurrence of @var{V_new} is a fresh variable.
@example
transform(V_in, V_out, some Vars DCG_goal) =
    some Vars transform(V_in, V_out, DCG_goal)

transform(V_in, V_out, all Vars DCG_goal) =
    all Vars transform(V_in, V_out, DCG_goal)

transform(V_in, V_out, (DCG_goal1, DCG_goal2)) =
    (transform(V_in, V_new, DCG_goal1), transform(V_new, V_out, DCG_goal2))

transform(V_in, V_out, (DCG_goal1 ; DCG_goal2)) =
    ( transform(V_in, V_out, DCG_goal1)
    ; transform(V_in, V_out, DCG_goal2)
    )

transform(V_in, V_out, @{ Goal @}) = (Goal, V_out = V_in)

transform(V_in, V_out, [Expr, @dots{}]) = (V_in = [Expr, @dots{} | V_out])

transform(V_in, V_out, []) = (V_out = V_in)

transform(V_in, V_out, not DCG_goal) =
    (not transform(V_in, V_new, DCG_goal), V_out = V_in)

transform(V_in, V_out, if CondGoal then ThenGoal else ElseGoal) =
    ( if transform(V_in, V_new, CondGoal) then
        transform(V_new, V_out, ThenGoal)
    else
        transform(V_in, V_out, ElseGoal)
    )

transform(V_in, V_out, =(Expr)) = (Expr = V_in, V_out = V_in)

transform(V_in, V_out, :=(Expr)) = (V_out = Expr)

transform(V_in, V_out, Expr =^ field_list) =
    (Expr = V_in ^ field_list, V_out = V_in)

transform(V_in, V_out, ^ field_list := Expr) =
    (V_out = V_in ^ field_list := Expr)

transform(V_in, V_out, p(A1, @dots{}, AN)) =
    p(A1, @dots{}, AN, V_in, V_out)
@end example
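
As a worked illustration of the function above,
consider the following sketch.
The type @code{token} and the predicates @code{greeting} and @code{subject}
are hypothetical names introduced only for this example,
and the @code{list} module is assumed to be imported.

@example
:- type token
    --->    hello
    ;       world.

:- pred greeting(list(token)::in, list(token)::out) is semidet.
:- pred subject(list(token)::in, list(token)::out) is semidet.

greeting --> [hello], subject.
subject --> [world].

% Applying transform to each DCG-rule gives clauses equivalent to
%
%   greeting(V_in, V_out) :-
%       V_in = [hello | V_new],
%       subject(V_new, V_out).
%
%   subject(V_in, V_out) :-
%       V_in = [world | V_out].
@end example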

@node Types
@chapter Types

The type system is based on many-sorted logic,
and supports polymorphism,
type classes (@pxref{Type classes}),
and existentially quantified types (@pxref{Existential types}).

@menu
* Builtin types::
* User-defined types::
* Predicate and function type declarations::
* Field access functions::
* The standard ordering::
@end menu

@node Builtin types
@section Builtin types

This section describes the special types
that are built into the Mercury implementation,
or are defined in the standard library.

@menu
* Primitive types::
* Other builtin types::
@end menu

@node Primitive types
@subsection Primitive types

There is a special syntax
for constants of all primitive types except @code{char}.
(For @code{char}, the standard syntax suffices.)

@menu
* Signed integer types::
* Unsigned integer types::
* Floating-point type::
* Character type::
* String type::
@end menu

@node Signed integer types
@subsubsection Signed integer types
There are five primitive signed integer types:
@code{int}, @code{int8}, @code{int16}, @code{int32} and @code{int64}.

Except for @code{int},
the width in bits of each of these is given by the numeric suffix in its name.

The width in bits of @code{int} is implementation defined,
but must be at least 32 bits.

All signed integer types use two's-complement representation.
Their width must be equal to the width of the corresponding unsigned type.

Values of the type @code{int8} must be in the range
@math{-128} (@math{-(2^{8 - 1})}) to @math{127} (@math{2^{8 - 1} - 1}),
both inclusive.

Values of the type @code{int16} must be in the range
@math{-32768} (@math{-(2^{16 - 1})}) to @math{32767} (@math{2^{16 - 1} - 1}),
both inclusive.

Values of the type @code{int32} must be in the range
@math{-2147483648} (@math{-(2^{32 - 1})})
to @math{2147483647} (@math{2^{32 - 1} - 1}),
both inclusive.

Values of the type @code{int64} must be in the range
@math{-9223372036854775808} (@math{-(2^{64 - 1})})
to @math{9223372036854775807} (@math{2^{64 - 1} - 1}),
both inclusive.

Values of the type @code{int} must be in the range
@math{-(2^{N - 1})} to @math{2^{N - 1} - 1},
both inclusive;
@math{N} being the width of @code{int} in bits.
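
Since the width of @code{int} is implementation defined,
a program that depends on it can query it at run time.
The following minimal sketch assumes the standard library functions
@code{int.bits_per_int}, @code{int.min_int} and @code{int.max_int};
the module name @code{int_range} is hypothetical.

@example
:- module int_range.
:- interface.

:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.

:- import_module int.
:- import_module list.
:- import_module string.

main(!IO) :-
    % Print the width N of int and the bounds -(2^(N-1)) and 2^(N-1) - 1.
    io.format("int is %d bits wide, ranging from %d to %d\n",
        [i(int.bits_per_int), i(int.min_int), i(int.max_int)], !IO).
@end example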

@node Unsigned integer types
@subsubsection Unsigned integer types
There are five primitive unsigned integer types:
@code{uint}, @code{uint8}, @code{uint16}, @code{uint32} and @code{uint64}.

Except for @code{uint},
the width in bits of each of these is given by the numeric suffix in its name.

The width in bits of @code{uint} is implementation defined,
but must be at least 32 bits.
It must be equal to the width of the type @code{int}.

Values of the type @code{uint8} must be in the range
@math{0} (@math{2^0 - 1}) to @math{255} (@math{2^8 - 1}),
both inclusive.

Values of the type @code{uint16} must be in the range
@math{0} (@math{2^0 - 1}) to @math{65535} (@math{2^{16} - 1}),
both inclusive.

Values of the type @code{uint32} must be in the range
@math{0} (@math{2^0 - 1}) to @math{4294967295} (@math{2^{32} - 1}),
both inclusive.

Values of the type @code{uint64} must be in the range
@math{0} (@math{2^0 - 1}) to @math{18446744073709551615} (@math{2^{64} - 1}),
both inclusive.

Values of the type @code{uint} must be in the range
@math{0} (@math{2^0 - 1}) to @math{2^N - 1},
both inclusive;
@math{N} being the width of @code{uint} in bits.
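
The special constant syntax for these types
attaches a type suffix to the digits of the literal.
The following sketch writes the largest values of two of the types;
the function names are hypothetical and chosen only for illustration.

@example
:- func largest_uint8 = uint8.
largest_uint8 = 255u8.          % 2^8 - 1

:- func largest_uint16 = uint16.
largest_uint16 = 65535u16.      % 2^16 - 1
@end example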

@node Floating-point type
@subsubsection Floating-point type
There is one floating-point type: @code{float}.

It is represented using either the 32-bit single-precision IEEE 754 format
or the 64-bit double-precision IEEE 754 format.

The choice between the two formats is implementation dependent.

In the Melbourne Mercury implementation,
@code{float}s are represented
using the 32-bit single-precision IEEE 754 format
in grades that have the @code{.spf} grade component,
and using the 64-bit double-precision IEEE 754 format in every other grade.

@node Character type
@subsubsection Character type
There is one character type: @code{char}.

Values of this type represent Unicode code points.

@node String type
@subsubsection String type
There is one string type: @code{string}.

A string is a sequence of characters encoded
using either the UTF-8 or UTF-16 encoding of Unicode.

The choice between the two encodings is implementation dependent.

In the Melbourne Mercury implementation,
@code{string}s are represented
using UTF-8 when generating code for C,
and using UTF-16 when generating code for C# or Java.

@node Other builtin types
@subsection Other builtin types

@menu
* Predicate and function types::
* Tuple types::
* The universal type::
* The ``state-of-the-world'' type::
@end menu

@node Predicate and function types
@subsubsection Predicate and function types
The predicate types are
@code{pred}, @code{pred(T)}, @code{pred(T1, T2)}, @dots{}

@noindent
The function types are
@code{(func) = T}, @code{func(T1) = T}, @code{func(T1, T2) = T}, @dots{}

Higher-order predicate and function types
are used to pass closures to other predicates and functions.
@xref{Higher-order}.
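
As a brief sketch of how these types appear in declarations,
the following hypothetical predicate and function
each take closures as arguments.

@example
% Apply the closure P twice, threading the intermediate result.
:- pred apply_twice(pred(T, T), T, T).
:- mode apply_twice(in(pred(in, out) is det), in, out) is det.

apply_twice(P, X0, X) :-
    P(X0, X1),
    P(X1, X).

% Return the composition of two functions.
:- func compose(func(B) = C, func(A) = B) = (func(A) = C).

compose(F, G) = (func(X) = F(G(X))).
@end example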

@node Tuple types
@subsubsection Tuple types
The tuple types are @code{@{@}}, @code{@{T@}}, @code{@{T1, T2@}}, @dots{}

A tuple type is equivalent to a discriminated union type
(@pxref{Discriminated unions}) with declaration
@example
 :- type @{Arg1, Arg2, @dots{}, ArgN@}
         --->    @{ @{Arg1, Arg2, @dots{}, ArgN@} @}.
@end example
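
For example, a function that exchanges the components of a pair
can be written directly over a tuple type.
The name @code{swap} is hypothetical.

@example
:- func swap(@{A, B@}) = @{B, A@}.

swap(@{X, Y@}) = @{Y, X@}.
@end example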

@node The universal type
@subsubsection The universal type
The type @code{univ} is defined in the standard library module @code{univ},
along with the predicates @code{type_to_univ/2} and @code{univ_to_type/2}.
With those predicates,
values of any type can be converted to the universal type, and back again.
The conversion from @code{univ} to the original type
will check that the value inside the @code{univ} has the expected type.
The universal type is useful for situations
where you need heterogeneous collections.
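
The following sketch builds a heterogeneous list
and then searches it for the first value of type @code{int}.
The names @code{mixed} and @code{first_int} are hypothetical,
and the @code{list} and @code{univ} modules are assumed to be imported.

@example
:- func mixed = list(univ).

mixed = [U1, U2, U3] :-
    type_to_univ(42, U1),
    type_to_univ("forty-two", U2),
    type_to_univ(4.2, U3).

:- pred first_int(list(univ)::in, int::out) is semidet.

first_int([U | Us], Int) :-
    % univ_to_type fails if the value inside U is not an int.
    ( if univ_to_type(U, Int0) then
        Int = Int0
    else
        first_int(Us, Int)
    ).
@end example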

@node The ``state-of-the-world'' type
@subsubsection The ``state-of-the-world'' type
The type @code{io.state} is defined in the standard library module @code{io},
and represents the state of the world.
Predicates which perform I/O
are passed the only reference to the current state of the world,
and produce a unique reference to the new state of the world.
In this way, we can give a declarative semantics to code that performs I/O.

@node User-defined types
@section User-defined types

New types can be introduced with @samp{:- type} declarations.
There are several categories of derived types:

@menu
* Discriminated unions::
* Equivalence types::
* Abstract types::
* Subtypes::
@end menu

@node Discriminated unions
@subsection Discriminated unions

These encompass both enumeration and record types in other languages.
A derived type is defined using @samp{:- type @var{type} ---> @var{body}}.
(Note there are @emph{three} dashes in that arrow.
It should not be confused with the two-dash arrow used for DCGs
or the one-dash arrow used for if-then-else.)
If the @var{type} term is a functor of arity zero
(i.e.@: one having zero arguments),
it names a monomorphic type.
Otherwise, it names a polymorphic type;
the arguments of the functor must be distinct type variables.
The @var{body} term is defined as
a sequence of constructor definitions separated by semicolons.

Ordinarily, each constructor definition
must be a functor whose arguments (if any) are types.
Ordinary discriminated union definitions must be @dfn{transparent}:
all type variables occurring in the @var{body}
must also occur in the @var{type}.
(The reverse is not the case:
a variable occurring in the @var{type}
need not also occur in the @var{body}.
Such variables are called @dfn{phantom type parameters},
and their use is explained below.)

However, constructor definitions can optionally be existentially typed.
In that case, the functor will be preceded by an existential type quantifier
and can optionally be followed by an existential type class constraint.
For details, see @ref{Existential types}.
Existentially typed discriminated union definitions need not be transparent.

The arguments of constructor definitions may be named.
These names cause the compiler to generate functions
which can be used to select and update fields of a term
in a manner independent of the definition of the type
(@pxref{Field access functions}).
A named argument has the form @w{@code{@var{fieldname} :: @var{Type}}}.
It is an error for
two fields in the same type definition to have the same name,
even if the fields they name occur in different data constructors.

Here are some examples of discriminated union definitions:

@example
:- type fruit
        --->    apple
        ;       orange
        ;       banana
        ;       pear.

:- type strange
        --->    foo(int)
        ;       bar(string).

:- type employee
        --->    employee(
                       name        :: string,
                       age         :: int,
                       department  :: string
                ).

:- type tree
        --->    empty
        ;       leaf(int)
        ;       branch(tree, tree).

:- type list(T)
        --->    []
        ;       [T | list(T)].

:- type pair(T1, T2)
        --->    T1 - T2.
@end example
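
Because the arguments of @code{employee} are named,
its fields can be selected and updated by name.
The following sketch uses two hypothetical functions
and assumes that the @code{int} module is imported.

@example
:- func name_of(employee) = string.

name_of(E) = E ^ name.

:- func birthday(employee) = employee.

% Return a copy of E whose age field is one greater.
birthday(E) = (E ^ age := E ^ age + 1).
@end example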

If the body of a discriminated union type definition
contains a term whose top-level functor is @code{';'/2},
the semicolon is normally assumed to be a separator.
This makes it difficult to define a type
whose constructors include @code{';'/2}.
To allow this, curly braces can be used to quote the semicolon.
It is then also necessary to quote curly braces.
The following example illustrates this:

@example
:- type tricky
        --->    @{ int ; int @}
        ;       @{ @{ int @} @}.
@end example

This defines a type with two constructors,
@code{';'/2} and @code{'@{@}'/1}, whose argument types are all @code{int}.
We recommend against using constructors named @code{'@{@}'}
because of the possibility of confusion with the builtin tuple types.

Each discriminated union type definition introduces a distinct type.
Mercury considers two discriminated union types that have the same bodies
to be distinct types (name equivalence).
Having two different definitions of a type
with the same name and arity in the same module is an error.

Constructors may be overloaded among different types:
there may be any number of constructors with a given name and arity,
so long as they all have different types.
However, there must not be more than one constructor
with the same name, arity, and result type in the same module.
(There is no particularly good reason for this restriction;
in the future we may allow several such functors
as long as they have different argument types.)
@c XXX Was that restriction already lifted?
Note that excessive overloading of constructors
can slow down type checking
and can make the program confusing for human readers,
so overloading should not be over-used.

@c XXX The `where direct_arg' attribute is not documented because it requires
@c     the user has a detailed understanding of the type representation, and
@c     is very implementation specific. The following is for implementors.

@c Discriminated union type definitions may be followed by a
@c @samp{direct_arg} attribute of the following form:
@c
@c @example
@c where direct_arg is @var{ctors}
@c @end example
@c
@c @noindent
@c where @var{ctors} is a list of @var{functor-name} / @var{functor-arity}.
@c The functor arities must always be one.
@c
@c The attribute notifies importing modules that each of the functors
@c listed is to be represented as a tagged pointer to its argument. The
@c argument type must be known, when compiling the module that the type is
@c defined in, to not require the use of the tag bits. The compiler will
@c emit an error message otherwise.  The compiler will silently ignore
@c functors which require a secondary tag.
@c
@c The optimised type representation is usually only applied if the
@c argument type is defined in the interface section of the same module.
@c This attribute allows the programmer to also apply it when the argument
@c type is known to the defining module, but not necessarily modules which
@c import the top-level type.
@c
@c Ideally, the @samp{direct_arg} attribute would be automatically
@c generated when making an interface file, so the user would never need to
@c write it manually.  At this time, the compiler does not have enough
@c information when making interface files.

Note that user-defined types may not have names that have meanings in Mercury.
(Most of these are documented in later sections.)

The list of reserved type names is
@example
int
int8
int16
int32
int64
uint
uint8
uint16
uint32
uint64
float
character
string
@{@}
=
=<
pred
func
pure
semipure
impure
''
@end example

Phantom type parameters are useful when you have
two distinct concepts that you want to keep separate,
but for which nevertheless you want to use the same representation.
This is an example of their use, taken from the Mercury compiler:

@example
:- type var(T)
        --->    ...

:- type prog_var == var(prog_var_type).
:- type type_var == var(type_var_type).

:- type prog_var_type
    --->    prog_var_type.
:- type type_var_type
    --->    type_var_type.
@end example

The @code{var} type represents the generic notion of a variable.
It has a phantom type parameter, @code{T},
which does not occur in the body of its type definition.
The @code{prog_var} and @code{type_var} types represent
two different specific kinds of variables:
program variables, which occur in code (clauses),
and type variables, which occur in types, respectively.
They each bind a different type to the type parameter @code{T}.
These types, @code{prog_var_type} and @code{type_var_type},
each have a single function symbol of arity zero.
This means that each type has only one value,
which in turn means that values of these types contain no information at all.
But containing information is not the purpose of these types.
Their purpose is to ensure that
if a computation that expects program variables
is ever accidentally given type variables, or vice versa,
this mismatch is detected and reported by the compiler.
Two variables can be unified only if they have the same type.
While two @code{prog_var}s have the same type,
and two @code{type_var}s have the same type,
a @code{prog_var} and a @code{type_var} have different types,
because different types
(@code{prog_var_type} and @code{type_var_type})
are bound to the phantom type parameter @code{T}.
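
The same technique can be sketched in a small self-contained form.
All of the names below are hypothetical.

@example
:- type id(T)
    --->    id(int).        % T is a phantom type parameter.

:- type user_tag
    --->    user_tag.
:- type order_tag
    --->    order_tag.

:- type user_id == id(user_tag).
:- type order_id == id(order_tag).

:- pred same_user(user_id::in, user_id::in) is semidet.

same_user(Id, Id).

% Passing an order_id to same_user is a type error,
% even though both kinds of id are represented by an int.
@end example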

@node Equivalence types
@subsection Equivalence types

These are type abbreviations.
They are defined using @samp{==} as follows.
They may be polymorphic.

@example
:- type money == int.
:- type assoc_list(KeyType, ValueType)
        == list(pair(KeyType, ValueType)).
@end example

Equivalence type definitions must be transparent.
Unlike discriminated union type definitions,
equivalence type definitions must not be cyclic;
that is, the type on the left hand side of the @samp{==}
(@samp{assoc_list} and @samp{money} in the examples above)
must not occur on the right hand side of the @samp{==}.

Mercury treats an equivalence type
as an abbreviation for the type on the right hand side of the definition;
the two are equivalent in all respects
in scopes where the equivalence type is visible.
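
For example, given the definition of @code{money} above,
a value of type @code{money} can be used wherever an @code{int} is expected.
The function @code{add_tax} is hypothetical,
and the @code{int} module is assumed to be imported.

@example
:- func add_tax(money) = money.

% The arithmetic operators on int apply directly to money.
add_tax(Price) = Price + Price / 10.
@end example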

@node Abstract types
@subsection Abstract types

These are types whose implementation is hidden.
The type declarations

@example
:- type t1.
:- type t2(T1, T2).
@end example

@noindent
declare types @code{t1/0} and @code{t2/2} to be abstract types.
Such declarations are only useful in the interface section of a module.
This means that the type names will be exported,
but the constructors (functors) for these types will not be exported.
The implementation section of a module
must give a definition for all of the abstract types
named in the interface section of the module.
Abstract types may be defined as either discriminated union types
or as equivalence types.
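
The following minimal sketch shows an abstract type
together with its definition.
The module @code{counter} and everything in it are hypothetical.

@example
:- module counter.
:- interface.

:- type counter.            % abstract: the constructor is not exported

:- func init = counter.
:- func next(counter) = counter.

:- implementation.

:- import_module int.

:- type counter
    --->    counter(int).

init = counter(0).
next(counter(N)) = counter(N + 1).
@end example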

@c -----------------------------------------------------------------------

@node Subtypes
@subsection Subtypes

(This is a new and experimental feature, subject to change.)

A subtype is a discriminated union type
that is a subset of a supertype,
in that every term of a subtype is a valid term in the supertype.
It is possible to convert terms between subtype and supertype
using type conversion expressions (@pxref{Type conversions}).

As previously described,
the syntax for non-subtype discriminated union types is
@example
:- type @var{type} ---> @var{body}.
@end example
where @var{type} is the name of a type constructor
applied to zero or more distinct type variables
(the @emph{parameters} of the type constructor),
and @var{body} is a sequence of constructor definitions
separated by semicolons.
All universally quantified type variables that occur in @var{body}
must be among @var{type}'s parameters.

The syntax for subtypes is similar but slightly different:
@example
:- type @var{subtype} =< @var{supertype} ---> @var{body}.
@end example
Since a subtype is also a discriminated union type,
the rules for discriminated union types apply to subtypes as well:
@var{subtype} must be the name of a type constructor
applied to zero or more distinct type variables (its parameters),
@var{body} must be a sequence of constructor definitions
separated by semicolons,
and all universally quantified type variables that occur in @var{body}
must be among @var{subtype}'s parameters.

@var{supertype} must be a type constructor
applied to zero or more argument types,
which @emph{may} be type variables, but they do not have to be,
and if they are, do not need to be distinct.
After expanding out equivalences,
@var{supertype}'s principal type constructor
must specify a discriminated union type
whose definition is in scope where the subtype definition occurs,
by normal module visibility rules.

The discriminated union type specified by @var{supertype}
may itself be a subtype.
Following the chain of subtype definitions,
it must be possible to arrive at a @emph{base type},
which is a discriminated union type but @emph{not} a subtype.

The body of the subtype may differ from the body of its supertype in two ways.
@itemize @bullet
@item
It may omit one or more constructor definitions.
The ability to do this is the main motivation for the use of subtypes.

Since the subtype cannot @emph{add} definitions of constructors,
the set of constructor definitions in the subtype
must be a subset of the constructor definitions in the supertype.
We recommend that they should appear in the same relative order
as in the supertype definition.
@item
It may change the types of some of the arguments of some of the constructors,
@emph{provided} that each replacement replaces a type with one of its subtypes.

Formally, this means that
if the supertype @samp{t} has a constructor @samp{f(T1, ..., Tn)},
and the subtype @samp{s =< t} has a constructor @samp{f(S1, ..., Sn)},
then for each @var{Si}, the condition @samp{Si =< Ti} must hold,
where @samp{=<} is the subtype relation below.
@end itemize

This is an example of the first kind of difference:
@example
:- type fruit
    --->    apple
    ;       pear
    ;       lemon
    ;       orange.

:- type citrus_fruit =< fruit
    --->    lemon
    ;       orange.
@end example

And this is an example of the second:
@example
:- type fruit_basket
    --->    basket(fruit, int).
            % What kind of fruit, and how many.

:- type citrus_fruit_basket =< fruit_basket
    --->    basket(citrus_fruit, int).
@end example

(There are more examples below.)
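
Given the types above,
a term of a subtype can be converted to its supertype
with a type conversion expression.
The two function names in this sketch are hypothetical.

@example
:- func as_fruit(citrus_fruit) = fruit.

as_fruit(Citrus) = coerce(Citrus).

:- func as_fruit_basket(citrus_fruit_basket) = fruit_basket.

as_fruit_basket(Basket) = coerce(Basket).
@end example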

If the subtype retains a constructor from the supertype
that has one or more existentially quantified type variables,
then the subtype constructor must repeat
the list of existentially quantified type variables
from the supertype constructor,
and all existential class constraints,
with no additions, removals, or reordering.
(The type variables do not need to have the same names
in the subtype as the supertype,
but, stylistically, it makes more sense if they do.)

As mentioned above,
any universally quantified type variable
that occurs in @var{body} must occur also in @var{subtype}.
However, this is the only restriction
on the list of parameters in @var{subtype}.
For example, it need not have any particular relationship
with the list of parameters
of the principal type constructor of @var{supertype}.
For example, @var{subtype} may have
a phantom type parameter (@pxref{Discriminated unions})
that does not occur in @var{supertype}.

@c There should be some discussion here
@c of the possible uses of this flexibility.

(In the following discussion,
we assume that all equivalence types have been expanded out.)

The subtype relation @samp{S =< T} has four cases to consider:
when @samp{S} and @samp{T} are both discriminated union types,
when they are both tuple types,
when they are both higher-order types,
and all other types.

@c Manually numbered as @enumerate introduces another level of indentation
@c and leaves too little space between the items.
@noindent
1. For discriminated union types @samp{S} and @samp{T}:
@itemize
@item
If @samp{S} and @samp{T} have the same principal type constructor,
say @samp{f/n}, which implies that
@samp{S = f(S1, ..., Sn)} and @samp{T = f(T1, ..., Tn)},
then @samp{S =< T} holds if and only if
for all @var{i} in @samp{1..n}, @samp{Si =< Ti}.
@item
If @samp{S} and @samp{T} have different principal type constructors,
and if @samp{S = f(S1, ..., Sn)}, @samp{S =< T} holds if
    @itemize @minus
    @item
    there is a visible subtype definition starting with
    @w{@samp{:- type f(R1, ..., Rn) =< U}},
    @item
    for all @var{i} in @samp{1..n}, @w{@samp{Si = Ri}} (unification), and
    @item
    @samp{U =< T}.
    @end itemize
In other words, if all occurrences of @var{Ri} in @var{U}
are replaced by the corresponding @var{Si} to give @var{Usub},
then @samp{Usub =< T} must hold.
@end itemize
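
As a worked example of the second rule,
suppose that the definition
@w{@samp{:- type non_empty_list(T) =< list(T) ---> [T | list(T)]}}
is visible,
and consider whether @samp{non_empty_list(int) =< list(int)} holds.

@example
S = non_empty_list(int)         T = list(int)

visible definition:   :- type non_empty_list(R1) =< list(R1)
unification:          S1 = R1, i.e. int = R1
U after substitution: list(int)
remaining condition:  list(int) =< list(int)

The two sides of the remaining condition have the same principal
type constructor, list/1, so it holds if and only if int =< int,
which holds because the two types are syntactically identical.
@end example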

@noindent
2. For two tuple types
@samp{S = @{S1, ..., Sn@}} and @samp{T = @{T1, ..., Tn@}},
@samp{S =< T} holds if and only if
@samp{Si =< Ti} for all @samp{i} in @samp{1..n}.
This is analogous to the case for discriminated union types
with the same principal type constructor.

@noindent
3. A higher-order type @samp{S}
can be a subtype of another higher-order type @samp{T}
in only one way.
Since subtype definitions do not apply to higher-order types,
this way is analogous to the case for discriminated union types
with the same principal type constructor.

@itemize
@item
@samp{P =< Q} holds for two higher-order types @var{P} and @var{Q}
if and only if all of the following conditions hold:
    @itemize @minus
    @item
    @var{P} and @var{Q} are either
    both @samp{pred} types, or both @samp{func} types,
    @item
    they have the same arity,
    @item
    @var{P} and @var{Q} have the same argument types
    (the current implementation does not allow subtyping
    in higher-order arguments), and
    @item
    if either of @var{P} and @var{Q} has higher-order inst information,
    then @var{P} and @var{Q} must have
    the @emph{same} higher-order inst information,
    i.e.@: their higher-order inst information must specify
    the same argument modes, determinism, and purity.
    @end itemize
@end itemize

@noindent
4. For all other types,
@samp{S =< T} if and only if @samp{S = T},
i.e.@: they are syntactically identical.
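
As an illustration of these rules,
the following sketch checks a few instances of the subtype relation by hand.
The types @samp{fruit} and @samp{citrus} are hypothetical,
introduced only for this example.

@example
:- type fruit
    --->    apple
    ;       orange
    ;       lemon.

:- type citrus =< fruit
    --->    orange
    ;       lemon.

% citrus =< fruit
%   holds by rule 1 (different principal type constructors):
%   the definition of citrus/0 gives U = fruit,
%   and fruit =< fruit holds.
%
% @{citrus, int@} =< @{fruit, int@}
%   holds by rule 2:
%   citrus =< fruit (see above) and int =< int (rule 4).
%
% list(citrus) =< list(fruit)
%   holds by rule 1 (same principal type constructor list/1),
%   since citrus =< fruit.
%
% fruit =< citrus
%   does not hold: there is no subtype definition for fruit/0.
@end example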

A subtype may be exported as an abstract type
by declaring only the name of the subtype in the
interface section of a module (without the @samp{=< @var{supertype}} part).
Then the subtype definition must be given in the implementation section of
the same module.

Example:

@example
:- interface.

:- type non_empty_list(T).  % abstract type

:- implementation.

:- import_module list.

:- type non_empty_list(T) =< list(T)
    --->    [T | list(T)].
@end example

Subtypes must not have user-defined equality or comparison predicates.
The base type of a subtype may have user-defined equality or comparison.
In that case, values of the subtype will be tested for equality or
compared using those predicates.

There is no special interaction between subtypes and the type class system.

Some more examples of subtypes:

@example
:- type list(T)
    --->    []
    ;       [T | list(T)].

:- type non_empty_list(T) =< list(T)
    --->    [T | list(T)].

:- type non_empty_list_of_foo =< list(foo)
    --->    [foo | list(foo)].

:- type maybe_foo
    --->    none
    ;       some [T] foo(T) => fooable(T).

:- type foo =< maybe_foo
    --->    some [T] foo(T) => fooable(T).

:- type task
   --->     create(pred(int::in, io::di, io::uo) is det)
   ;        delete(pred(int::in, io::di, io::uo) is det).

:- type create_task =< task
   --->     create(pred(int::in, io::di, io::uo) is det).
@end example

The following is a more complex example.

In the case of a set of mutually recursive types,
omitting some constructor definitions from a type may not be enough;
it may be necessary to replace some argument types with their subtypes as well.
Consider this pair of mutually recursive types representing a bipartite graph,
i.e.@: a graph in which there are two kinds of nodes,
and edges always connect two nodes of different kinds.
In this bipartite graph,
the two kinds of nodes are @var{or} nodes and @var{and} nodes,
and each kind of node can be connected to zero, two or more nodes
of the other kind.

@example
:- type or_node
    --->    or_source(source_id)
    ;       logical_or(and_node, and_node)
    ;       logical_or_list(and_node, and_node, and_node, list(and_node)).

:- type and_node
    --->    and_source(source_id)
    ;       logical_and(or_node, or_node)
    ;       logical_and_list(or_node, or_node, or_node, list(or_node)).
@end example

If we wanted a subtype to represent graphs in which
no @var{or} node could be connected to more than two @var{and} nodes,
we might think that it would be enough
to delete the @var{logical_or_list} constructor from the @var{or_node} type,
like this:

@example
:- type binary_or_node =< or_node
    --->    or_source(source_id)
    ;       logical_or(and_node, and_node).
@end example

However, this would not work,
because the @var{and_node}s have constructors
whose arguments have type @var{or_node}, not @var{binary_or_node}.
One would have to create a subtype of the @var{and_node} type
that constructs @var{and} nodes
from @var{binary_or_node}s, not plain @var{or_node}s,
like this:

@example
:- type binary_or_node =< or_node
    --->    or_source(source_id)
    ;       logical_or(binary_or_and_node, binary_or_and_node).

:- type binary_or_and_node =< and_node
    --->    and_source(source_id)
    ;       logical_and(binary_or_node, binary_or_node)
    ;       logical_and_list(binary_or_node, binary_or_node, binary_or_node,
                list(binary_or_node)).
@end example

@c -----------------------------------------------------------------------

@node Predicate and function type declarations
@section Predicate and function type declarations

The argument types of each predicate
must be explicitly declared with a @samp{:- pred} declaration.
The argument types and return type of each function must be
explicitly declared with a @samp{:- func} declaration.
For example:

@example
:- pred is_all_uppercase(string).

:- func strlen(string) = int.
@end example

Predicates and functions can be polymorphic;
that is, their declarations can include type variables.
For example:

@example
:- pred member(T, list(T)).

:- func length(list(T)) = int.
@end example

A predicate or function can be declared
to have a given higher-order type (@pxref{Higher-order})
by using @code{`with_type`} in the type declaration.
This is useful where several predicates or functions
need to have the same type signature,
which often occurs for type class method implementations
(@pxref{Type classes}),
and for predicates to be passed as higher-order terms.

For example,

@example
:- type foldl_pred(T, U) == pred(T, U, U).
:- type foldl_func(T, U) == (func(T, U) = U).

:- pred p(int) `with_type` foldl_pred(T, U).
:- func f(int) `with_type` foldl_func(T, U).
@end example

@noindent
is equivalent to

@example
:- pred p(int, T, U, U).
:- func f(int, T, U) = U.
@end example

Type variables in predicate and function declarations
are implicitly universally quantified by default;
that is, the predicate or function may be called with arguments
and (in the case of functions) return value
whose actual types are any instance of the types specified in the declaration.
For example, the function @samp{length/1} declared above
could be called with the argument having type
@samp{list(int)}, or @samp{list(float)}, or @samp{list(list(int))}, etc.

Type variables in predicate and function declarations can
also be existentially quantified; this is discussed in
@ref{Existential types}.

There must be only one predicate with a given name and arity in each module,
and only one function with a given name and arity in each module.
It is an error to declare the same predicate or function twice.

There must be at least one clause defined
for each declared predicate or function,
except for those defined using the foreign language interface
(@pxref{Foreign language interface}).
However, Mercury implementations are permitted to provide a method
of processing Mercury programs in which such errors are not reported
until and unless the predicate or function is actually called.
(The Melbourne Mercury implementation provides this
with its @samp{--allow-stubs} option.
This can be useful during program development,
since it allows you to execute parts of a program
while the program's implementation is still incomplete.)

Note that a predicate defined using DCG notation (@pxref{Items})
will appear to be defined with two fewer arguments than it is declared with.
It will also appear to be called with two fewer arguments
when called from predicates defined using DCG notation.
However, when called from an ordinary predicate or function,
it must have all the arguments it was declared with.
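
The following sketch illustrates the difference.
The predicate @samp{greet/3} is a hypothetical example;
it is declared with three arguments,
defined using DCG notation with one,
and called from an ordinary clause with all three.

@example
:- pred greet(string, io, io).
:- mode greet(in, di, uo) is det.

% Defined with one visible argument:
% the DCG transformation supplies the two io arguments.
% The calls in the body also appear with two fewer arguments.
greet(Name) -->
    io.write_string("Hello, "),
    io.write_string(Name),
    io.nl.

% An ordinary clause must pass all three arguments
% (here !IO stands for the pair of io arguments).
main(!IO) :-
    greet("world", !IO).
@end example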

The compiler infers the types of expressions,
and in particular the types of variables
and overloaded constructors, functions, and predicates.
A @dfn{type assignment} is an assignment of a type to every variable,
and of a particular constructor, function, or predicate
to every name in a clause.
A type assignment is @dfn{valid} if it satisfies the following conditions.

Each constructor in a clause
must have been declared in at least one visible type declaration.
The type assigned to each constructor term
must match one of the type declarations for that constructor,
and the types assigned to the arguments of that constructor
must match the argument types specified in that type declaration.

The type assigned to each function call term
must match the return type of one of the @samp{:- func} declarations
for that function, and the types assigned to the arguments of that function
must match the argument types specified in that type declaration.

The type assigned to each predicate argument must match
the type specified in one of the @samp{:- pred} declarations
for that predicate.
The type assigned to each head argument in a predicate clause
must exactly match the argument type specified
in the corresponding @samp{:- pred} declaration.

The type assigned to each head argument in a function clause
must exactly match the argument type
specified in the corresponding @samp{:- func} declaration,
and the type assigned to the result term in a function clause
must exactly match the result type specified
in the corresponding @samp{:- func} declaration.

The type assigned to each expression with an explicit type qualification
(@pxref{Expressions})
must match the type specified by the type qualification
expression@footnote{The type of an explicitly
type qualified term may be an instance of the type specified by the
qualifier. This allows explicit type qualifications to constrain the
types of two expressions to be identical, without knowing the exact types
of the expressions. It also allows type qualifications to refer to the
types of the results of existentially typed predicates or functions.}.

(Here ``match'' means to be an instance of,
i.e.@: to be identical to for some substitution of the type parameters,
and ``exactly match'' means to be identical up to renaming of type parameters.)

One type assignment @var{A} is said to be
@dfn{more general} than another type assignment @var{B}
if there is a binding of the type parameters in @var{A}
that makes it identical (up to renaming of parameters) to @var{B}.
If there is more than one valid type assignment,
the compiler must choose the most general one.
If there are two valid type assignments which are not identical up to renaming
and neither of which is more general than the other,
then there is a type ambiguity, and the compiler must report an error.
A clause is @dfn{type-correct}
if there is a unique (up to renaming) most general valid type assignment.
Every clause in a Mercury program must be type-correct.
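
As a sketch of how a type ambiguity can arise,
consider two hypothetical types, @samp{colour} and @samp{signal},
that share the constructor name @samp{red}.
In the first clause below there are two valid type assignments,
neither more general than the other,
so we would expect the compiler to report an ambiguity.
In the second clause,
an explicit type qualification leaves only one valid assignment.

@example
:- type colour
    --->    red
    ;       green.

:- type signal
    --->    red
    ;       amber.

:- pred show_ambiguous(io::di, io::uo) is det.
:- pred show_resolved(io::di, io::uo) is det.

% X may have type colour or type signal,
% and io.write_line/3 accepts a value of any type.
show_ambiguous(!IO) :-
    X = red,
    io.write_line(X, !IO).

% The type qualification restricts X to type colour.
show_resolved(!IO) :-
    X = (red : colour),
    io.write_line(X, !IO).
@end example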

@node Field access functions
@section Field access functions

Fields of constructors of discriminated union types may be named
(@pxref{Discriminated unions}).
These names cause the compiler to generate functions
which can be used to select and update fields of a term
in a manner independent of the definition of the type.

The Mercury language includes syntactic sugar to make it more convenient
to select and update fields inside nested terms
(@pxref{Field access expressions})
and to select and update fields of the DCG arguments of a clause
(@pxref{Definite clause grammars}).

@menu
* Field selection::
* Field update::
* User-supplied field access function declarations::
* Field access examples::
@end menu

@node Field selection
@subsection Field selection

@example
@var{field}(@var{Term})
@end example

Each field name @samp{@var{field}} in a constructor
tells the compiler to generate
a field selection function @samp{@var{field}/1},
which takes an expression of the same type as the constructor
and returns the value of the named field,
failing if the top-level constructor of the argument
is not the constructor containing the field.

If the declaration of the field is in the interface section of the module,
the corresponding field selection function is also exported from the module.

By default, this function has no declared modes---the modes are inferred
at each call to the function.
However, the type and modes of this function may be explicitly declared,
in which case it will have only the declared modes.

To create a higher-order value from a field selection function,
an explicit lambda expression must be used,
unless a single mode declaration is supplied for the field selection function.
The reason for this is that normally,
field access functions are implemented directly as unifications,
without the code of a function being generated for them.
The declaration acts as the request for the generation of that code.
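
The two approaches can be sketched as follows.
The type @samp{person} and the functions @samp{names_lambda}
and @samp{names_direct} are hypothetical names used for illustration,
and the sketch assumes that the @code{list} module has been imported.

@example
:- type person
    --->    person(
                name :: string,
                age  :: int
            ).

% Approach 1: wrap the field selection in an explicit lambda expression.
:- func names_lambda(list(person)) = list(string).
names_lambda(People) = list.map((func(P) = P ^ name), People).

% Approach 2: supply a type declaration and a single mode declaration
% for the field selection function ...
:- func name(person) = string.
:- mode name(in) = out is det.

% ... after which name/1 can be passed as a higher-order value.
:- func names_direct(list(person)) = list(string).
names_direct(People) = list.map(name, People).
@end example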

@node Field update
@subsection Field update

@example
'@var{field} :='(@var{Term}, @var{ValueTerm})
@end example

Each field name @samp{@var{field}} in a constructor
tells the compiler to generate
a field update function @samp{'@var{field} :='/2}.
The first argument of this function
is an expression of the same type as the constructor.
The second argument
is an expression of the same type as the named field.
The return value is a copy of the first argument
with the value of the named field replaced by the second argument.
@samp{'@var{field} :='/2} fails
if the top-level constructor of the first argument
is not the constructor containing the named field.

If the declaration of the field is in the interface section of the module,
the corresponding field update function is also exported from the module.

By default, this function has no declared modes---the modes are inferred
at each call to the function.
However, the type and modes of this function may be explicitly declared,
in which case it will have only the declared modes.

To create a higher-order value from a field update function,
users must write an explicit lambda expression,
unless a single mode declaration is supplied for the field update function.
The reason for this is that normally, as with field selection functions,
field update functions are implemented directly as unifications,
without the code of a function being generated for them.
The declaration acts as the request
for the compiler to generate that function.

Some fields cannot be updated using field update functions.
For the constructor @samp{unsettable/2} below,
neither field may be updated
because the resulting term would not be well-typed.
A future release may allow multiple fields to be updated
by a single expression to avoid this problem.

@example
:- type unsettable
        --->    some [T] unsettable(
                    unsettable1 :: T,
                    unsettable2 :: T
                ).
@end example

@node User-supplied field access function declarations
@subsection User-supplied field access function declarations

Type and mode declarations for compiler-generated field access functions
for fields of constructors local to a module
may be placed in the interface section of the module.
The user-supplied declarations will be used instead of
any automatically generated declarations.
This allows the implementation of a type to be hidden
while still allowing client modules to use record syntax
to manipulate values of the type.
Supplying a type declaration and a single mode declaration
also allows higher-order terms to be created
from a field access function without using explicit lambda expressions.

If a field occurs in the interface section of a module,
then any declaration for a field access function for that field
must also occur in the interface section.

If there are multiple fields with the same name in the same module,
only one of those fields can have user-supplied declarations
for its selection function.
Similarly, only one of those fields can have user-supplied declarations
for its update function.

Declarations and clauses for field access functions can also be supplied
for fields which are not a part of any type.
This is useful when the data structures of a program change
so that a value which was previously stored as part of a type
is now computed each time it is requested.
It also allows record syntax to be used for type class methods.
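
For instance, in the following sketch
(the type @samp{rectangle} and the function @samp{area} are hypothetical,
and the @code{float} module is assumed to be imported),
@samp{area} is not a field of any constructor,
yet callers can still write @samp{R ^ area}.

@example
:- type rectangle
    --->    rectangle(
                width  :: float,
                height :: float
            ).

% area/1 is an ordinary function, computed on each request.
:- func area(rectangle) = float.
area(R) = R ^ width * R ^ height.

% Field selection syntax works for it all the same:
% R ^ area is another way of writing area(R).
:- pred is_large(rectangle::in) is semidet.
is_large(R) :-
    R ^ area > 100.0.
@end example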

User-declared field access functions may take extra arguments.
For example, the Mercury standard library module @code{map}
contains the following functions:
@example
:- func elem(K, map(K, V)) = V is semidet.
:- func 'elem :='(K, map(K, V), V) = map(K, V).
@end example
Field access syntax may be used
at the top-level of @code{func} and @code{mode} declarations
and in the head of clauses.
For instance:
@example
:- func map(K, V) ^ elem(K) = V.
:- mode in        ^ in      = out is semidet.
Map ^ elem(Key) = map.search(Map, Key).

:- func (map(K, V) ^ elem(K) := V)  = map(K, V).
:- mode (in        ^ in      := in) = out is semidet.
(Map ^ elem(Key) := Value) = map.set(Map, Key, Value).
@end example

The Mercury standard library modules @code{array} and @code{bt_array}
define similar functions.

@node Field access examples
@subsection Field access examples

The examples make use of the following type declarations:

@example
:- type type1
        --->    type1(
                    field1 :: type2,
                    field2 :: string
                ).

:- type type2
        --->    type2(
                    field3 :: int,
                    field4 :: int
                ).
@end example

The compiler generates some field access functions for @samp{field1}.
The functions generated for the other fields are similar.

@example
:- func type1 ^ field1 = type2.
type1(Field1, _) ^ field1 = Field1.

:- func (type1 ^ field1 := type2) = type1.
(type1(_, Field2) ^ field1 := Field1) = type1(Field1, Field2).
@end example

Using these functions and the syntactic sugar described in
@ref{Field access expressions},
programmers can write code such as

@example
:- func type1 ^ increment_field3 = type1.

Term0 ^ increment_field3 =
    Term0 ^ field1 ^ field3 := Term0 ^ field1 ^ field3 + 1.
@end example

The compiler expands this into

@example
increment_field3(Term0) = Term :-
    OldField3 = field3(field1(Term0)),
    OldField1 = field1(Term0),
    NewField1 = 'field3 :='(OldField1, OldField3 + 1),
    Term = 'field1 :='(Term0, NewField1).
@end example

The field access functions defined
in the Mercury standard library module @samp{map}
can be used as follows:

@example
:- func update_field_in_map(map(int, type1), int, string)
    = map(int, type1) is semidet.

update_field_in_map(Map, Index, Value) =
    Map ^ elem(Index) ^ field2 := Value.
@end example

@node The standard ordering
@section The standard ordering

For (almost) every Mercury type there exists a standard ordering;
any two values of the same type can be compared under this ordering
by using the @code{builtin.compare/3} predicate.
The ordering is total, meaning that the corresponding binary relations
are reflexive, transitive and anti-symmetric.
The one exception is higher-order types,
which cannot be unified or compared;
any attempt to do so will raise an exception.

The existence of this ordering makes it possible to implement
generic data structures such as sets and maps,
without needing to know the specifics of the ordering.
Furthermore, different platforms often have their own natural orderings
which are not necessarily consistent with each other.
As such, the standard ordering for most types is not fully defined.

For the primitive integer types,
the standard ordering is the usual numerical ordering.
Implementations should reject code containing overflowing integer literals.

For the primitive type @code{float},
the standard ordering approximates the usual numerical ordering.
If the result of @code{builtin.compare/3} is @code{(<)} or @code{(>)}
then this relation holds in the numerical ordering,
but this is not necessarily the case for @code{(=)} due to lack of precision.
In the standard ordering, ``negative'' and ``positive'' zero values are equal.
Implementations should replace overflowing literals
with the infinity of the same sign;
in the standard ordering positive infinity is greater than all finite values
and negative infinity is less than all finite values.
Implementations must throw an exception when comparing
a ``not a number'' (NaN) value.

For the primitive type @code{char},
the standard ordering is
the numerical ordering of the Unicode code point values.

For the primitive type @code{string},
the standard ordering is implementation dependent.
The current implementation performs string comparison using
the C @code{strcmp()} function,
the Java @code{String.compareTo()} method, and
the C# @code{System.String.CompareOrdinal()} method,
when compiling to C, Java and C# respectively.

For tuple types, corresponding arguments are compared,
with the first argument being the most significant,
then the second, and so on.

For discriminated union types (other than subtypes),
if both values have the same principal constructor
then corresponding arguments are compared in order,
with the first argument being the most significant,
then the second, and so on.
If the values have different principal constructors,
then the value whose principal constructor
is listed first in the definition of the type
will compare as less than the other value.
There is one exception from this rule:
in types that are subject to a @code{foreign_enum} pragma,
the outcomes of comparisons are decided
by the user's chosen foreign language representations,
using the rules of the foreign language.

For subtypes, the two values compare as though
converted to the base type.
The ordering of constructors in a subtype definition
does not affect the standard ordering.
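
The rules for discriminated unions, tuples and subtypes
can be sketched with a small example.
The types @samp{suit} and @samp{red_suit}
and the predicate @samp{ordering_examples} are hypothetical;
under the rules above,
each of the three calls below would be expected to succeed.

@example
:- type suit
    --->    clubs
    ;       diamonds
    ;       hearts
    ;       spades.

:- type red_suit =< suit
    --->    hearts
    ;       diamonds.

:- pred ordering_examples is semidet.

ordering_examples :-
    % clubs is listed before spades in the definition of suit.
    compare((<), clubs, spades),
    % Tuples: the first argument is the most significant.
    compare((<), @{1, "z"@}, @{2, "a"@}),
    % Subtypes: values compare as in the base type suit,
    % so diamonds is less than hearts,
    % whatever the order of constructors in red_suit.
    compare((<), (diamonds : red_suit), (hearts : red_suit)).
@end example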

@node Modes
@chapter Modes

@menu
* Insts modes and mode definitions::
* Predicate and function mode declarations::
* Constrained polymorphic modes::
* Different clauses for different modes::
@end menu

@node Insts modes and mode definitions
@section Insts, modes, and mode definitions

The @dfn{mode} of a predicate, or function, is a mapping
from the initial state of instantiation of the arguments of the predicate,
or the arguments and result of a function,
to their final state of instantiation.
To describe states of instantiation,
we use information provided by the type system.
Types can be viewed as regular trees with two kinds of nodes:
or-nodes representing types
and and-nodes representing constructors.
The children of an or-node are the constructors
that can be used to construct terms of that type;
the children of an and-node are the types
of the arguments of the constructors.
We attach mode information to the or-nodes of type trees.

An @dfn{instantiatedness tree}
is an assignment of an @dfn{instantiatedness}
--- either @dfn{free} or @dfn{bound} ---
to each or-node of a type tree,
with the constraint that all descendants of a free node must be free.

A term is @dfn{approximated by} an instantiatedness tree
if for every node in the instantiatedness tree,

@itemize @bullet
@item
if the node is ``free'',
then the corresponding node in the term (if any)
is a free variable that does not share with any other variable
(we call such variables @dfn{distinct});

@item
if the node is ``bound'',
then the corresponding node in the term (if any)
is a function symbol.

@end itemize

When an instantiatedness tree tells us that a variable is bound,
there may be several alternative function symbols to which it could be bound.
The instantiatedness tree does not tell us which of these it is bound to;
instead for each possible function symbol it tells us exactly
which arguments of the function symbol will be free and which will be bound.
The same principle applies recursively to these bound arguments.

Mercury's mode system allows users
to declare names for instantiatedness trees using declarations such as

@example
:- inst listskel == bound([] ; [free | listskel]).
@end example

This instantiatedness tree describes lists
whose skeleton is known but whose elements are distinct variables.
As such, it approximates the term @code{[A,B]}
but not the term @code{[H|T]} (only part of the skeleton is known),
the term @code{[A,2]} (not all elements are variables),
or the term @code{[A,A]} (the elements are not distinct variables).

As a shorthand, the mode system provides @code{free} and @code{ground}
as names for instantiatedness trees
all of whose nodes are free and bound respectively
(with the exception of solver type values
which may be semantically ground,
but be defined in terms of non-ground solver type values;
see @ref{Solver types} for more detail).
The shape of these trees is determined
by the type of the variable to which they apply.

A more concise, alternative syntax exists
for @code{bound} instantiatedness trees:

@example
:- inst maybeskel
    --->    no
    ;       yes(free).
@end example

@noindent
which is equivalent to writing

@example
:- inst maybeskel == bound(no ; yes(free)).
@end example

You can specify what type (actually what type constructor)
an inst is intended to be used on
by adding @code{for}, followed by the name and arity of that type constructor,
after the name of the inst, like this:
@example
:- inst maybeskel for maybe/1
    --->    no
    ;       yes(free).
@end example

@noindent
This can be useful documentation,
and the compiler will generate an error message
when an inst that was declared to be for values of one type constructor
is applied to values of another type constructor.

As execution proceeds, variables may become more instantiated.
A @dfn{mode mapping} is a mapping
from an initial instantiatedness tree to a final instantiatedness tree,
with the constraint that no node of the type tree
is transformed from bound to free.
Mercury allows the user to specify mode mappings directly
by expressions such as @code{inst1 >> inst2},
or to give them a name using declarations such as

@example
:- mode m == inst1 >> inst2.
@end example

Two standard shorthand modes are provided,
corresponding to the standard notions of inputs and outputs:

@example
:- mode in == ground >> ground.
:- mode out == free >> ground.
@end example

Though we do not recommend this,
Prolog fans who want to use the symbols @samp{+} and @samp{-}
can do so by simply defining them using a mode declaration:

@example
:- mode (+) == in.
:- mode (-) == out.
@end example

These two modes are enough for most functions and predicates.
Nevertheless, Mercury's mode system is sufficiently
expressive to handle more complex data-flow patterns,
including those involving partially instantiated data structures,
i.e.@: terms that contain both function symbols and free variables,
such as @samp{f(a, b, X)}.
In the current implementation,
partially instantiated data structures are unsupported
due to a lack of alias tracking in the mode system.
For more information,
please see the @file{LIMITATIONS.md} file distributed with Mercury.

For example, consider an interface to a database
that associates data with keys,
and provides read and write access to the items it stores.
To represent accesses to the database over a network,
you would need declarations such as

@example
:- type operation
     --->   lookup(key, data)
     ;      set(key, data).
:- inst request for operation/0
    --->    lookup(ground, free)
    ;       set(ground, ground).
:- mode create_request == free >> request.
:- mode satisfy_request == request >> ground.
@end example

@samp{inst} and @samp{mode} declarations can be parametric.
For example, the following declaration

@example
:- inst listskel(Inst) for list/1
    --->    []
    ;       [Inst | listskel(Inst)].
@end example

@noindent
defines the inst @samp{listskel(Inst)} to be a list skeleton
whose elements have inst @code{Inst};
you can then use insts such as @samp{listskel(listskel(free))},
which represents the instantiation state of a list of lists of free variables.
The standard library provides the parametric modes

@example
:- mode in(Inst) == Inst >> Inst.
:- mode out(Inst) == free >> Inst.
@end example

@noindent
so that for example the mode @samp{create_request} defined above
could have been defined as

@example
:- mode create_request == out(request).
@end example

There must not be more than one inst definition
with the same name and arity in the same module.
Similarly, there must not be more than one mode definition
with the same name and arity in the same module.

Note that user-defined insts and modes may not have names
that have meanings in Mercury.
(Most of these are documented in later sections.)

The list of reserved inst names is
@example
=<
any
bound
bound_unique
clobbered
clobbered_any
free
ground
is
mostly_clobbered
mostly_unique
mostly_unique_any
not_reached
unique
unique_any
@end example

The list of reserved mode names is
@example
=
>>
any_func
any_pred
func
is
pred
@end example

@node Predicate and function mode declarations
@section Predicate and function mode declarations

A @dfn{predicate mode declaration}
assigns a mode mapping to each argument of a predicate.
A @dfn{function mode declaration}
assigns a mode mapping to each argument of a function,
and a mode mapping to the function result.
Each mode of a predicate or function is called a @dfn{procedure}.
For example, given the mode names defined by

@example
:- mode out_listskel == free >> listskel.
:- mode in_listskel == listskel >> listskel.
@end example

@noindent
the (type and) mode declarations
of the function @samp{length} and predicate @samp{append} are as follows:

@example
:- func length(list(T)) = int.
:- mode length(in_listskel) = out.
:- mode length(out_listskel) = in.

:- pred append(list(T), list(T), list(T)).
:- mode append(in, in, out).
:- mode append(out, out, in).
@end example

Note that functions may have more than one mode, just like predicates;
functions can be reversible.

Alternatively, the mode declarations for @samp{length}
could use the standard library modes @samp{in/1} and @samp{out/1}:

@example
:- func length(list(T)) = int.
:- mode length(in(listskel)) = out.
:- mode length(out(listskel)) = in.
@end example

As for type declarations,
a predicate or function can be defined
to have a given higher-order inst (@pxref{Higher-order insts and modes})
by using @code{`with_inst`} in the mode declaration.

For example,

@example
:- inst foldl_pred == (pred(in, in, out) is det).
:- inst foldl_func == (func(in, in) = out is det).

:- mode p(in) `with_inst` foldl_pred.
:- mode f(in) `with_inst` foldl_func.
@end example

@noindent
is equivalent to

@example
:- mode p(in, in, in, out) is det.
:- mode f(in, in, in) = out is det.
@end example

@noindent
(@samp{is det} is explained in @ref{Determinism}.)

If a predicate or function has only one mode,
the @samp{pred} and @samp{mode} declaration can be combined:

@example
:- func length(list(T)::in) = (int::out).
:- pred append(list(T)::in, list(T)::in, list(T)::out).

:- pred p `with_type` foldl_pred(T, U) `with_inst` foldl_pred.
@end example

It is an error for a predicate or function
whose @samp{pred} and @samp{mode} declarations are so combined
to have any other separate @samp{mode} declarations.
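
For instance, under this rule
we would expect the following pair of declarations to be rejected.
To give @samp{append} a second mode,
the type declaration and both mode declarations
would have to be written separately.

@example
:- pred append(list(T)::in, list(T)::in, list(T)::out).

% Error: append/3 already has a combined declaration,
% so it cannot also have a separate mode declaration.
:- mode append(out, out, in).
@end example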

If there is no mode declaration for a function,
the compiler assumes a default mode for the function
in which all the arguments have mode @code{in}
and the result of the function has mode @code{out}.
(However, there is no requirement that a function have such a mode;
if there is any explicit mode declaration, it overrides the default.)
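
As a small sketch, the hypothetical function @samp{double} below
has no mode declaration,
so it is treated as though the commented-out declaration had been written.
(The @samp{is det} part reflects the default determinism for functions;
see @ref{Determinism}.)

@example
:- func double(int) = int.

% Assumed by default, since no mode declaration is given:
% :- mode double(in) = out is det.

double(X) = 2 * X.
@end example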

If a predicate or function type declaration
occurs in the interface section of a module,
then all mode declarations for that predicate or function
must occur in the interface section of the @emph{same} module.
Likewise, if a predicate or function type declaration
occurs in the implementation section of a module,
then all mode declarations for that predicate or function
must occur in the implementation section of the @emph{same} module.
Therefore, it is an error
for a predicate or function to have mode declarations
in both the interface and implementation sections of a module.

A function or predicate mode declaration is an assertion by the programmer
that for all possible argument terms and (if applicable) result term
for the function or predicate
that are approximated (in our technical sense)
by the initial instantiatedness trees of the mode declaration
and all of whose free variables are distinct,
if the function or predicate succeeds, then
the resulting binding of those argument terms and (if applicable)
result term will in turn be approximated
by the final instantiatedness trees of the mode declaration,
with all free variables again being distinct.
We refer to such assertions as @dfn{mode declaration constraints}.
These assertions are checked by the compiler, which rejects programs
if it cannot prove that their mode declaration constraints are satisfied.

Note that with the usual definition of @samp{append}, the mode

@example
:- mode append(in_listskel, in_listskel, out_listskel).
@end example

@noindent
would not be allowed,
since it would create aliasing between the different arguments ---
on success of the predicate, the list elements would be free variables,
but they would not be distinct.

In Mercury it is always possible to call a procedure with an argument
that is more bound than the initial inst specified for that argument
in the procedure's mode declaration.
In such cases, the compiler will insert additional unifications
to ensure that the argument actually passed to the procedure
will have the inst specified.
For example, if the predicate @code{p/1} has mode @samp{p(out)},
you can still call @samp{p(X)} if @code{X} is ground.
The compiler will transform this code to @samp{p(Y), X = Y}
where @code{Y} is a fresh variable.
It is almost as if the predicate @code{p/1} has another mode @samp{p(in)};
we call such modes ``implied modes''.
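
A minimal sketch of an implied mode in use
(the predicates @code{answer/1} and @code{is_answer/1} are hypothetical):

@example
:- pred answer(int::out) is det.
answer(42).

:- pred is_answer(int::in) is semidet.
is_answer(X) :-
    % X is already ground here, so this call uses the implied mode
    % answer(in); it is treated as: answer(Y), X = Y.
    answer(X).
@end example

@noindent
Because the inserted unification @samp{X = Y} is a test that can fail,
the call is semideterministic
even though @code{answer/1} itself is declared @code{det}.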

To make this concept precise, we introduce the following definition.
A term @dfn{satisfies} an instantiatedness tree
if for every node in the instantiatedness tree,

@itemize @bullet
@item
if the node is ``free'',
then the corresponding node in the term (if any)
is either a distinct free variable,
or a function symbol.

@item
if the node is ``bound'',
then the corresponding node in the term (if any)
is a function symbol.

@end itemize

The @dfn{mode set} for a predicate or function
is the set of mode declarations for the predicate or function.
A mode set is an assertion by the programmer
that the predicate should only be called with argument terms
that satisfy the initial instantiatedness trees
of one of the mode declarations in the set
(i.e.@: the specified modes and the modes they imply
are the only allowed modes for this predicate or function).
We refer to the assertion associated with a mode set
as the @dfn{mode set constraint};
these are also checked by the compiler.

A predicate or function @var{p} is
@dfn{well-moded with respect to a given mode declaration}
if given that the predicates and functions called by @var{p}
all satisfy their mode declaration constraints,
there exists an ordering of the conjuncts in each conjunction
in the clauses of @var{p} such that

@itemize @bullet
@item
@var{p} satisfies its mode declaration constraint, and
@item
@var{p} satisfies the mode set constraint of all of the predicates and
functions it calls
@end itemize

We say that a predicate or function is well-moded
if it is well-moded with respect to
all the mode declarations in its mode set,
and we say that a program is well-moded
if all its predicates and functions are well-moded.

The mode analysis algorithm checks one procedure at a time.
It abstractly interprets the definition of the predicate or function,
keeping track of the instantiatedness of each variable,
and selecting a mode for each call and unification in the definition.
To ensure that
the mode set constraints of called predicates and functions are satisfied,
the compiler may reorder the elements of conjunctions;
it reports an error if no satisfactory order exists.
Finally it checks that
the resulting instantiatedness of the procedure's arguments
is the same as the one given by the procedure's declaration.
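
As a small illustration of reordering
(a sketch; @code{double_sum/3} is a hypothetical predicate,
and the @samp{int} module is assumed to be imported),
the conjuncts below are written in an order
that cannot be executed from left to right:

@example
% Assumes: :- import_module int.
:- pred double_sum(int::in, int::in, int::out) is det.
double_sum(A, B, Result) :-
    Result = 2 * Sum,   % Sum is still free at this point,
    Sum = A + B.        % so this conjunct has to be scheduled first.
@end example

@noindent
Mode analysis should accept this clause
by scheduling the second conjunct before the first.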

The mode analysis algorithm annotates each call with the mode used.

@node Constrained polymorphic modes
@section Constrained polymorphic modes

Mode declarations for predicates and functions may also have inst parameters.
However, such parameters must be constrained
to be @emph{compatible} with some other inst.
In a predicate or function mode declaration,
an inst of the form @samp{@var{InstParam} =< @var{Inst}},
where @var{InstParam} is a variable and @var{Inst} is an inst,
states that
@var{InstParam} is constrained to be @emph{compatible} with @var{Inst},
that is,
@var{InstParam} represents some inst
that can be used anywhere where @var{Inst} is required.
If an inst parameter occurs more than once in a declaration,
it must have the same constraint on each occurrence.

For example, in the mode declaration
@example
:- mode append(in(list_skel(I =< ground)), in(list_skel(I =< ground)),
    out(list_skel(I =< ground))) is det.
@end example
@noindent
@code{I} is an inst parameter which is constrained to be ground.
If @samp{append} is called with the first two arguments
having an inst of, say, @samp{list_skel(bound(f))},
then after @samp{append} returns,
all three arguments will have inst @samp{list_skel(bound(f))}.
If the mode of append had been simply
@example
:- mode append(in(list_skel(ground)), in(list_skel(ground)),
    out(list_skel(ground))) is det.
@end example
@noindent
then we would only have been able to infer an inst of @samp{list_skel(ground)}
for the third argument, not the more specific inst.

Note that attempting to call @samp{append}
when the first two arguments do not have ground insts
(e.g.@: @samp{list_skel(bound(g(free)))})
is a mode error because it violates the constraint on the inst parameter.

To avoid having to repeat a constraint everywhere that an inst parameter occurs,
it is possible to list the constraints after the rest of the mode declaration,
following a @samp{<=}.
E.g.@: the above example could have been written as
@example
:- (mode append(in(list_skel(I)), in(list_skel(I)),
    out(list_skel(I))) is det) <= I =< ground.
@end example

Note that in the current Mercury implementation
this syntax requires parentheses
around the @samp{mode(@dots{}) is @var{Det}} part of the declaration.

Also, if the constraint on an inst parameter is @samp{ground}
then it is not necessary to give the constraint in the declaration.
The example can be further shortened to
@example
:- mode append(in(list_skel(I)), in(list_skel(I)), out(list_skel(I)))
    is det.
@end example

Constrained polymorphic modes are particularly useful
when passing objects with higher-order types to polymorphic predicates,
since they allow the higher-order mode information to be retained
(@pxref{Higher-order}).
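
The following sketch illustrates the point
(the predicates @code{same/2} and @code{add_one/2} are hypothetical,
and the @samp{int} module is assumed to be imported).
Because @code{same/2} uses an inst parameter,
the higher-order inst of @code{P0} should be carried over to @code{P},
which is what allows @code{P} to be called:

@example
% Assumes: :- import_module int.
:- pred same(T, T).
:- mode same(in(I), out(I)) is det.
same(X, X).

:- pred add_one(int::in, int::out) is det.
add_one(A, B) :-
    P0 = (pred(X::in, Y::out) is det :- Y = X + 1),
    same(P0, P),    % P keeps the inst (pred(in, out) is det).
    P(A, B).
@end example

@noindent
Had @code{same/2} been declared with the mode @samp{same(in, out)} instead,
@code{P} would only be known to be ground,
and the call @samp{P(A, B)} would be expected to be rejected as a mode error.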

@node Different clauses for different modes
@section Different clauses for different modes

Because the compiler automatically reorders conjunctions to satisfy the modes,
it is often possible for a single clause to satisfy different modes.
However, occasionally reordering of conjunctions is not sufficient;
you may want to write different code for different modes.

For example, the usual code for list append

@example
append([], Ys, Ys).
append([X|Xs], Ys, [X|Zs]) :-
    append(Xs, Ys, Zs).
@end example

@noindent
works fine in most modes,
but is not very satisfactory for the @samp{append(out, in, in)} mode of append,
because although every call in this mode has at most one solution,
the compiler's determinism inference will not be able to infer that.
This means that using the usual code
for append in this mode will be inefficient,
and the overly conservative determinism inference
may cause spurious determinism errors later.

For this mode, it is better to use a completely different algorithm:

@example
append(Prefix, Suffix, List) :-
    list.length(List, ListLength),
    list.length(Suffix, SuffixLength),
    PrefixLength = ListLength - SuffixLength,
    list.split_list(PrefixLength, List, Prefix, Suffix).
@end example

@noindent
However, that code does not work in the other modes of @samp{append}.

To handle such cases, you can use mode annotations on clauses,
which indicate that particular clauses
should only be used for particular modes.
To specify that a clause only applies to a given mode,
each argument @var{Arg} of the clause head
should be annotated with the corresponding argument mode @var{Mode},
using the @samp{::} mode qualification operator,
i.e.@: @samp{@var{Arg} :: @var{Mode}}.

For example, if @samp{append} was declared as
@example
@group
:- pred append(list(T), list(T), list(T)).
:- mode append(in, in, out).
:- mode append(out, out, in).
:- mode append(in, out, in).
:- mode append(out, in, in).
@end group
@end example

@noindent
then you could implement it as

@example
@group
append(L1::in,  L2::in,  L3::out) :- usual_append(L1, L2, L3).
append(L1::out, L2::out, L3::in)  :- usual_append(L1, L2, L3).
append(L1::in,  L2::out, L3::in)  :- usual_append(L1, L2, L3).
append(L1::out, L2::in,  L3::in)  :- other_append(L1, L2, L3).

usual_append([], Ys, Ys).
usual_append([X|Xs], Ys, [X|Zs]) :- usual_append(Xs, Ys, Zs).

other_append(Prefix, Suffix, List) :-
    list.length(List, ListLength),
    list.length(Suffix, SuffixLength),
    PrefixLength = ListLength - SuffixLength,
    list.split_list(PrefixLength, List, Prefix, Suffix).
@end group
@end example

This language feature can be used to write ``impure'' code
that does not have any consistent declarative semantics.
For example, you can easily use it
to write something similar to Prolog's (in)famous @samp{var/1} predicate:

@example
:- mode var(in).
:- mode var(free>>free).
var(_::in) :- fail.
var(_::free>>free) :- true.
@end example

@noindent
As you can see, in this case the two clauses are @emph{not} equivalent.

Because of this possibility,
predicates or functions which are defined
using different code for different modes
are by default assumed to be impure;
the programmer must either
(1) carefully ensure that the logical meaning of the clauses
is the same for all modes,
which can be declared to the compiler
through a @samp{pragma promise_equivalent_clauses} declaration,
or a @samp{pragma promise_pure} declaration,
or (2) declare the predicate or function as impure.
@xref{Impurity}.

In the example with @samp{append} above,
the two ways of implementing append do have the same declarative semantics,
so we can safely use the first approach:

@example
:- pragma promise_equivalent_clauses(append/3).
@end example

The pragma

@example
:- pragma promise_pure(append/3).
@end example

would also promise that the clauses are equivalent,
but on top of that would also promise that the code of each clause is pure.
Sometimes, if some clauses contain impure code,
that is a promise that the programmer wants to make,
but this extra promise is unnecessary in this case.

In the example with @samp{var/1} above,
the two clauses have different semantics,
so the predicate must be declared as impure:

@example
:- impure pred var(T).
@end example

@node Unique modes
@chapter Unique modes

Mode declarations can also specify so-called ``unique modes''.
Mercury's unique modes are similar
to ``linear types'' in some functional programming languages such as Clean.
They allow you to specify
when there is only one reference to a particular value,
and when there will be no more references to that value.
If the compiler knows there will be no more references to a value,
it can perform ``compile-time garbage collection''
by automatically inserting code
to deallocate the storage associated with that value.
Even more importantly,
the compiler can also simply reuse the storage immediately,
for example by destructively updating one element of an array
rather than making a new copy of the entire array
in order to change one element.
Unique modes are also the mechanism Mercury uses to provide declarative I/O.

We have not yet implemented unique modes fully,
and the details are still in a state of flux.
So the following should be considered tentative.

@menu
* Destructive update::
* Backtrackable destructive update::
* Limitations of the current implementation::
@end menu

@node Destructive update
@section Destructive update

In addition to the insts mentioned above
(@code{free}, @code{ground}, and @code{bound(@dots{})}),
Mercury also provides ``unique'' insts
@code{unique} and @code{unique(@dots{})}
which are like @code{ground} and @code{bound(@dots{})} respectively,
except that they carry the additional constraint that
there can only be one reference to the corresponding value.
There is also an inst @code{dead} which means that
there are no references to the corresponding value,
so the compiler is free to generate code that reuses that value.
There are three standard modes for manipulating unique values:

@example
% unique output
:- mode uo == free >> unique.

% unique input
:- mode ui == unique >> unique.

% destructive input
:- mode di == unique >> dead.
@end example

Mode @code{uo} is used to create a unique value.
Mode @code{ui} is used to inspect a unique value without losing its uniqueness.
Mode @code{di} is used to deallocate or reuse the memory
occupied by a value that will not be used.
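
A familiar use of these modes is the threading of the I/O state.
As a sketch
(the predicate @code{greet/2} is hypothetical,
and the @samp{io} module is assumed to be imported):

@example
% Assumes: :- import_module io.
:- pred greet(io::di, io::uo) is det.
greet(IO0, IO) :-
    % IO0 is dead after the first call, and IO1 after the second;
    % each I/O state is referred to exactly once after it is created.
    io.write_string("Hello, ", IO0, IO1),
    io.write_string("world!\n", IO1, IO).
@end example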

Note that a value is not considered @code{unique}
if it might be needed on backtracking.
This means that unique modes are generally only useful
for code whose determinism is @code{det} or @code{cc_multi}
(@pxref{Determinism}).

Unlike @code{bound} instantiatedness trees,
there is no alternative syntax for @code{unique} instantiatedness trees.

@node Backtrackable destructive update
@section Backtrackable destructive update

@quotation
``Well it just so happens that your friend here is only @emph{mostly} dead.
@*There's a big difference between mostly dead and all dead@dots{}
@*Now, mostly dead is slightly alive.
@*Now, all dead --- well, with all dead, there's usually only
one thing that you can do.''

``What's that?''

``Go through his clothes and look for loose change!''

--- from the movie ``The Princess Bride''.
@end quotation

To allow for backtrackable destructive updates
--- that is, updates whose effect is undone on backtracking,
perhaps by recording the overwritten values on a ``trail''
so that they can be restored after backtracking ---
Mercury also provides ``mostly unique'' modes.
The insts @code{mostly_unique} and @code{mostly_dead}
are equivalent to @code{unique} and @code{dead},
except that only references which will be encountered during forward execution
are counted ---
it is OK for @code{mostly_unique} or @code{mostly_dead} values
to be needed again on backtracking.

Mercury defines some standard modes
for manipulating ``mostly unique'' values, just as it does for unique values:

@example
% mostly unique output
:- mode muo == free >> mostly_unique.

% mostly unique input
:- mode mui == mostly_unique >> mostly_unique.

% mostly destructive input
:- mode mdi == mostly_unique >> mostly_dead.
@end example

@node Limitations of the current implementation
@section Limitations of the current implementation

The implementation of the mode analysis algorithm is not quite complete;
as a result, it is not possible to use nested unique modes,
i.e.@: modes in which anything but the top level of a variable is unique.
If you do, you will get unique mode errors
when you try to get a unique field of a unique data structure.
It is also not possible to use unique-input modes;
only destructive-input and unique-output modes work.

The Mercury compiler does not (yet) reuse @code{dead} values.
The only destructive update in the current implementation occurs
in library modules, e.g.@: for I/O and arrays.
We do however plan to implement structure reuse
and compile-time garbage collection in the future.

@node Determinism
@chapter Determinism

@menu
* Determinism categories::
* Determinism checking and inference::
* Replacing compile-time checking with run-time checking::
* Interfacing nondeterministic code with the real world::
* Committed choice nondeterminism::
@end menu

@node Determinism categories
@section Determinism categories

For each mode of a predicate or function,
we categorise that mode according to how many times it can succeed,
and whether or not it can fail before producing its first solution.

If all possible calls to a particular mode of a predicate or function
which return to the caller
(calls which terminate,
do not throw an exception
and do not cause a fatal runtime error)

@itemize @bullet
@item
have exactly one solution,
then that mode is @dfn{deterministic} (@code{det});

@item
either have no solutions or have one solution,
then that mode is @dfn{semideterministic} (@code{semidet});

@item
have at least one solution but may have more,
then that mode is @dfn{multisolution} (@code{multi});

@item
have zero or more solutions,
then that mode is @dfn{nondeterministic} (@code{nondet});

@item
fail without producing a solution,
then that mode has a determinism of @code{failure}.
@end itemize

If no possible calls to a particular mode of a predicate or function
can return to the caller,
then that mode has a determinism of @code{erroneous}.

The determinism annotation @code{erroneous} is used on the library predicates
@samp{require.error/1} and @samp{exception.throw/1},
but apart from that,
determinism annotations @code{erroneous} and @code{failure}
are generally not needed.

To summarize:

@example
                Maximum number of solutions
Can fail?       0               1               > 1
no              erroneous       det             multi
yes             failure         semidet         nondet
@end example

(Note: the ``Can fail?'' column here indicates only whether the procedure
can fail before producing at least one solution;
attempts to find a @emph{second} solution to a particular call,
e.g.@: for a procedure with determinism @code{multi},
are always allowed to fail.)
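
As a rough illustration of the four most common categories
(the predicates below are hypothetical,
and the @samp{int} and @samp{list} modules are assumed to be imported):

@example
% Assumes: :- import_module int, list.

% Exactly one solution.
:- pred one(int::out) is det.
one(1).

% At most one solution; fails for X >= 10.
:- pred small(int::in) is semidet.
small(X) :- X < 10.

% At least one solution, possibly more.
:- pred digit(int::out) is multi.
digit(1).
digit(2).

% Zero or more solutions; fails for the empty list.
:- pred element(list(int)::in, int::out) is nondet.
element([X | _], X).
element([_ | Xs], X) :- element(Xs, X).
@end example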

The determinism of each mode of a predicate or function
is indicated by an annotation on the mode declaration.
For example:

@example
:- pred append(list(T), list(T), list(T)).
:- mode append(in, in, out) is det.
:- mode append(out, out, in) is multi.
:- mode append(in, in, in) is semidet.

:- func length(list(T)) = int.
:- mode length(in) = out is det.
:- mode length(in(list_skel)) = out is det.
:- mode length(in) = in is semidet.
@end example

An annotation of @code{det} or @code{multi} is an assertion
that for every value of each of the inputs,
there exists at least one value of the outputs
for which the predicate is true,
or (in the case of functions)
for which the function term is equal to the result term.
Conversely, an annotation of @code{det} or @code{semidet} is an assertion
that for every value of each of the inputs,
there exists at most one value of the outputs for which the predicate is true,
or (in the case of functions)
for which the function term is equal to the result term.
These assertions are called the @dfn{mode-determinism assertions};
aside from assisting in reasoning about the code,
they may allow an implementation to perform
optimizations that would not otherwise be allowed,
such as optimizing away a goal with no outputs
even though it might throw an exception (@pxref{Exception handling}),
contain a trace goal (@pxref{Trace goals}),
or infinitely loop.
In some cases these optimizations may not be desirable;
the strict sequential semantics guarantees that they will not be performed
(@pxref{Formal semantics}).

If the mode of the predicate is given in the @code{:- pred} declaration
rather than in a separate @code{:- mode} declaration,
then the determinism annotation goes on the @code{:- pred} declaration
(and similarly for functions).
In particular, this is necessary
if a predicate does not have any argument variables.
If the determinism declaration is given on a @code{:- func} declaration
without the mode, the function is assumed to have the default mode
(see @ref{Modes} for more information on default modes of functions).

For example:

@example
:- pred loop(int::in) is erroneous.
loop(X) :- loop(X).

:- pred p is det.
p.

:- pred q is failure.
q :- fail.
@end example

If there is no mode declaration for a function,
then the default mode for that function
is considered to have been declared as @code{det}.
If you want to write a partial function,
i.e.@: one whose determinism is @code{semidet},
then you must explicitly declare the mode and determinism.
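
For instance, a partial function might be declared as follows
(a sketch; @code{first_elem/1} is a hypothetical function,
and the @samp{list} module is assumed to be imported):

@example
% Assumes: :- import_module list.
% The default mode (in) = out is used, with determinism semidet.
:- func first_elem(list(T)) = T is semidet.
first_elem([X | _]) = X.
@end example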

In Mercury, a function is supposed to be
a true mathematical function of its arguments;
that is, the value of the function's result
should be determined only by the values of its arguments.
Hence, for any mode of a function
that specifies that all the arguments are fully input
(i.e.@: for which the initial inst of all the arguments is a ground inst),
the determinism of that mode can only be
@code{det}, @code{semidet}, @code{erroneous}, or @code{failure}.

The determinism categories form this lattice:

@example
             erroneous
              /     \
          failure   det
             \     /   \
             semidet  multi
                 \     /
                  nondet
@end example

The higher up this lattice a determinism category is,
the more the compiler knows about the number of solutions
of procedures of that determinism.

@node Determinism checking and inference
@section Determinism checking and inference

The determinism of goals
is inferred from the determinism of their component parts,
according to the rules below.
The inferred determinism of a procedure is just the inferred
determinism of the procedure's body.

For procedures that are local to a module,
the determinism annotations may be omitted;
in that case, their determinism will be inferred.
(To be precise, the determinism of procedures without a determinism annotation
is defined as the least fixpoint of the transformation which,
given an initial assignment
of the determinism @code{det} to all such procedures,
applies those rules to infer
a new determinism assignment for those procedures.)

It is an error to omit the determinism annotation
for procedures that are exported from their containing module.

If a determinism annotation is supplied for a procedure,
the declared determinism is compared against the inferred determinism.
If the declared determinism is greater than or not comparable to the
inferred determinism (in the partial ordering above), it is an error.
If the declared determinism is less than the inferred determinism,
it is not an error, but the implementation may issue a warning.
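
The two cases can be sketched as follows
(the predicates @code{head_of/2} and @code{same_as/2} are hypothetical,
and the @samp{list} module is assumed to be imported):

@example
% Assumes: :- import_module list.

% Inferred determinism: semidet (there is no clause for []).
% The declared det is higher in the lattice than semidet,
% so this declaration is expected to be reported as an error.
:- pred head_of(list(T)::in, T::out) is det.
head_of([X | _], X).

% Inferred determinism: det.
% The declared semidet is lower in the lattice than det,
% so this is accepted, possibly with a warning.
:- pred same_as(T::in, T::out) is semidet.
same_as(X, X).
@end example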

The determinism category of each goal
is inferred according to the following rules.
These rules work with the two components of determinism categories:
whether the goal can fail without producing a solution,
and the maximum number of solutions of the goal (0, 1, or more).
If the inference process below reports that a goal can succeed more than once,
but the goal generates no outputs that are visible from outside the goal,
and the goal is not impure (@pxref{Impurity}),
then the final determinism of the goal
will be based on the goal succeeding at most once,
since the compiler will implicitly prune away any duplicate solutions.

@table @asis
@item Calls
The determinism category of a call
is the determinism declared or inferred
for the called mode of the called procedure.

@item Unifications
The determinism of a unification
is either @code{det}, @code{semidet}, or @code{failure},
depending on its mode.

A unification that assigns the value of one variable to another
is deterministic.
A unification that constructs a structure and assigns it to a variable
is also deterministic.
A unification that tests whether a variable has a given top function symbol
is semideterministic,
unless the compiler knows the top function symbol of that variable,
in which case its determinism is either @code{det} or @code{failure}
depending on whether the two function symbols are the same or not.
A unification that tests two variables for equality
is semideterministic,
unless the compiler knows that the two variables are aliases for one another,
in which case the unification is deterministic,
or unless the compiler knows that the two variables
have different function symbols in the same position,
in which case the unification has a determinism of @code{failure}.

The compiler knows the top function symbol of a variable
if the previous part of the procedure definition
contains a unification of the variable with a function symbol,
or if the variable's type has only one function symbol.

@item Conjunctions
The determinism of the empty conjunction (the goal @samp{true})
is @code{det}.
The conjunction @samp{(@var{A}, @var{B})} can fail
if either @var{A} can fail, or if @var{A} can succeed at least once,
and @var{B} can fail.
The conjunction can succeed at most zero times
if either @var{A} or @var{B} can succeed at most zero times.
The conjunction can succeed more than once
if either @var{A} or @var{B} can succeed more than once
and both @var{A} and @var{B} can succeed at least once.
(If e.g.@: @var{A} can succeed at most zero times,
then even if @var{B} can succeed many times
the maximum number of solutions of the conjunction is still zero.)
Otherwise, i.e.@: if both @var{A} and @var{B} succeed at most once,
the conjunction can succeed at most once.
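
As a sketch of these rules
(@code{small_element/2} is a hypothetical predicate,
and the @samp{int} and @samp{list} modules are assumed to be imported):

@example
% Assumes: :- import_module int, list.
:- pred small_element(list(int)::in, int::out) is nondet.
small_element(L, X) :-
    list.member(X, L),  % nondet: can fail, can succeed more than once.
    X < 10.             % semidet: can fail, succeeds at most once.
@end example

@noindent
The conjunction can fail because its first conjunct can fail,
and it can succeed more than once
because the first conjunct can succeed more than once
and both conjuncts can succeed at least once;
its determinism is therefore @code{nondet}.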

@item Switches
A disjunction is a @emph{switch}
if each disjunct has near its start
a unification that tests the same bound variable
against a different function symbol.
For example, consider the common pattern
@example
@group
(
    L = [], empty(Out)
;
    L = [H|T], nonempty(H, T, Out)
)
@end group
@end example

If @code{L} is input to the disjunction,
then the disjunction is a switch on @code{L}.

If two variables are unified with each other,
then whatever function symbol one variable is unified with,
the other variable is considered to be unified with the same function symbol.
In the following example,
since @code{K} is unified with @code{L},
the second disjunct unifies @code{L} as well as @code{K} with cons,
and thus the disjunction is recognized as a switch.
@example
@group
(
    L = [], empty(Out)
;
    K = L, K = [H|T], nonempty(H, T, Out)
)
@end group
@end example

A switch can fail
if the various arms of the switch do not cover
all the function symbols in the type of the switched-on variable,
or if the code in some arms of the switch can fail,
bearing in mind that in each arm of the switch,
the unification that tests the switched-on variable
against the function symbol of that arm
is considered to be deterministic.
A switch can succeed several times
if some arms of the switch can succeed several times,
possibly because there are multiple disjuncts
that test the switched-on variable against the same function symbol.
A switch can succeed at most zero times
only if all the reachable arms of the switch can succeed at most zero times.
(A switch arm is not reachable
if it unifies the switched-on variable with a function symbol
that is ruled out by that variable's initial instantiation state.)
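
As a sketch of the first rule
(the type @code{shape} and the predicates below are hypothetical),
compare a switch that covers every function symbol
with one that does not:

@example
:- type shape
    --->    circle
    ;       square
    ;       triangle.

% The arms cover all three function symbols and cannot fail: det.
:- pred name(shape::in, string::out) is det.
name(S, Name) :-
    (
        S = circle,
        Name = "circle"
    ;
        S = square,
        Name = "square"
    ;
        S = triangle,
        Name = "triangle"
    ).

% There is no arm for circle, so the switch can fail: semidet.
:- pred corners(shape::in, int::out) is semidet.
corners(S, N) :-
    (
        S = square,
        N = 4
    ;
        S = triangle,
        N = 3
    ).
@end example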

Only unifications may occur
before the test of the switched-on variable in each disjunct.
Tests of the switched-on variable
may occur within existential quantification goals.

The following example is a switch.

@example
(
    Out = 1, L = []
;
    some [H, T] (
        L = [H|T],
        nonempty(H, T, Out)
    )
)
@end example

The following example is not a switch
because the call in the first disjunct occurs
before the test of the switched-on variable.

@example
(
    empty(Out), L = []
;
    L = [H|T], nonempty(H, T, Out)
)
@end example

The unification of the switched-on variable with a function symbol
may occur inside a nested disjunction in a given disjunct,
provided that unification is preceded only by other unifications,
both inside the nested disjunction and before the nested disjunction.
The following example is a switch on @code{X},
provided @code{X} is bound beforehand.

@example
@group
(
    X = f,
    p(Out)
;
    Y = X,
    (
        Y = g,
        Intermediate = 42
    ;
        Z = Y,
        Z = h(Arg),
        q(Arg, Intermediate)
    ),
    r(Intermediate, Out)
)
@end group
@end example

@item Disjunctions
The determinism of the empty disjunction (the goal @samp{fail})
is @code{failure}.
A disjunction @samp{(@var{A} ; @var{B})} that is not a switch
can fail if both @var{A} and @var{B} can fail.
It can succeed at most zero times
if both @var{A} and @var{B} can succeed at most zero times.
It can succeed at most once
if one of @var{A} and @var{B} can succeed at most once
and the other can succeed at most zero times.
Otherwise, i.e.@: if either @var{A} or @var{B} can succeed more than once,
or if both @var{A} and @var{B} can succeed at least once,
it can succeed more than once.
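
For example, in the following sketch
(@code{value_or_zero/2} is a hypothetical predicate)
the disjunction is not a switch,
because neither disjunct tests an input variable against a function symbol:

@example
:- pred value_or_zero(int::in, int::out) is multi.
value_or_zero(X, Y) :-
    (
        Y = X   % det
    ;
        Y = 0   % det
    ).
@end example

@noindent
Neither disjunct can fail, and both can succeed at least once,
so the disjunction cannot fail and can succeed more than once;
its determinism is @code{multi}.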

@c The local determinism of a disjunction is @code{nondet} unless the
@c compiler can detect that the disjunction is actually a switch and
@c hence @dfn{index} the disjunction.
@c Precisely describing the rules for detecting switches is somewhat tricky,
@c and I won't attempt to do so, but they are
@c reasonable easy to understand in practice.
@c The compiler can index on any input variable to a disjunction
@c (not just the first head variable).  It can also index on more than
@c one variable, since after indexing on the first one, switch detection is
@c applied to all sub-disjunctions.  It can index on any functor, not
@c just the top-most one.

@item If-then-else

If the condition of an if-then-else cannot fail,
the if-then-else is equivalent
to the conjunction of the condition and the ``then'' part,
and its determinism is computed accordingly.
Otherwise, an if-then-else can fail
if either the ``then'' part or the ``else'' part can fail.
It can succeed at most zero times
if the ``else'' part can succeed at most zero times
and if at least one of the condition and the ``then'' part
can succeed at most zero times.
It can succeed more than once
if any one of the condition, the ``then'' part and the ``else'' part
can succeed more than once.
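
As a sketch
(@code{abs_val/2} is a hypothetical predicate,
and the @samp{int} module is assumed to be imported):

@example
% Assumes: :- import_module int.
:- pred abs_val(int::in, int::out) is det.
abs_val(X, Y) :-
    ( if X < 0 then     % The condition is semidet.
        Y = 0 - X       % det
    else
        Y = X           % det
    ).
@end example

@noindent
The condition can fail, so the first rule does not apply.
Neither the ``then'' part nor the ``else'' part can fail,
and none of the three parts can succeed more than once,
so the if-then-else is @code{det}.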

@item Negations

If the determinism of the negated goal is @code{erroneous},
then the determinism of the negation is @code{erroneous}.
If the determinism of the negated goal is @code{failure},
the determinism of the negation is @code{det}.
If the determinism of the negated goal is @code{det} or @code{multi},
the determinism of the negation is @code{failure}.
Otherwise, the determinism of the negation is @code{semidet}.
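
For example, in this sketch
(@code{non_empty/1} is a hypothetical predicate,
and the @samp{list} module is assumed to be imported),
the negated goal is a semideterministic test,
so the negation is also @code{semidet}:

@example
% Assumes: :- import_module list.
:- pred non_empty(list(T)::in) is semidet.
non_empty(L) :-
    not (L = []).
@end example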

@end table

@node Replacing compile-time checking with run-time checking
@section Replacing compile-time checking with run-time checking

Note that ``perfect'' determinism inference is an undecidable problem,
because it requires solving the halting problem.
(For instance, in the following example

@example
@group
:- pred p(T, T).
:- mode p(in, out) is det.

p(A, B) :-
    (
        something_complicated(A, B)
    ;
        B = A
    ).
@end group
@end example

@noindent
@samp{p/2} can have more than one solution
only if @samp{something_complicated/2} can succeed.)
Sometimes,
the rules specified by the Mercury language for determinism inference
will infer a determinism that is not as precise as you would like.
However, it is generally easy to overcome such problems.
The way to do this is to replace the compiler's static checking
with some manual run-time checking.
For example, if you know that a particular goal should never fail,
but the compiler infers that goal to be @code{semidet},
you can check at runtime that the goal does succeed,
and if it fails, call the library predicate @samp{error/1}.

@example
:- pred q(T, T).
:- mode q(in, out) is det.

q(A, B) :-
    ( if goal_that_should_never_fail(A, B0) then
        B = B0
    else
        error("goal_that_should_never_fail failed!")
    ).
@end example

@noindent
The predicate @code{error/1} has determinism @code{erroneous},
which means the compiler knows that it will neither succeed nor fail,
so the inferred determinism for the body of @code{q/2} is @code{det}.
(Checking assumptions like this is good coding style anyway.
The small amount of up-front work that Mercury requires
is paid back in reduced debugging time.)
Mercury's mode analysis knows that
computations with determinism @code{erroneous} can never succeed,
which is why it does not require the ``else'' part
to generate a value for @code{B}.
The introduction of the new variable @code{B0} is necessary
because the condition of an if-then-else is a negated context,
and can export the values it generates
only to the ``then'' part of the if-then-else,
not directly to the surrounding computation.
(If the surrounding computations had direct access
to values generated in conditions,
they might access them even if the condition failed.)
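
As a concrete instance of this pattern,
here is a minimal sketch that assumes a hypothetical function
@samp{first_elem/1} whose callers never pass an empty list.
The name and the error message are illustrative only;
@samp{error/1} is the predicate from the standard library module
@samp{require}.

@example
:- import_module list.
:- import_module require.

:- func first_elem(list(T)) = T.

first_elem(List) = Elem :-
    ( if List = [Head | _] then
        Elem = Head
    else
        % error/1 is erroneous, so this branch need not bind Elem.
        error("first_elem: empty list")
    ).
@end example

@noindent
The function can be declared with the default determinism @code{det}
even though the deconstruction in the condition is @code{semidet}.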

@node Interfacing nondeterministic code with the real world
@section Interfacing nondeterministic code with the real world

Normally, attempting to call
a @code{nondet} or @code{multi} mode of a predicate
from a predicate declared as @code{semidet} or @code{det}
will cause a determinism error.
So how can we call nondeterministic code from deterministic code?
There are several alternative possibilities.

If you just want to see if a nondeterministic goal is satisfiable or not,
without needing to know what variable bindings it produces,
then there is no problem:
determinism analysis considers @code{nondet} and @code{multi} goals
with no non-local output variables to be
@code{semidet} and @code{det} respectively.
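
A minimal sketch of this case follows.
The predicate name @samp{has_even_member/1} is hypothetical;
@samp{list.member/2} and @samp{int.even/1} are standard library predicates.
The predicate can be declared @code{semidet}
even though it uses the @code{nondet} mode of @samp{list.member/2},
because @code{X} is not visible outside the quantified goal.

@example
:- import_module int.
:- import_module list.

:- pred has_even_member(list(int)::in) is semidet.

has_even_member(List) :-
    % X is local to this goal, so the goal has no outputs and
    % the compiler treats it as semidet rather than nondet.
    some [X] (
        list.member(X, List),
        int.even(X)
    ).
@end example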

If you want to use the values of output variables,
then you need to ask yourself
which of the possibly many solutions to a goal you want.
If you want all of them, then you need to use one of the predicates
in the standard library module @samp{solutions},
such as @samp{solutions/2} itself,
which collects all of the solutions to a goal into a list ---
@pxref{Higher-order}.

If you just want one solution from a predicate and don't care which,
you should declare the relevant mode of the predicate
to have determinism @code{cc_nondet} or @code{cc_multi}
(depending on whether you are guaranteed at least one solution or not).
This tells the compiler that this mode of this predicate
may have more than one solution when viewed as a statement in logic,
but the implementation should stop after generating the first solution.
In other words, the implementation should @emph{commit} to the first solution.

The commit to the first solution means that
a piece of @code{cc_nondet} or @code{cc_multi} code
can never be asked to generate a second solution.
If, for example, a @code{cc_nondet} call is in a conjunction,
then no later goal in that conjunction (after mode reordering) may fail,
because that would ask the committed choice goal for a second solution.
The compiler enforces this rule.

In the declarative semantics,
which solution will be the first is unpredictable,
but in the operational semantics,
you @emph{can} predict which solution will be the first,
since Mercury does depth-first search
with left-to-right execution of clause bodies.
Note, however, that this order applies
not to the source code form of each clause body,
but to its form @emph{after} mode analysis has reordered it
to put the producer of each variable before all its consumers.

The @samp{committed choice nondeterminism} of a predicate
has to be propagated up the call tree,
making its callers, its callers' callers, and so on,
also @code{cc_nondet} or @code{cc_multi},
until either you get to @code{main/2} at the top of the call tree,
or you get to a location where you don't have to propagate
the committed choice context upward anymore.

While @code{main/2} is usually declared to have determinism @code{det},
it may also be declared @code{cc_multi}.
In the latter case,
while the program's declarative semantics may admit more than one solution,
the implementation will stop after the first,
so alternative solutions to @code{main/2}
(and hence also to @code{cc_nondet} or @code{cc_multi} predicates
called directly or indirectly from @samp{main/2})
are implicitly pruned away.
This is similar to the ``don't care'' nondeterminism
of committed choice logic programming languages
such as GHC (Guarded Horn Clauses).
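
The following sketch shows a complete program
whose @samp{main/2} is declared @code{cc_multi}.
The module name @samp{pick_demo} and the predicate @samp{pick_colour/1}
are hypothetical names chosen for illustration.

@example
:- module pick_demo.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is cc_multi.

:- implementation.

:- pred pick_colour(string::out) is cc_multi.

pick_colour("red").
pick_colour("green").

main(!IO) :-
    % Declaratively, either colour is a solution of pick_colour/1;
    % the implementation commits to one of them. The I/O goals
    % that follow the call cannot fail, so no second solution
    % is ever requested.
    pick_colour(Colour),
    io.write_string(Colour, !IO),
    io.nl(!IO).
@end example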

One way to stop propagating committed choice nondeterminism
is the one mentioned above: if a goal has no non-local output variables
(i.e.@: none of the goal's outputs are visible from outside the goal),
then the goal's solutions are indistinguishable from the outside,
and the implementation will only attempt to satisfy the goal once,
whether or not the goal is committed choice.
Therefore if a @code{cc_multi} goal has all its outputs ignored,
then the compiler considers it to be a @code{det} goal,
while if a @code{cc_nondet} goal has all its outputs ignored,
then the compiler considers it to be a @code{semidet} goal.

The other way to stop propagating committed choice nondeterminism is applicable
when you know that all the solutions returned will be equivalent
in all the ways that your program cares about.
For example, you might want to find the maximum element in a set
by iterating over the elements in the set.
Iterating over the elements in a set in an unspecified order is a
nondeterministic operation,
but no matter which order you iterate over them,
the maximum value in the set should be the same.

If this condition is satisfied,
i.e.@: if you know that there will only ever be at most one distinct solution
under your equality theory of the output variables,
then you can use a @samp{promise_equivalent_solutions} determinism cast.
If goal @samp{G} is a @code{cc_multi} goal
whose outputs are @code{X} and @code{Y}, then
@code{promise_equivalent_solutions [X, Y] ( G )}
promises the compiler that all solutions of @code{G} are equivalent,
so that regardless of which solution of @code{G}
the implementation happens to commit to,
the rest of the program will compute
either identical or (similarly) equivalent results.
This allows the compiler to consider
@code{promise_equivalent_solutions [X, Y] ( G )}
to have determinism @code{det}.
Likewise, the compiler will consider
@code{promise_equivalent_solutions [X, Y] ( G )}
where @code{G} is @code{cc_nondet} to have determinism @code{semidet}.
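
As a minimal sketch of such a cast,
assume a hypothetical @code{cc_multi} predicate @samp{pick_order/4}
that returns its two inputs in an unspecified order.
The sum of the two outputs does not depend on the order chosen,
so the promise below holds.
Both @samp{pick_order/4} and @samp{sum_pair/2}
are illustrative names, not library predicates.

@example
:- import_module int.

:- pred pick_order(int::in, int::in, int::out, int::out) is cc_multi.

pick_order(A, B, X, Y) :-
    (
        X = A, Y = B
    ;
        X = B, Y = A
    ).

:- func sum_pair(int, int) = int.

sum_pair(A, B) = Sum :-
    % Every solution of pick_order/4 yields the same Sum,
    % so the scope as a whole can be treated as det.
    promise_equivalent_solutions [Sum] (
        pick_order(A, B, X, Y),
        Sum = X + Y
    ).
@end example

@noindent
Only @code{Sum} is listed in the cast,
because @code{X} and @code{Y} are local to the goal inside it.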

Note that specifying a user-defined equivalence relation
as the equality predicate for user-defined types
(@pxref{User-defined equality and comparison})
means that @samp{promise_equivalent_solutions}
can be used to express more general forms of equivalence.
For example, if you define a set type which represents sets as unsorted lists,
you would want to define a user-defined equivalence relation for that type,
which could sort the lists before comparing them.
The @samp{promise_equivalent_solutions} determinism cast
could then be used for sets
even though the lists used to represent the sets
might not be in the same order in every solution.

@node Committed choice nondeterminism
@section Committed choice nondeterminism

In addition to the determinism annotations described earlier,
there are ``committed choice'' versions of @code{multi} and @code{nondet},
called @code{cc_multi} and @code{cc_nondet}.
These can be used instead of @code{multi} or @code{nondet}
if all calls to that mode of the predicate (or function)
occur in a context in which only one solution is needed.

Such single-solution contexts are determined as follows.

@itemize @bullet
@item
The body of any procedure declared @code{cc_multi} or @code{cc_nondet}
is in a single-solution context.
For example, the program entry point @samp{main/2}
may be declared @code{cc_multi},
and in that case the clauses for @code{main} are in a single-solution context.

@item
Any goal with no output variables is in a single-solution context.

@item
If a conjunction is in a single-solution context,
then the right-most conjunct is in a single-solution context,
and if the right-most conjunct cannot fail,
then the rest of the conjunction is also in a single-solution context.
(``Right-most'' here refers to the order @emph{after} mode reordering.)

@item
If an if-then-else is in a single-solution context,
then the ``then'' part and the ``else'' part are in single-solution contexts,
and if the ``then'' part cannot fail,
then the condition of the if-then-else is also in a single-solution context.

@item
For other compound goals, i.e.@: disjunctions, negations,
and (explicitly) existentially quantified goals,
if the compound goal is in a single-solution context,
then the immediate sub-goals of that compound goal
are also in single-solution contexts.

@end itemize

The compiler will check that
all calls to a committed-choice mode of a predicate (or function)
do indeed occur in a single-solution context.

You can declare two different modes of a predicate (or function)
which differ only in ``cc-ness''
(i.e.@: one being @code{multi} and the other @code{cc_multi},
or one being @code{nondet} and the other @code{cc_nondet}).
In that case,
the compiler will select the appropriate one for each call
depending on whether the call comes from a single-solution context or not.
Calls from single-solution contexts will call the committed choice version,
while calls which are not from single-solution contexts
will call the backtracking version.
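
As a sketch of such a pair of declarations,
consider a hypothetical predicate @samp{pick/2}
that returns an element of a list.
The name is illustrative only; the two modes differ only in ``cc-ness''.

@example
:- import_module list.

:- pred pick(list(T), T).
:- mode pick(in, out) is nondet.
:- mode pick(in, out) is cc_nondet.

pick([X | Xs], Elem) :-
    (
        Elem = X
    ;
        % In the cc_nondet mode this call is in a single-solution
        % context, so it uses the cc_nondet mode; in the nondet
        % mode it uses the nondet mode.
        pick(Xs, Elem)
    ).
@end example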

There are several reasons to use committed choice determinism annotations.
One reason is for efficiency:
committed choice annotations allow the compiler
to generate much more efficient code.
Another reason is for doing I/O,
which is allowed only in @code{det} or @code{cc_multi} predicates,
not in @code{multi} predicates.
Another is for dealing with types that use non-canonical representations
(@pxref{User-defined equality and comparison}).
And there are a variety of other applications.

@c XXX fix semantics for I/O + committed choice + mode inference

@c @node Assertions
@c @chapter Assertions
@c
@c Mercury supports the declaration of laws that hold for predicates and
@c functions.
@c These laws are only checked for type-correctness,
@c it is the responsibility of the programmer to ensure overall correctness.
@c The behaviour of programs with incorrect laws is undefined.
@c
@c A new law is introduced with the @samp{:- promise} declaration.
@c
@c Here are some examples of @samp{:- promise} declarations.
@c The following example declares the function @samp{+} to be commutative.
@c
@c @example
@c :- promise all [A, B, R]
@c     (
@c         R = A + B
@c     <=>
@c         R = B + A
@c     ).
@c @end example
@c
@c Note that each variable in the declaration was explicitly quantified.
@c The current Mercury compiler requires that each assertion begins with
@c an @samp{all} quantification, and that every variable is explicitly
@c quantified.
@c
@c Here is a more complicated declaration. It declares that @samp{append} is
@c associative.
@c
@c @example
@c :- promise all [A, B, C, ABC]
@c     (
@c         ( some [AB] (append(A, B, AB), append(AB, C, ABC)) )
@c     <=>
@c         ( some [BC] (append(B, C, BC), append(A, BC, ABC)) )
@c     ).
@c @end example

@node User-defined equality and comparison
@chapter User-defined equality and comparison

When defining abstract data types,
often it is convenient to use a non-canonical representation ---
that is, one for which a single abstract value
may have more than one different possible concrete representation.
For example, you may wish to implement an abstract type @samp{set}
by representing a set as an (unsorted) list.

@example
:- module set_as_unsorted_list.
:- interface.
:- type set(T).

:- implementation.
:- import_module list.
:- type set(T)
    --->    set(list(T)).
@end example

@noindent
In this example,
the concrete representations @samp{set([1,2])} and @samp{set([2,1])}
would both represent the same abstract value,
namely the set containing the elements 1 and 2.

For types such as this, which do not have a canonical representation,
the standard definition of equality is not the desired one;
we want equality on sets to mean equality of the abstract values,
not equality of their representations.
To support such types, Mercury allows programmers to specify
a user-defined equality predicate for user-defined types
(not including subtypes):

@example
:- type set(T)
    --->    set(list(T))
            where equality is set_equals.
@end example

@noindent
Here @samp{set_equals} is the name of a user-defined predicate
that is used for equality on the type @samp{set(T)}.
It could for example be defined in terms of a @samp{subset} predicate.

@example
:- pred set_equals(set(T)::in, set(T)::in) is semidet.
set_equals(S1, S2) :-
    subset(S1, S2),
    subset(S2, S1).
@end example

A comparison predicate can also be supplied.

@example
:- type set(T)
    --->    set(list(T))
            where equality is set_equals, comparison is set_compare.

:- pred set_compare(builtin.comparison_result::uo,
    set(T)::in, set(T)::in) is det.

set_compare(Result, Set1, Set2) :-
    promise_equivalent_solutions [Result] (
        set_compare_2(Set1, Set2, Result)
    ).

:- pred set_compare_2(set(T)::in, set(T)::in,
    builtin.comparison_result::uo) is cc_multi.

set_compare_2(set(List1), set(List2), Result) :-
    builtin.compare(Result, list.sort(List1), list.sort(List2)).
@end example

If a comparison predicate is supplied
and the unification predicate is omitted,
a unification predicate is generated by the compiler
in terms of the comparison predicate.
For the @samp{set} example, the generated predicate would be:

@example
set_equals(S1, S2) :-
    set_compare((=), S1, S2).
@end example

If a unification predicate is supplied without a comparison predicate,
the compiler will generate a comparison predicate
which throws an exception of type @samp{exception.software_error} when called.

A type declaration for a type @samp{foo(T1, @dots{}, TN)}
may contain a @samp{where equality is @var{equalitypred}} specification
only if it declares a discriminated union type or a foreign type
(@pxref{Using foreign types from Mercury})
and the following conditions are satisfied:

@itemize @bullet
@item
@var{equalitypred} must be the name of a predicate with signature
@example
:- pred @var{equalitypred}(foo(T1, @dots{}, TN)::in,
                foo(T1, @dots{}, TN)::in) is semidet.
@end example

It is legal for the type, mode and determinism to be more permissive:
the type or the mode's initial insts may be more general
(e.g.@: the type of the equality predicate
could be just the polymorphic type @samp{pred(T, T)})
and the mode's final insts or the determinism may be more specific
(e.g.@: the determinism of the equality predicate
could be any of @code{det}, @code{failure} or @code{erroneous}).

@item
If the type is a discriminated union
then its definition cannot be a single zero-arity constructor.

@item
The equality predicate must be ``pure'' (@pxref{Impurity}).

@item
The equality predicate must be defined in the same module as the type.

@item
If the type is exported, the equality predicate must also be exported.

@item
@var{equalitypred} should be an equivalence relation;
that is, it must be symmetric, reflexive, and transitive.
However, the compiler is not required to check this
@footnote{If @var{equalitypred} is not an equivalence relation,
then the program is inconsistent:
its declarative semantics contains a contradiction,
because the additional axioms for the user-defined equality
contradict the standard equality axioms.
That implies that the implementation
may compute any answer at all (@pxref{Formal semantics}),
i.e.@: the behaviour of the program is undefined.}.

@end itemize

Types with user-defined equality can only be used in limited ways.
Because there are multiple representations for the same abstract value,
any attempt to examine the representation of such a value
is a conceptually non-deterministic operation.
In Mercury this is modelled using committed choice nondeterminism.

The semantics of specifying @samp{where equality is @var{equalitypred}}
on the type declaration for a type @var{T} are as follows:

@itemize @bullet
@item
If the program contains any deconstruction unification or switch
on a variable of type @var{T} that could fail,
other than unifications with mode @samp{(in, in)},
then it is a compile-time error.

@item
If the program contains any deconstruction unification or switch
on a variable of type @var{T} that cannot fail,
then that operation has determinism @code{cc_multi}.

@item
Any attempts to examine the representation of a variable of type @var{T}
using facilities of the standard library
(e.g.@: @samp{argument/3} and @samp{functor/3} in @samp{deconstruct})
that do not have determinism @code{cc_multi} or @code{cc_nondet}
will result in a run-time error.

@item
In addition to the usual equality axioms,
the declarative semantics of the program will contain the axiom
@samp{@var{X} = @var{Y} <=> @var{equalitypred}(@var{X}, @var{Y})}
for all @var{X} and @var{Y} of type @var{T}.

@item
Any @samp{(in, in)} unifications for type @var{T}
are computed using the specified predicate @var{equalitypred}.

@end itemize

A type declaration for a type @samp{foo(T1, @dots{}, TN)}
may contain a @samp{where comparison is @var{comparepred}} specification
only if it declares a discriminated union type or a foreign type
(@pxref{Using foreign types from Mercury}) and the
following conditions are satisfied:

@itemize @bullet
@item
@var{comparepred} must be the name of a predicate with signature
@example
:- pred @var{comparepred}(builtin.comparison_result::uo,
                foo(T1, @dots{}, TN)::in, foo(T1, @dots{}, TN)::in) is det.
@end example

As with equality predicates,
it is legal for the type, mode and determinism to be more permissive.

@item
If the type is a discriminated union
then its definition cannot be a single zero-arity constructor.

@item
The comparison predicate must also be ``pure'' (@pxref{Impurity}).

@item
The comparison predicate must be defined in the same module as the type.

@item
If the type is exported, the comparison predicate must also be exported.

@item
The relation
@example
compare_eq(X, Y) :- @var{comparepred}((=), X, Y).
@end example
must be an equivalence relation;
that is, it must be symmetric, reflexive, and transitive.
The compiler is not required to check this.

@item
The relations
@example
compare_leq(X, Y) :- @var{comparepred}(R, X, Y), (R = (=) ; R = (<)).
compare_geq(X, Y) :- @var{comparepred}(R, X, Y), (R = (=) ; R = (>)).
@end example
must be total order relations:
that is, they must be antisymmetric, reflexive and transitive.
The compiler is not required to check this.

@end itemize

For each type for which the declaration has a
@samp{where comparison is @var{comparepred}} specification,
any calls to the standard library predicate @samp{builtin.compare/3}
with arguments of that type are evaluated
as if they were calls to @var{comparepred}.

A type declaration may contain a
@samp{where equality is @var{equalitypred}, comparison is @var{comparepred}}
specification only if in addition to the conditions above,
@samp{all [X, Y] (@var{comparepred}((=), X, Y) <=> @var{equalitypred}(X, Y))}.
The compiler is not required to check this.

@node Higher-order
@chapter Higher-order programming

Mercury supports higher-order functions and predicates
with currying, closures, and lambda expressions.
(To be pedantic, it would be more accurate to say that
Mercury supports higher-order procedures.
This is because in Mercury, when you construct a higher-order term,
you only get one mode of a predicate or function.
If you want multiple modes, you must pass multiple higher-order procedures.)

@menu
* Creating higher-order terms::
* Calling higher-order terms::
* Comparing higher-order terms::
* Higher-order insts and modes::
@end menu

@node Creating higher-order terms
@section Creating higher-order terms
@c NB. This section is pointed to by an error message in compiler/typecheck.m,
@c so if you change the section name, you need to update that error message.

To create a higher-order predicate or function term,
you can use a lambda expression,
or, if the predicate or function has only one mode
and it is not a zero-arity function,
you can just use its name.
For example, if you have declared a predicate

@example
:- pred sum(list(int), int).
:- mode sum(in, out) is det.
@end example

@noindent
the following unifications have the same effect:

@example
X = (pred(List::in, Sum::out) is det :- sum(List, Sum))
Y = sum
@end example

In the above example,
the type of @code{X} and @code{Y} is @samp{pred(list(int), int)},
which means a predicate of two arguments
of types @code{list(int)} and @code{int} respectively.

Similarly, given

@example
:- func scalar_product(int, list(int)) = list(int).
:- mode scalar_product(in, in) = out is det.
@end example

@noindent
the following three unifications have the same effect:

@example
@group
X = (func(Num, List) = NewList :- NewList = scalar_product(Num, List))
Y = (func(Num::in, List::in) = (NewList::out) is det
        :- NewList = scalar_product(Num, List))
Z = scalar_product
@end group
@end example

In the above example,
the type of @code{X}, @code{Y}, and @code{Z} is
@samp{func(int, list(int)) = list(int)},
which means a function of two arguments,
whose types are @code{int} and @code{list(int)},
with a return type of @code{list(int)}.
As with @samp{:- func} declarations,
if the modes and determinism of the function
are omitted in a higher-order function term,
then the modes default to @code{in} for the arguments,
@code{out} for the function result,
and the determinism defaults to @code{det}.

The Melbourne Mercury implementation currently requires
that you use an explicit lambda expression to specify which mode you want,
if the predicate or function has more than one mode
(but see below for an exception to this rule).

You can also create higher-order function terms
of non-zero arity and higher-order predicate terms by ``currying'',
i.e.@: specifying the first few arguments to a predicate or function,
but leaving the remaining arguments unspecified.
For example, the unification

@example
Sum123 = sum([1,2,3])
@end example

@noindent
binds @code{Sum123} to a higher-order predicate term of type @samp{pred(int)}.
Similarly, the unification

@example
Double = scalar_product(2)
@end example

@noindent
binds @code{Double} to a higher-order function term of type
@samp{func(list(int)) = list(int)}.

As a special case, currying of a multi-moded predicate or function is allowed
provided that the mode of the predicate or function
can be determined from the insts of the higher-order curried arguments.
For example, @samp{P = list.foldl(io.write)} is allowed
because the inst of @samp{io.write} matches
exactly one mode of @samp{list.foldl}.

For higher-order predicate expressions that thread an accumulator pair,
Mercury provides syntax that allows you to use DCG notation
in the goal of the expression.
For example,

@example
Pred =
    ( pred(Strings::in, Num::out, di, uo) is det -->
        io.write_string("The strings are: "),
        @{ list.length(Strings, Num) @},
        io.write_strings(Strings),
        io.nl
    )
@end example

@noindent
is equivalent to

@example
Pred =
    ( pred(Strings::in, Num::out, IO0::di, IO::uo) is det :-
        io.write_string("The strings are: ", IO0, IO1),
        list.length(Strings, Num),
        io.write_strings(Strings, IO1, IO2),
        io.nl(IO2, IO)
    )
@end example

Higher-order function terms of zero arity
can only be created using an explicit lambda expression;
you have to use e.g.@: @samp{(func) = foo} rather than plain @samp{foo},
because the latter denotes the result of evaluating the function,
rather than the function itself.

Note that when constructing a higher-order term,
you cannot just use the name of a builtin language construct
such as @samp{=}, @samp{\=}, @samp{call}, or @samp{apply},
and nor can such constructs be curried.
Instead, you must either use an explicit lambda expression,
or you must write a forwarding predicate or function.
For example, instead of

@example
list.filter(\=(2), [1, 2, 3], List)
@end example

@noindent
you must write either

@example
list.filter((pred(X::in) is semidet :- X \= 2), [1, 2, 3], List)
@end example

@noindent
or

@example
list.filter(not_equal(2), [1, 2, 3], List)
@end example

@noindent
where you have defined @samp{not_equal} using

@example
:- pred not_equal(T::in, T::in) is semidet.
not_equal(X, Y) :- X \= Y.
@end example

Another case where this arises is when you want to curry a higher-order term.
Suppose, for example, that you have
a higher-order predicate term @samp{OldPred}
of type @samp{pred(int, char, float)},
and you want to construct a new higher-order predicate term
@samp{NewPred} of type @samp{pred(char, float)} from @samp{OldPred}
by supplying a value for just the first argument.
The solution is the same:
use an explicit lambda expression or a forwarding predicate.
In either case, the body of the lambda expression or the forwarding predicate
must contain a higher-order call with all the arguments supplied.
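
A minimal sketch of the lambda expression approach follows.
The predicate name @samp{fix_first/3} is hypothetical,
and the modes and determinism of @samp{OldPred}
(@samp{pred(in, in, out) is det}) are assumptions made for the example.

@example
:- pred fix_first(pred(int, char, float), int, pred(char, float)).
:- mode fix_first(in(pred(in, in, out) is det), in,
    out(pred(in, out) is det)) is det.

fix_first(OldPred, N, NewPred) :-
    % The body of the lambda expression is a higher-order call
    % with all three arguments of OldPred supplied.
    NewPred = (pred(C::in, F::out) is det :- call(OldPred, N, C, F)).
@end example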

@node Calling higher-order terms
@section Calling higher-order terms

Once you have created a higher-order predicate term
(sometimes known as a closure),
the next thing you want to do is to call it.
For predicates, you use the builtin goal @samp{call/N}:

@table @asis
@item @code{call(Closure)}
@itemx @code{call(Closure1, Arg1)}
@itemx @code{call(Closure2, Arg1, Arg2)}
@itemx @dots{}
A higher-order predicate call.  @samp{call(Closure)} just calls the
specified higher-order predicate term.  The other forms append the
specified arguments onto the argument list of the closure before
calling it.
@end table

For example, the goal

@example
call(Sum123, Result)
@end example

@noindent
would bind @code{Result} to the sum of @samp{[1, 2, 3]}, i.e.@: to 6.

For functions, you use the builtin expression @samp{apply/N}:

@table @asis
@item @code{apply(Closure)}
@itemx @code{apply(Closure1, Arg1)}
@itemx @code{apply(Closure2, Arg1, Arg2)}
@itemx @dots{}
A higher-order function application.
Such a term denotes the result of
invoking the specified higher-order function term with the specified arguments.
@end table

For example, given the definition of @samp{Double} above, the goal

@example
List = apply(Double, [1, 2, 3])
@end example

@noindent
would be equivalent to

@example
List = scalar_product(2, [1, 2, 3])
@end example

@noindent
and so for a suitable implementation of the function @samp{scalar_product/2}
this would bind @code{List} to @samp{[2, 4, 6]}.
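
The examples above can be put together in one small program.
This is only a sketch:
the module name @samp{curry_demo}
and the definitions given for @samp{sum/2} and @samp{scalar_product/2}
are assumptions, since this manual declares them but does not define them.

@example
:- module curry_demo.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.

:- pred sum(list(int), int).
:- mode sum(in, out) is det.

sum([], 0).
sum([X | Xs], X + Sum0) :-
    sum(Xs, Sum0).

:- func scalar_product(int, list(int)) = list(int).

scalar_product(Num, List) = list.map((func(X) = Num * X), List).

main(!IO) :-
    Sum123 = sum([1, 2, 3]),          % curried predicate term
    call(Sum123, Result),             % binds Result to 6
    Double = scalar_product(2),       % curried function term
    List = apply(Double, [1, 2, 3]),  % binds List to [2, 4, 6]
    io.write_int(Result, !IO),
    io.nl(!IO),
    io.write(List, !IO),
    io.nl(!IO).
@end example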

One useful higher-order predicate in the Mercury standard library
is @samp{solutions/2}, which has the following declaration:

@example
:- pred solutions(pred(T), list(T)).
:- mode solutions(in(pred(out) is nondet), out) is det.
@end example

The term which you pass to @samp{solutions/2} is a higher-order predicate term.
You can pass the name of a one-argument predicate,
or you can pass a several-argument predicate
with all but one of the arguments supplied (a closure).
The declarative semantics of @samp{solutions/2} can be defined as follows:

@example
solutions(Pred, List) is true if and only if
    all [X] (call(Pred, X) <=> list.member(X, List))
    and List is sorted without any duplicates.
@end example

@noindent
where @samp{call(Pred, X)}
invokes the higher-order predicate term @code{Pred} with argument @code{X},
and where @samp{list.member/2}
is the standard library predicate for list membership.
In other words, @samp{solutions(Pred, List)} finds all the values of @code{X}
for which @samp{call(Pred, X)} is true,
collects these solutions in a list,
sorts the list, and returns that list as its result.
Here is an example: the standard library defines a predicate
@samp{list.perm(List0, List)}

@example
:- pred list.perm(list(T), list(T)).
:- mode list.perm(in, out) is nondet.
@end example

@noindent
which succeeds if and only if @code{List} is a permutation of @code{List0}.
Hence the following call to @samp{solutions/2}

@example
solutions(list.perm([3,1,2]), L)
@end example

@noindent
should return
all the possible permutations of the list @samp{[3,1,2]} in sorted order:

@example
L = [[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]].
@end example

See also @samp{unsorted_solutions/2} and @samp{solutions_set/2},
which are defined in the standard library module @samp{solutions}
and documented in the Mercury Library Reference Manual.

@node Comparing higher-order terms
@section Comparing higher-order terms

In Mercury, it is an error
to attempt to unify or to compare two higher-order terms.
This is because the question of whether two higher-order terms are equivalent
is undecidable in the general case.

Note that the compiler will catch only
@emph{direct} attempts at unifications or comparisons of higher-order terms.
Indirect attempts,
using for example polymorphic predicates
such as @samp{list.append([], [P], [Q])},
will result in an error at run-time rather than at compile-time.

@node Higher-order insts and modes
@section Higher-order insts and modes

In Mercury, the mode and determinism
of a higher-order predicate or function term
are generally part of that term's @emph{inst}, not its @emph{type}.
This allows a single higher-order predicate to work
on argument predicates that differ in their modes and in their determinisms,
which is particularly useful for library predicates
such as @samp{list.map} and @samp{list.foldl}.

Consider @samp{list.foldl}, one of the standard fold predicates on lists.
The types of its arguments are given by this predicate declaration:
@example
:- pred foldl(pred(L, A, A), list(L), A, A).
@end example

The first argument is a higher order value (a predicate in this case),
whose argument types are the types bound by the caller
to the type variables @var{L}, @var{A} and @var{A} respectively,
where @var{L} is the type of the elements in the list in the second argument,
and @var{A} is the type of the accumulator
whose initial and final values are the third and fourth arguments.
The job of the predicate passed in the first argument
is to combine each element in the list
with the current value of the accumulator
to generate the next value of the accumulator,
so in most calls to @samp{list.foldl},
the argument modes and the determinism of that predicate will be
@example
pred(in, in, out) is det
@end example

@noindent
This is a @emph{higher order inst}:
an inst describing the modes of the arguments and the determinism
of a higher order value, which in this case is a predicate.
A @emph{higher order mode} is a mode in which
the initial and/or final inst is a higher order inst.
These @emph{can} be written like this:

@example
pred(in, in, out) is det >> pred(in, in, out) is det
@end example

@noindent
for a predicate being passed to another predicate or function,
and like this

@example
free >> pred(in, in, out) is det
@end example

@noindent
for a predicate being returned from another predicate or function.
In practice, it is far more convenient
to use the builtin mode constructors @samp{in/1} and @samp{out/1} to write

@example
in(pred(in, in, out) is det)
out(pred(in, in, out) is det)
@end example

@noindent
which are each equivalent to the corresponding example just above.

You can use higher order insts and modes in mode declarations like this:

@example
:- mode foldl(in(pred(in, in, out) is det), in, in, out) is det.
@end example

@noindent
The @samp{in()} wrapper around the higher order inst here
declares the first argument of @samp{list.foldl} to be an input,
but the higher order inst inside the wrapper goes further,
by specifying the argument modes and the determinism
of the predicate that @samp{list.foldl} takes @emph{in this mode}.

That last qualification is important,
because @samp{list.foldl} has several modes,
each differing in the argument modes, in the determinism, or both.
The set of mode declarations for @samp{list.foldl} includes

@c This list is a good argument for (possibly constrained) mode variables
@c and determinism variables, which would allow all of the modes above
@c to be declared with just
@c :- mode foldl(pred(in, InMode, OutMode) is Det, in, InMode, OutMode) is Det.
@example
:- mode foldl(in(pred(in, in, out) is det), in, in, out) is det.
:- mode foldl(in(pred(in, di, uo) is det), in, di, uo) is det.
:- mode foldl(in(pred(in, in, out) is semidet), in, in, out) is semidet.
:- mode foldl(in(pred(in, in, out) is cc_multi), in, in, out) is cc_multi.
:- mode foldl(in(pred(in, di, uo) is cc_multi), in, di, uo) is cc_multi.
@end example

@c These are the modes (as of 25 July 2023) that we are not showing.
@c :- mode foldl(in(pred(in, di, uo) is semidet), in, di, uo) is semidet.
@c :- mode foldl(in(pred(in, mdi, muo) is det), in, mdi, muo) is det.
@c :- mode foldl(in(pred(in, mdi, muo) is semidet), in, mdi, muo) is semidet.
@c :- mode foldl(in(pred(in, mdi, muo) is nondet), in, mdi, muo) is nondet.
@c :- mode foldl(in(pred(in, in, out) is multi), in, in, out) is multi.
@c :- mode foldl(in(pred(in, in, out) is nondet), in, in, out) is nondet.

@noindent
That means you can pass predicates
with several different argument mode/determinism combinations
as the first argument.
In the case of @samp{list.foldl},
the modes of the accumulator arguments
and the determinism of the whole predicate will follow
the argument modes and the determinism of the higher-order argument,
but one can create predicates (and functions) where this would not be true.
@c An example of that may be nice, but putting it here
@c would probably be more confusing than helpful.
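
To make this concrete,
the following sketch passes two hypothetical predicates,
@samp{add} and @samp{add_if_positive}, to @samp{list.foldl}.
Only @samp{list.foldl} comes from the standard library;
all the other names are assumptions made for this example.

@example
:- import_module int.
:- import_module list.

% add and add_if_positive are hypothetical predicates.
:- pred add(int::in, int::in, int::out) is det.
add(X, Acc0, Acc) :-
    Acc = Acc0 + X.

:- pred add_if_positive(int::in, int::in, int::out) is semidet.
add_if_positive(X, Acc0, Acc) :-
    X > 0,
    Acc = Acc0 + X.

% Uses the mode of foldl whose first argument is det.
:- pred sum(list(int)::in, int::out) is det.
sum(Xs, Sum) :-
    list.foldl(add, Xs, 0, Sum).

% Uses the mode of foldl whose first argument is semidet.
:- pred sum_all_positive(list(int)::in, int::out) is semidet.
sum_all_positive(Xs, Sum) :-
    list.foldl(add_if_positive, Xs, 0, Sum).
@end example

@noindent
In each case, the determinism of the caller
follows the determinism of the predicate being passed,
so @samp{sum_all_positive} fails
if any element of the list is not positive.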

You can give names to such insts and modes
using declarations such as

@example
:- inst std_fold_pred == (pred(in, in, out) is det).
:- mode std_fold_pred_in == in(pred(in, in, out) is det).
@end example

@noindent
Note that the parentheses around the right hand side of the inst declaration
are required, due to the relative precedences
of the @samp{==} and @samp{is} operators.

Given these definitions, the declarations

@example
:- mode foldl(in(std_fold_pred), in, in, out) is det.
:- mode foldl(std_fold_pred_in, in, in, out) is det.
@end example

@noindent
are both equivalent to

@example
:- mode foldl(in(pred(in, in, out) is det), in, in, out) is det.
@end example

@noindent
but they may be more convenient to write,
especially in the presence of more arguments.
@footnote{If all instances of a given type are expected to use
a single inst or a single mode,
there is a convention where programmers will give that inst or mode
the same name as the type.
This works because types, insts and modes belong to separate namespaces,
so the names do not conflict.
Nonetheless, there is usually no need to define a named mode.
It is clearer to write a mode using @samp{in()} or @samp{out()}
around a named inst.}
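
Named insts can be used in the same way in your own declarations.
The following is a minimal sketch;
the predicate @samp{apply_twice} is a hypothetical name
introduced only for this example.

@example
:- inst std_fold_pred == (pred(in, in, out) is det).

% apply_twice is a hypothetical predicate.
:- pred apply_twice(pred(T, A, A), T, A, A).
:- mode apply_twice(in(std_fold_pred), in, in, out) is det.

apply_twice(P, X, !Acc) :-
    % The named inst tells the compiler the argument modes
    % and the determinism of P, so P can be called here.
    P(X, !Acc),
    P(X, !Acc).
@end example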

The general form of higher order insts follows one of two patterns,
one for predicates, and one for functions.

The pattern for predicates is

@example
(pred) is @var{Determinism}
pred(@var{Mode}) is @var{Determinism}
pred(@var{Mode1}, @var{Mode2}) is @var{Determinism}
@dots{}
@end example

while the pattern for functions is

@example
(func) = @var{Mode} is @var{Determinism}
func(@var{Mode1}) = @var{Mode} is @var{Determinism}
func(@var{Mode1}, @var{Mode2}) = @var{Mode} is @var{Determinism}
@dots{}
@end example

In the case of zero-argument predicates and functions,
the parentheses around @samp{pred} and @samp{func}
are required to tell the compiler that these words
are not being used as operators in this case.
And, as explained above, one will usually need parentheses
around any instances of these patterns in Mercury code.

@c ZZZ Note that nontrivial examples for functions
@c are much harder to find than for predicates, because
@c - any functions with the default arg modes and determinism need no decl
@c - we don't want to encourage people to write functions which are not det
@c - we don't want to encourage people to write functions whose args
@c   have uniqueness requirements
@c This leaves functions whose arg modes restrict the allowed function symbols
@c (which would be better expressed using subtypes) and functions whose arg
@c modes include higher order insts/modes, which would be too complex to
@c be useful as an example to novices.

As a convenience,
the language allows you to write a higher order @emph{mode}
using the same syntax as a higher order @emph{inst}.
If @var{HOInst} has the form of a higher order inst,
then writing @var{HOInst} where a mode is required
is the same as writing @samp{in(@var{HOInst})},
which is in turn equivalent to @samp{@var{HOInst} >> @var{HOInst}}.
Therefore,
you can omit @samp{in()} around the higher order inst of an input argument.
For example,

@example
:- mode foldl(in(pred(in, in, out) is det), in, in, out) is det.
@end example

@noindent
can also be written as

@example
:- mode foldl(pred(in, in, out) is det, in, in, out) is det.
@end example

@noindent
though the former may be easier to understand.
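
Since the shorthand always stands for @samp{in()},
an argument that @emph{returns} a higher order value
still needs an explicit @samp{out()} wrapper
(or the equivalent @samp{free >> @dots{}} form).
The following is a minimal sketch of such a predicate;
@samp{make_adder} and @samp{add_three_to} are hypothetical names
used only for illustration.

@example
:- import_module int.

% make_adder and add_three_to are hypothetical predicates.
:- pred make_adder(int, pred(int, int)).
:- mode make_adder(in, out(pred(in, out) is det)) is det.

make_adder(N, Adder) :-
    Adder = (pred(X::in, Y::out) is det :- Y = X + N).

:- pred add_three_to(int::in, int::out) is det.
add_three_to(X, Y) :-
    make_adder(3, AddThree),
    % The out() mode gives AddThree its higher order inst,
    % so it can be called here.
    AddThree(X, Y).
@end example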

As usual,
if a predicate or function has only one mode,
the @samp{pred} or @samp{func} declaration
can be combined with the @samp{mode} declaration.
Consider the declaration of a function
that computes the intersection of two maps from keys to values:

@example
:- func intersect(func(V, V) = V, map(K, V), map(K, V)) = map(K, V).
@end example

@noindent
One could declare the mode of this function
either using a separate mode declaration, like this:

@example
:- mode intersect(in(func(in, in) = out is det), in, in) = (out) is det.
@end example

@noindent
or by combining the mode declaration with the function declaration, like this:

@example
:- func intersect((func(V, V) = V)::in(func(in, in) = out is det),
    map(K, V)::in, map(K, V)::in) = (map(K, V)::out) is det.
@end example

@noindent
In both cases, just as for the predicate examples above,
the @samp{in()} wrapper around the higher order inst is optional,
though in the combined declaration the parentheses must stay.

Mercury also provides builtin @samp{inst} values for use with solver types.
These follow the patterns

@example
any_pred is @var{Determinism}
any_pred(@var{Mode}) is @var{Determinism}
any_pred(@var{Mode1}, @var{Mode2}) is @var{Determinism}
@dots{}
any_func = @var{Mode} is @var{Determinism}
any_func(@var{Mode1}) = @var{Mode} is @var{Determinism}
any_func(@var{Mode1}, @var{Mode2}) = @var{Mode} is @var{Determinism}
@dots{}
@end example

See @ref{Solver types} for more details.

@c ZZZ What follows is the remains of the original text of the initial part
@c of section 8. It should be deleted when the discussion of the commit
@c that created this commit on 24/25 july 2023 has finished.
@c
@c The language contains builtin @samp{inst} values
@c
@c @example
@c (pred) is @var{Determinism}
@c pred(@var{Mode}) is @var{Determinism}
@c pred(@var{Mode1}, @var{Mode2}) is @var{Determinism}
@c @dots{}
@c (func) = @var{Mode} is @var{Determinism}
@c func(@var{Mode1}) = @var{Mode} is @var{Determinism}
@c func(@var{Mode1}, @var{Mode2}) = @var{Mode} is @var{Determinism}
@c @dots{}
@c @end example
@c
@c These insts represent the instantiation state
@c of variables bound to higher-order predicate and function terms
@c with the appropriate mode and determinism.
@c For example, @samp{pred(out) is det} represents the instantiation state
@c of being bound to a higher-order predicate term which is @code{det}
@c and accepts one output argument;
@c the term @samp{sum([1,2,3])} from the example above
@c is one such higher-order predicate term
@c which matches this instantiation state.
@c
@c As a convenience, the language also contains
@c builtin @samp{mode} values of the same name
@c (and they are what we have been using in the examples up to now).
@c These modes map from the corresponding @samp{inst} to itself.
@c It is as if they were defined by
@c
@c @example
@c :- mode (pred is @var{Determinism}) == in(pred is @var{Determinism}).
@c :- mode (pred(@var{Inst}) is @var{Determinism}) ==
@c     in(pred(@var{Inst}) is @var{Determinism}).
@c @dots{}
@c @end example
@c
@c @noindent
@c using the parametric mode @samp{in/1} mentioned in @ref{Modes}
@c which specified its argument inst
@c as both the initial and the final instantiation state in the mode.
@c
@c @c @node Builtin higher-order insts and modes
@c @c @subsection Builtin higher-order insts and modes
@c
@c If you want to define a predicate
@c which returns a higher-order predicate term,
@c you would use a mode such as @samp{free >> pred(@dots{}) is @dots{}},
@c or @samp{out(pred(@dots{}) is @dots{} )}.
@c @c XXX The space after the dots{} above works around a bug in texi2html.
@c For example:
@c
@c @example
@c :- pred foo(pred(int)).
@c :- mode foo(free >> (pred(out) is det)) is det.
@c
@c foo(sum([1,2,3])).
@c @end example
@c
@c In the above mode declaration, the current Mercury implementation
@c requires parentheses around the higher-order inst.
@c (This is because the operator @samp{>>} binds more tightly than the operator
@c @samp{is}.)
@c
@c For example, given the definition of @samp{foo} above, the goal
@c
@c @example
@c foo((pred(X::out) is det :- X = 6))
@c @end example
@c
@c @noindent
@c is illegal.
@c If you really want to compare higher-order predicates for equivalence,
@c you must program it yourself;
@c for example, the above goal could legally be written as
@c
@c @example
@c P = (pred(X::out) is det :- X = 6),
@c foo(Q),
@c all [X] (call(P, X) <=> call(Q, X)).
@c @end example

@c Delete first menu item, move menu to just before old second item's heading
@c * Builtin higher-order insts and modes::
@menu
* Default insts for functions::
* Combined higher-order types and insts::
@end menu

@node Default insts for functions
@subsection Default insts for functions

In order to call a higher-order term,
the compiler must know its higher-order inst.
This can cause problems when
higher-order terms are placed into a polymorphic collection type
and then extracted,
since the declared mode for the extraction will typically be @code{out}
and the higher-order inst information will be lost.
To partially alleviate this problem,
and to make higher-order functional programming easier,
if the term to be called has a function type,
but no higher-order inst information is explicitly provided,
we assume that it has the default higher-order function inst
@samp{func(in, @dots{}, in) = out is det}.

As a consequence of this,
a higher-order function term can @emph{only} be passed
where a term with no higher-order inst information is expected
if it can be passed
where a term with the default higher-order function inst is expected.
Higher-order predicate terms can always be passed to such a place,
but there is little value in doing so,
because there is no default higher-order inst for predicates,
and therefore it will not be possible to call those terms.
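
The following sketch shows the default function inst at work.
The function @samp{apply_all} is a hypothetical name
used only for this example.

@example
:- import_module int.
:- import_module list.

% apply_all is a hypothetical function.
:- func apply_all(list(func(int) = int), int) = int.

apply_all([], X) = X.
apply_all([F | Fs], X) = apply_all(Fs, F(X)).
@end example

@noindent
The list argument has the default mode @samp{in},
so @samp{F} carries no explicit higher-order inst information
when it is extracted from the list.
The call @samp{F(X)} is nevertheless accepted,
because @samp{F} has a function type
and is therefore assumed to have the inst @samp{func(in) = out is det}.
For instance,
@samp{apply_all([(func(N) = N + 1), (func(N) = N * 2)], 3)}
evaluates to 8.
An analogous predicate taking a @samp{list(pred(int, int))}
would be rejected at the corresponding call,
since there is no default inst for predicates.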

@node Combined higher-order types and insts
@subsection Combined higher-order types and insts

A higher-order type may optionally specify an inst in the following manner:

@example
(pred) is @var{Determinism}
pred(@var{Type}::@var{Mode}) is @var{Determinism}
pred(@var{Type1}::@var{Mode1}, @var{Type2}::@var{Mode2}) is @var{Determinism}
@dots{}
(func) = (@var{Type}::@var{Mode}) is @var{Determinism}
func(@var{Type1}::@var{Mode1}) = (@var{Type}::@var{Mode}) is @var{Determinism}
func(@var{Type1}::@var{Mode1}, @var{Type2}::@var{Mode2}) = (@var{Type}::@var{Mode}) is @var{Determinism}
@dots{}
@end example

When used as argument types of functors in type declarations,
types of this form have two effects.
First, for any unification that constructs a term using such an argument,
there is an additional mode constraint that
the argument must be approximated by the inst.
In other words, to be mode correct, a program must not construct any term
where a functor has an argument that does not have the declared inst,
if one is present.

The second effect is that when a unification deconstructs a ground term
to extract an argument with such a declared inst,
the extracted argument may then be used as if it had that inst.

For example, given this type declaration:

@example
:- type job
    --->    job(pred(int::out, io::di, io::uo) is det).
@end example

the following predicate is mode correct:

@example
:- pred run(job::in, io::di, io::uo) is det.
run(Job, !IO) :-
    Job = job(Pred),
    Pred(Result, !IO),          % Pred has the necessary inst
    write_line(Result, !IO).
@end example

However, the following would be a mode error:

@example
:- pred bad(job::out) is det.
bad(job(p)).                    % Error: p does not have required mode

:- pred p(int::in, io::di, io::uo) is det.
@dots{}
@end example

Because this is a new feature,
combined higher-order types and insts are currently permitted
only as direct arguments of functors in discriminated unions.
The following examples therefore currently result in errors.

@example
% Error: use on the RHS of equivalence types.
:- type p == (pred(io::di, io::uo) is det).
:- type f == (func(int::in) = (int::out) is semidet).

% Error: use inside a type constructor.
:- type jobs
    --->    jobs(list(pred(int::out, io::di, io::uo) is det)).

% Error: use in a pred/func declaration.
:- pred p((pred(io::di, io::uo) is det)::in, io::di, io::uo) is det.
:- func f(func(int::in) = (int::out) is semidet, int) = int.
@end example

Future versions of the language may allow these forms.
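
One way to stay within this restriction, sketched below,
is to wrap each higher-order value in a functor
and to put the wrapped values into the collection.
The @samp{job} type is the one declared earlier in this section;
the predicate @samp{run_all} is a hypothetical name
used only for illustration.

@example
:- import_module io.
:- import_module list.

:- type job
    --->    job(pred(int::out, io::di, io::uo) is det).

% run_all is a hypothetical predicate.
% A list(job) is permitted, because the combined type and inst
% appears only as a direct argument of the functor job/1.
:- pred run_all(list(job)::in, io::di, io::uo) is det.

run_all([], !IO).
run_all([job(Pred) | Jobs], !IO) :-
    Pred(Result, !IO),          % Pred has the necessary inst
    io.write_line(Result, !IO),
    run_all(Jobs, !IO).
@end example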

@node Modules
@chapter Modules

@menu
* The module system::
* An example module::
* Submodules::
* Module initialisation::
* Module finalisation::
* Module-local mutable variables::
@end menu

@node The module system
@section The module system

The Mercury module system is relatively simple and straightforward.

Each module must start with a @samp{:- module @var{ModuleName}} declaration,
specifying the name of the module.

An @samp{:- interface.} declaration indicates
the start of the module's interface section:
this section specifies the entities that are exported by this module.
Mercury provides support for abstract data types,
by allowing the definition of a type to be kept hidden,
with the interface only exporting the type name.
The interface section may contain definitions of types,
type classes, data constructors, instantiation states, and modes,
and declarations for abstract data types, abstract type class instances,
functions, predicates, and (sub-)modules.
The interface section may not contain definitions
for functions or predicates (i.e.@: clauses),
or definitions of (sub-)modules.

An @samp{:- implementation.} declaration indicates
the start of the module's implementation section.
Any entities declared in this section are local to the module
(and its submodules, if any) and cannot be used by other modules.
The implementation section must contain definitions
for all abstract data types, abstract instance declarations,
functions, predicates, and submodules exported by the module,
as well as for all local types, type class instances, functions,
predicates, and submodules.
The implementation section can be omitted if it is empty.

The module may optionally end
with a @samp{:- end_module @var{ModuleName}} declaration;
the name specified in the @samp{end_module} must be the same
as that in the corresponding @samp{module} declaration.

@c should we mention multipart interfaces and implementations?
@c ===> no

In order to constrain which entity a name refers to,
functor terms representing
predicates, functions, expressions, constructor fields,
types, modes, insts, type classes, instances and sub-modules
can be explicitly module qualified using the @samp{.} operator,
e.g.@: @samp{module.name} or @samp{module.name(Args)}.
Operator terms
(that is, functor terms written using operator syntax)
may also be module qualified,
though this requires putting parentheses
around the operator and its arguments,
e.g.@: @samp{module.(A + B)}.
The module name used in a module qualified term
may itself be module qualified if it is a sub-module,
e.g.@: @samp{module.submodule.name(Args)}.
A functor is @dfn{fully qualified}
if every name that is not a builtin or top-level module
is module qualified.

Currently we also support @samp{__} as an alternative module qualifier,
so you can write @code{module__name} instead of @code{module.name}.

The principal functor of a module qualified term
combines the name and qualifiers with the arity.
For example,
the principal functor of @samp{foo.bar.baz(Xs)} is @samp{foo.bar.baz/1},
and the principal functor of @samp{quux.(A + B)} is @samp{quux.'+'/2}.

If a module wishes to make use of entities exported by other modules,
then it must explicitly import those modules
using one or more @samp{:- import_module @var{Modules}}
or @w{@samp{:- use_module @var{Modules}}} declarations,
in order to make those declarations visible.
In both cases, @var{Modules}
is a comma-separated list of fully qualified module names.
These declarations may occur
either in the interface or the implementation section.
If the imported entities are used in the interface section,
then the corresponding @code{import_module} or @code{use_module}
declaration must also be in the interface section.
If the imported entities are only used in the implementation section,
the @code{import_module} or @code{use_module} declaration
should be in the implementation section.

You may need to module qualify a name if that name has
several applicable definitions,
and the context of its occurrence does not resolve this ambiguity.
Module qualifiers are also useful for readability.
Uses of entities imported using @code{use_module} declarations
@emph{must} be fully qualified.
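
As a minimal sketch of this rule,
the following fragment imports the standard @samp{list} module
with @code{use_module}.
The function @samp{count} is a hypothetical name used only for illustration.

@example
:- use_module list.

% count is a hypothetical function.
% Both the type list.list/1 and the function list.length/1
% come from a module imported with use_module,
% so both must be module qualified.
:- func count(list.list(T)) = int.

count(Xs) = list.length(Xs).
@end example

@noindent
Had @samp{list} been imported with @code{import_module} instead,
the qualifiers would have been optional.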

Certain optimizations require information or source code
for predicates defined in other modules to be as effective as possible.
At the moment, inlining and higher-order specialization
are the only optimizations
that the Mercury compiler can perform across module boundaries.

Exactly one module of the program
must export a predicate @samp{main/2},
which must be declared as either

@example
:- pred main(io.state::di, io.state::uo) is det.
@end example

@noindent
or

@example
:- pred main(io.state::di, io.state::uo) is cc_multi.
@end example

@noindent
(or any declaration equivalent to one of the two above).
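
As an illustration, here is a minimal sketch of a complete program
consisting of a single module.
The module name @samp{hello} is an assumption made for this example.

@example
:- module hello.
:- interface.

% The io module is imported in the interface section,
% because the type io (short for io.state) is used
% in the declaration of main/2.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.

main(!IO) :-
    io.write_string("Hello, world!\n", !IO).
@end example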

Mercury has a standard library which includes over 100 modules,
including modules for
lists, stacks, queues, priority queues, sets, bags (multi-sets),
maps (dictionaries), random number generation, input/output,
and filename and directory handling.
See the Mercury Library Reference Manual for a list of the available modules,
and for the documentation of each module.

@node An example module
@section An example module

For illustrative purposes,
here is the definition of a simple module for managing queues:

@example
:- module queue.
:- interface.

% Declare an abstract data type.

:- type queue(T).

% Declare some predicates which operate on the abstract data type.

:- pred empty_queue(queue(T)).
:- mode empty_queue(out) is det.
:- mode empty_queue(in) is semidet.

:- pred put(queue(T), T, queue(T)).
:- mode put(in, in, out) is det.

:- pred get(queue(T), T, queue(T)).
:- mode get(in, out, out) is semidet.

:- implementation.

% Queues are implemented as lists. We need the `list' module
% for the declaration of the type list(T), with its constructors
% '[]'/0 and '[|]'/2, and for the declaration of the predicate
% list.append/3.

:- import_module list.

% Define the queue ADT.

:- type queue(T) == list(T).

% Define the exported predicates.

empty_queue([]).

put(Queue0, Elem, Queue) :-
    list.append(Queue0, [Elem], Queue).

get([Elem | Queue], Elem, Queue).

:- end_module queue.
@end example
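
A client of this module might look like the following sketch.
The module name @samp{queue_client} is hypothetical,
and the sketch assumes that the @samp{queue} module defined above
(rather than the standard library module of the same name)
is the one that the compiler finds.

@example
:- module queue_client.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module queue.

main(!IO) :-
    queue.empty_queue(Q0),
    queue.put(Q0, 1, Q1),
    queue.put(Q1, 2, Q2),
    % get/3 is semidet, so it must be called in a context
    % that can handle failure.
    ( if queue.get(Q2, First, _Rest) then
        io.write_int(First, !IO),
        io.nl(!IO)
    else
        io.write_string("the queue is empty\n", !IO)
    ).
@end example

@noindent
Since the definition of @samp{queue(T)} is hidden,
the client can manipulate queues
only through the exported predicates.
With the implementation given above,
this program should print 1, the first element that was put into the queue.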

@node Submodules
@section Submodules

As mentioned above, modules may contain submodules.
There are two kinds of submodules,
called nested submodules and separate submodules;
the difference is that nested submodules
are defined in the same source file as the containing module,
whereas separate submodules are defined in separate source files.
Implementations should support separate compilation of separate submodules.

A module may not contain more than one submodule with the same name.

@menu
* Nested submodules::
* Separate submodules::
* Visibility rules::
* Implementation bugs and limitations::
@end menu

@node Nested submodules
@subsection Nested submodules

Nested submodules within a module are delimited by
matching @samp{:- module} and @samp{:- end_module} declarations.
(Note that @samp{:- end_module} declarations for nested submodules
are mandatory, not optional,
even if the nested submodule is the last thing in the source file.
The module name in a @samp{:- module} or @samp{:- end_module}
declaration for a nested submodule need not be fully qualified.)
The sequence of items thus delimited is known as a submodule item sequence.

The interface and implementation parts of a nested submodule
may be specified in two different submodule declarations.
If a submodule item sequence includes an interface section,
then it is a declaration of that submodule;
if it includes an implementation section,
then it is a definition of that submodule;
and if it includes both, then it is both declaration and definition.

It is an error to declare a submodule twice, or to define it twice.
It is an error to define a submodule without declaring it.
As mentioned earlier, it is an error
to define a submodule in the interface section of its parent module.
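
The following sketch shows a nested submodule
whose declaration and definition are given
in two separate submodule item sequences.
The names @samp{outer}, @samp{inner} and @samp{answer}
are hypothetical, chosen only for illustration.

@example
:- module outer.
:- interface.

    % The declaration of the submodule: its interface section.
    :- module inner.
    :- interface.
    :- func answer = int.
    :- end_module inner.

:- implementation.

    % The definition of the submodule: its implementation section.
    :- module inner.
    :- implementation.
    answer = 42.
    :- end_module inner.

:- end_module outer.
@end example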

If a submodule is declared but not explicitly defined,
then there is an implicit definition
with an empty implementation section for that submodule.
This empty implementation section will result in an error
if the interface section of a submodule contains any of the following:

@itemize @bullet
@item
a declaration for a function or a predicate;
@item
an abstract declaration for a type, inst, mode or typeclass,
i.e.@: a declaration that does not itself serve as a definition
of that type, inst, mode or typeclass;
@item
an abstract declaration of a typeclass instance; or
@item
a (doubly, triply, etc.@:) nested submodule
(which perforce has only an interface section, and no implementation section)
and which contains any of the above.
@end itemize

@node Separate submodules
@subsection Separate submodules

Separate submodules are declared using
@samp{:- include_module @var{Modules}} declarations.
Each @samp{:- include_module} declaration specifies
a comma-separated list of submodules.

@example
:- include_module @var{Module1}, @var{Module2}, @dots{}, @var{ModuleN}.
@end example

The module names need not be fully qualified.

Each of the named submodules in an @samp{:- include_module} declaration
must be defined in a separate source file.
The mapping between module names and source file names
is implementation-defined.
The Melbourne Mercury implementation requires that
@itemize
@item @emph{either} every module be in a file
whose name is the fully qualified module name followed by @samp{.m}
(so a module named e.g.@: @samp{foo.bar.baz}
must be in a file named @file{foo.bar.baz.m}),
@item @emph{or} the programmer tell the implementation
which files contain which modules,
using a command such as @samp{mmc -f *.m}.
(Alternatively, you could replace the @samp{*.m} in that command
with a list of the file names of all the Mercury modules in the program.)
@end itemize

The source file of a separate submodule must contain
the declaration (interface) and definition (implementation) of the submodule.
It must start with a @samp{:- module} declaration
containing the fully qualified module name,
followed by the interface and (if necessary) implementation sections,
and it may optionally end with a @samp{:- end_module} declaration.
(The module name in the @samp{:- end_module} declaration need not be
fully qualified.)
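
As a sketch of the default file naming scheme described above,
a parent module @samp{x} with one separate submodule @samp{y}
could be laid out in two source files as follows.
The module names and the function @samp{answer} are hypothetical.

@example
% Contents of the file x.m:

:- module x.
:- interface.
:- include_module y.

% Contents of the file x.y.m:

:- module x.y.
:- interface.
:- func answer = int.
:- implementation.
answer = 42.
@end example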

The semantics of separate submodules
are identical to those of nested submodules.
The procedure to transform a separate submodule
into a nested submodule is as follows:

@enumerate
@item
Replace the @samp{:- include_module @var{submodule}} declaration
with the interface section of the submodule
enclosed within @samp{:- module @var{submodule}}
and @samp{:- end_module @var{submodule}} declarations.
@item
Place the implementation section of the submodule
enclosed within @samp{:- module @var{submodule}}
and @samp{:- end_module @var{submodule}} declarations
in the implementation section of the parent module.
@end enumerate

For example,

@example
:- module x.
:- interface.
:- include_module y.
:- end_module x.
@end example

@noindent
is equivalent to

@example
:- module x.
:- interface.
    :- module y.
    % interface section of module @samp{y}
    :- end_module y.
:- implementation.
    :- module y.
    % implementation section of module @samp{y}
    :- end_module y.
:- end_module x.
@end example

@node Visibility rules
@subsection Visibility rules

Any declarations in the parent module,
including those in the parent module's implementation section,
are visible in the parent's submodules,
including indirect submodules (i.e.@: sub-submodules, etc.).
Similarly, declarations in the interfaces of any modules
imported using an @samp{:- import_module} or a @samp{:- use_module}
in the parent module
are visible in the parent's submodules, including indirect submodules.

Declarations in a child module are not visible in the parent module,
or in ``sibling'' modules (other children of the same parent),
or in other unrelated modules
unless the child is explicitly imported using
an @samp{:- import_module} or @samp{:- use_module} declaration.
It is an error to import a module without importing all of its parent modules.

Note that a submodule for which
the @samp{:- module} or @samp{:- include_module} declaration
occurs only in the implementation section of the parent module
may only be imported or used by its parent module
or by submodules of its parent module.

As mentioned previously,
all @samp{:- import_module} and @samp{:- use_module} declarations
must use fully qualified module names.
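
The following sketch illustrates these rules.
It assumes a hypothetical module @samp{x}
with a submodule @samp{x.y}, declared in the interface section of @samp{x},
that exports a function @samp{answer/0} returning an @samp{int};
the module name @samp{client} is hypothetical as well.

@example
:- module client.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.

% Importing x.y requires importing its parent x as well,
% and both names must be fully qualified.
:- import_module x.
:- import_module x.y.

main(!IO) :-
    io.write_int(x.y.answer, !IO),
    io.nl(!IO).
@end example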

@node Implementation bugs and limitations
@subsection Implementation bugs and limitations

The current implementation of submodules has a couple of minor limitations.

@itemize @bullet
@item
The compiler sometimes reports spurious errors
if you define an equivalence type in a submodule
and export it as an abstract type.
@item
Using @samp{mmake} to do parallel makes (e.g.@: @samp{mmake --jobs 2})
does not always work correctly if you are using nested submodules.
(The work-around is to use separate submodules instead of nested submodules,
i.e.@: to put the submodules in separate source files.)
@end itemize

@node Module initialisation
@section Module initialisation

Modules that interact with foreign libraries or services
may require special initialisation before use.
Such modules may include any number of @samp{initialise} directives
in their implementation sections.
An @samp{initialise} directive has the following form:

@example
:- initialise @var{initpredname}/@var{arity}.
@end example

where the predicate @var{initpredname} must be declared
with one of the following signatures:

@example
:- pred @var{initpredname}(io::di, io::uo) is @var{Det}.
:- impure pred @var{initpredname} is @var{Det}.
@end example

@var{Det} must be either @code{det} or @code{cc_multi}.

The effect of the @samp{initialise} declaration
is to ensure that @samp{@var{initpredname}/@var{arity}} is invoked
before the program's @samp{main/2} predicate.
Initialisation predicates within a module are executed
in the order in which they are specified,
although no order may be assumed between different modules or submodules.
Initialisation predicates are only invoked
after any initialisation required by the Mercury standard library.

If @samp{@var{initpredname}/@var{arity}} terminates with an uncaught exception,
then the program will immediately abort execution.
In this circumstance, those predicates specified by other @samp{initialise}
directives that have not yet been executed will not be executed,
@samp{main/2} will not be executed,
and no predicate specified in a @samp{finalise} directive will be executed.

@samp{initialize} is also allowed as a synonym for @samp{initialise}.
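
As a minimal sketch,
the implementation section of a module
could set up an initialisation predicate like this.
The predicate name @samp{announce_start} is hypothetical.

@example
:- import_module io.

:- initialise announce_start/2.

% announce_start is a hypothetical predicate.
% It is invoked before main/2.
:- pred announce_start(io::di, io::uo) is det.

announce_start(!IO) :-
    io.write_string("module initialised\n", !IO).
@end example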

@node Module finalisation
@section Module finalisation

Modules that require special finalisation at program termination
may include any number of @samp{finalise} directives
in their implementation sections.

A @samp{finalise} directive has the following form:

@example
:- finalise @var{finalpredname}/@var{arity}.
@end example

where the predicate @var{finalpredname} must be declared
with one of the following signatures:

@example
:- pred @var{finalpredname}(io::di, io::uo) is @var{Det}.
:- impure pred @var{finalpredname} is @var{Det}.
@end example

@var{Det} must be either @code{det} or @code{cc_multi}.

The effect of the @samp{finalise} declaration
is to ensure that @samp{@var{finalpredname}/@var{arity}} is invoked
after the program's @samp{main/2} predicate.
Finalisation predicates within a module
are executed in the order in which they are specified,
although no order may be assumed between different modules or submodules.
Any finalisation required by the Mercury standard library
will always occur after any finalisation predicates have been invoked.

If @samp{@var{finalpredname}/@var{arity}}
terminates with an uncaught exception,
then the program will immediately abort execution.
In this circumstance, those predicates specified by other @samp{finalise}
directives that have not yet been executed will not be executed.
If the program's @samp{main/2} predicate terminates with an uncaught exception,
then no finalisation predicates will be executed.

@samp{finalize} is also allowed as a synonym for @samp{finalise}.
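
As a minimal sketch,
the implementation section of a module
could set up a finalisation predicate like this.
The predicate name @samp{announce_exit} is hypothetical.

@example
:- import_module io.

:- finalise announce_exit/2.

% announce_exit is a hypothetical predicate.
% It is invoked after main/2 returns normally.
:- pred announce_exit(io::di, io::uo) is det.

announce_exit(!IO) :-
    io.write_string("module finalised\n", !IO).
@end example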

@node Module-local mutable variables
@section Module-local mutable variables

Certain special cases require a module to have
one or more mutable (i.e.@: destructively updateable) variables,
for example to hold the constraint store for a solver type.

A mutable variable is declared using the @samp{mutable} directive:

@example
:- mutable(@var{varname}, @var{vartype}, @var{initial_value}, @var{varinst}, [@var{attribute}, @dots{}]).
@end example

This constructs a new mutable variable with access predicates
that have the following signatures:

@example
:- semipure pred get_@var{varname}(@var{vartype}::out(@var{varinst})) is det.
:- impure   pred set_@var{varname}(@var{vartype}::in(@var{varinst})) is det.
@end example

The initial value of @var{varname} is @var{initial_value},
which is set before the program's @samp{main/2} predicate is executed.
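
The following sketch shows how the generated access predicates
might be used.
The mutable @samp{hits} and the predicate @samp{record_hit}
are hypothetical names used only for illustration.

@example
:- import_module int.

% hits and record_hit are hypothetical names.
:- mutable(hits, int, 0, ground, [untrailed]).

:- impure pred record_hit is det.

record_hit :-
    % Calls to the access predicates need purity annotations,
    % because get_hits/1 is semipure and set_hits/1 is impure.
    semipure get_hits(N),
    impure set_hits(N + 1).
@end example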

The type @var{vartype} is not allowed
to contain any type variables or have any type class constraints.

The inst @var{varinst} is not allowed to contain any inst variables.
It is also not allowed to be equivalent to,
or contain components that are equivalent to,
the builtin insts @code{free}, @code{unique}, @code{mostly_unique},
@code{dead} (@code{clobbered})
or @code{mostly_dead} (@code{mostly_clobbered}).

The initial value of a mutable, @var{initial_value},
may be any Mercury expression with type @var{vartype} and inst @var{varinst}
subject to the above restrictions.
It may be impure or semipure.

The following @var{attributes} are supported:

@table @asis

@item @samp{trailed}/@samp{untrailed}
This attribute specifies whether
the implementation should generate code
to undo the effects of @samp{set_@var{varname}/1} on backtracking
(@samp{trailed}) or not (@samp{untrailed}).
The default, in case none is specified, is @samp{trailed}.

@item @samp{attach_to_io_state}
This attribute causes the compiler
to also construct access predicates that have the following signatures:

@example
:- pred get_@var{varname}(@var{vartype}::out(@var{varinst}), io::di, io::uo) is det.
:- pred set_@var{varname}(@var{vartype}::in(@var{varinst}),  io::di, io::uo) is det.
@end example

@item @samp{constant}
This attribute causes the compiler to construct
only a @samp{get} access predicate, but not a @samp{set} access predicate.
Since @var{varname} will always have the initial value given to it,
the @samp{get} access predicate is pure; its signature will be:

@example
:- pred get_@var{varname}(@var{vartype}::out(@var{varinst})) is det.
@end example

The @samp{constant} attribute cannot be specified together with
the @samp{attach_to_io_state} attribute
(since they disagree on this signature).
It also cannot be specified together with an explicit @samp{trailed} attribute.

@end table
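
As a sketch of the @samp{attach_to_io_state} attribute,
the following fragment uses the additional access predicates
to hand out consecutive integers without any impure code.
The mutable @samp{counter} and the predicate @samp{next_id}
are hypothetical names used only for illustration.

@example
:- import_module int.
:- import_module io.

% counter and next_id are hypothetical names.
:- mutable(counter, int, 0, ground, [untrailed, attach_to_io_state]).

:- pred next_id(int::out, io::di, io::uo) is det.

next_id(Id, !IO) :-
    % These are the pure access predicates of arity three,
    % which thread the I/O state through each access.
    get_counter(Id, !IO),
    set_counter(Id + 1, !IO).
@end example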

The Melbourne Mercury compiler also supports the following attributes:

@table @asis

@item @samp{foreign_name(@var{Lang}, @var{Name})}
Allow foreign code to access the mutable variable
in some implementation dependent manner.
@var{Lang} must be a valid target language for this Mercury implementation.
@var{Name} must be a valid identifier in that language.
It is an error to specify
more than one foreign name attribute for each language.

For the C backends,
this attribute allows foreign code to access the mutable variable
as an external variable called @var{Name}.
For the low-level C backend, e.g.@: the asm_fast grades,
the type of this variable will be @code{MR_Word}.
For the high-level C backend, e.g.@: the hlc grades,
the type of this variable depends upon the Mercury type of the mutable.
For mutables of a Mercury primitive type,
the corresponding C type is given
by the mapping in @ref{C data passing conventions}.
For mutables of any other type,
the corresponding C type will be @code{MR_Word}.

This attribute is not currently implemented for the non-C backends.

@item @samp{thread_local}
This attribute allows a mutable to take on different values in each thread.
When a child thread is spawned,
it inherits all the values of thread-local mutables of the parent thread.
Changing the value of a thread-local mutable
does not affect its value in any other threads.

The @samp{thread_local} attribute cannot be specified
together with either of the @samp{trailed} or @samp{constant} attributes.

@end table
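
The following sketch shows how these two attributes might be written.
The mutable names, the C identifier @code{app_verbosity}
and the predicate @code{enter_scope} are all hypothetical,
and the example assumes that the module imports @code{int}
and is compiled in a grade that targets C:

@example
% Foreign C code may refer to this mutable as `app_verbosity'.
:- mutable(verbosity, int, 1, ground,
    [untrailed, foreign_name("C", "app_verbosity")]).

% Each thread sees (and updates) its own copy of this mutable.
:- mutable(log_depth, int, 0, ground, [untrailed, thread_local]).

:- impure pred enter_scope is det.

enter_scope :-
    semipure get_log_depth(Depth),
    impure set_log_depth(Depth + 1).
@end example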

It is an error for a @samp{mutable} directive
to appear in the interface section of a module.
The usual visibility rules for submodules
apply to the mutable variable access predicates.

For the purposes of determining
when mutables are assigned their initial values,
the expression @var{initial_value} behaves
as though it were a predicate specified in an @samp{initialise} directive.

@example
:- initialise foo/2.
:- mutable(bar, int, 561, ground, [untrailed]).
:- initialise baz/2.
@end example

In the above example,

@itemize
@item @samp{foo/2} will be invoked first,
@item then @samp{bar} will be set to its initial value of 561,
@item and then @samp{baz/2} will be invoked.
@end itemize

The effect of a mutable initial value expression
terminating with an uncaught exception
is also the same as though it were
a predicate specified in an @samp{initialise} directive.

@node Type classes
@chapter Type classes

Mercury supports constrained polymorphism in the form of type classes.
Type classes allow the programmer to write predicates and functions
which operate on variables of any type (or sequence of types)
for which a certain set of operations is defined.

@menu
* Typeclass declarations::
* Instance declarations::
* Abstract typeclass declarations::
* Abstract instance declarations::
* Type class constraints on predicates and functions::
* Type class constraints on type class declarations::
* Type class constraints on instance declarations::
* Functional dependencies::
@end menu

@node Typeclass declarations
@section Typeclass declarations

A @dfn{type class} is a name for a set of types
(or a set of sequences of types)
for which certain predicates and/or functions,
called the @dfn{methods} of that type class, are defined.
A @samp{typeclass} declaration defines a new type class,
and specifies the set of predicates and/or functions
that must be defined on a type (or sequence of types)
for it (them) to be considered to be an instance of that type class.

The @code{typeclass} declaration
gives the name of the type class that it is defining,
the names of the type variables which are parameters to the type class,
and the operations (i.e.@: methods) which form the interface of the type class.
For each method, all parameters of the typeclass must be determined
by the type declaration of the method.
The values of @emph{most} parameter type variables
are determined by having them occur
in the declared type of an argument of the method.
However, if either the typeclass named in the constraint, or its superclasses,
include any functional dependencies, then
the value of a variable may also be implied by the values of other variables
(@pxref{Functional dependencies}).

For example,

@example
:- typeclass point(T) where [
        % coords(Point, X, Y):
        %       X and Y are the cartesian coordinates of Point
        pred coords(T, float, float),
        mode coords(in, out, out) is det,

        % translate(Point, X_Offset, Y_Offset) = NewPoint:
        %       NewPoint is Point translated X_Offset units in the X direction
        %       and Y_Offset units in the Y direction
        func translate(T, float, float) = T
].
@end example

@noindent
declares the type class @code{point},
which represents points in two dimensional space.

@code{pred}, @code{func} and @code{mode} declarations
are the only legal declarations inside a @code{typeclass} declaration.
The mode and determinism of type class methods
must be explicitly declared or (for functions) defaulted, not inferred.
In other words, for each predicate declared in a type class,
there must be at least one mode declaration,
and each mode declaration in a type class
must include an explicit determinism annotation.
Functions with no explicit mode declaration
get the usual default mode (@pxref{Modes}):
all arguments have mode @code{in}, the result has mode @code{out},
and the determinism is @code{det}.
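
As an illustration of these rules,
the hypothetical type class @code{shape} below
declares one predicate method with a separate mode declaration,
one predicate method with a combined type and mode declaration,
and one function method that relies on the default mode:

@example
:- typeclass shape(T) where [
        % A predicate method needs at least one mode declaration,
        % and that declaration must state the determinism.
        pred contains(T, float, float),
        mode contains(in, in, in) is semidet,

        % The type and mode declarations may also be combined.
        pred scale(float::in, T::in, T::out) is det,

        % No mode declaration: the arguments default to `in',
        % the result to `out', and the determinism to `det'.
        func area(T) = float
].
@end example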

The number of parameters to the type class (e.g.@: @code{T}) is not limited.
For example, the following is allowed:

@example
:- typeclass a(T1, T2) where [@dots{}].
@end example

The parameters must be distinct variables.
Each @code{typeclass} declaration must have at least one parameter.

It is legal for a @code{typeclass} declaration to declare no methods,
for example

@example
:- typeclass foo(T) where [].
@end example

There must not be more than one type class declaration
with the same name and arity in the same module.

@node Instance declarations
@section Instance declarations

Once the interface of the type class
has been defined in the @code{typeclass} declaration,
we can use an @code{instance} declaration
to define how a particular type (or sequence of types)
satisfies the interface declared in the @code{typeclass} declaration.

An instance declaration has the form

@example
:- instance @var{classname}(@var{typename}(@var{typevar}, @dots{}), @dots{})
        where [@var{method_definition}, @var{method_definition}, @dots{}].
@end example

An @samp{instance} declaration
gives a type for each parameter of the type class.
Each of these types must be either a type with no arguments,
or a polymorphic type whose arguments are all type variables.
@c If this restriction is ever lifted, the algorithms for encoding the names of
@c the data structures describing the instance, in base_typeclass_info.m
@c and/or rtti.m, would need to be updated as well.
For example @code{int}, @code{list(T)},
@code{bintree(K, V)} and @code{bintree(T, T)} are allowed,
but @code{T} and @code{list(int)} are not.
The types in an instance declaration must not be abstract types
which are elsewhere defined as equivalence types.
A program may not contain
more than one instance declaration for a particular type
(or sequence of types, in the case of a multi-parameter type class)
and typeclass.
These restrictions ensure that there are no overlapping instance declarations,
i.e.@: for each typeclass there is at most one instance declaration
that may be applied to any type (or sequence of types).

There is no special interaction between subtypes and the typeclass system.
A subtype is @emph{not} automatically an instance of a typeclass if
there is an @samp{instance} declaration for its supertype.

Each @var{method_definition} entry
in the @samp{where [@dots{}]} part of an @code{instance} declaration
defines the implementation of one of the class methods for this instance.
There are two ways of defining methods.

The first way to define a method is by
giving the name of the predicate or function which implements that method.
In this case, the @var{method_definition} must have one of the following forms:

@example
pred(@var{method_name}/@var{arity}) is @var{predname}
func(@var{method_name}/@var{arity}) is @var{funcname}
@end example

@noindent
The @var{predname} or @var{funcname} must name
a predicate or function of the specified arity
whose type, modes, determinism, and purity are at least as permissive
as the declared type, modes, determinism, and purity
of the class method with the specified @var{method_name} and @var{arity},
after the types of the arguments in the instance declaration
have been substituted in place of the parameters in the type class declaration.

The second way of defining methods is
by listing the clauses for the definition inside the instance declaration.
A @var{method_definition} can be a clause.
These clauses are just like the clauses
used to define ordinary predicates or functions (@pxref{Items}),
and so they can be facts, rules, or DCG rules.
The only difference is that in instance declarations,
clauses are separated by commas rather than being terminated by periods,
and so rules and DCG rules in instance declarations
must normally be enclosed in parentheses.
As with ordinary predicates,
you can have more than one clause for each method.
The clauses must satisfy
the declared type, modes, determinism and purity for the method,
after the types of the arguments in the instance declaration
have been substituted in place of the parameters in the type class declaration.

These two ways are mutually exclusive:
each method must be defined either by a single naming definition
(using the @samp{pred(@dots{}) is @var{predname}}
or @samp{func(@dots{}) is @var{funcname}} form),
or by a set of one or more clauses, but not both.

Here is an example of an instance declaration
and the different kinds of method definitions that it can contain:

@example
@group
:- typeclass foo(T) where [
    func method1(T, T) = int,
    func method2(T) = int,
    pred method3(T::in, int::out) is det,
    pred method4(T::in, io.state::di, io.state::uo) is det,
    func method5(bool, T) = T
].

:- instance foo(int) where [
    % method defined by naming the implementation
    func(method1/2) is (+),

    % method defined by a fact
    method2(X) = X + 1,

    % method defined by a rule
    (method3(X, Y) :- Y = X + 2),

    % method defined by a DCG rule
    (method4(X) --> io.print(X), io.nl),

    % method defined by multiple clauses
    method5(no, _) = 0,
    (method5(yes, X) = Y :- X + Y = 0)
].
@end group
@end example

Each @samp{instance} declaration
must define an implementation for every method
declared in the corresponding @samp{typeclass} declaration.
It is an error to define more than one implementation
for the same method within a single @samp{instance} declaration.

Any call to a method must have argument types
(and in the case of functions, return type)
which are constrained to be a member of that method's type class,
or which match one of the instance declarations
visible at the point of the call.
A method call will invoke the predicate or function
specified for that method in the instance declaration
that matches the types of the arguments to the call.

Note that even if a type class has no methods,
an explicit instance declaration is required
for a type to be considered an instance of that type class.
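
For instance, in the following sketch
(the names @code{marker} and @code{accept} are hypothetical),
only @code{int} and @code{string} are instances of the method-less class,
so only values of those types may be passed to @code{accept}:

@example
:- typeclass marker(T) where [].

:- instance marker(int) where [].
:- instance marker(string) where [].

:- pred accept(T) <= marker(T).
:- mode accept(in) is det.

accept(_).

% accept(42) and accept("abc") are type-correct calls;
% accept(3.14) is not, since there is no instance marker(float).
@end example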

Here is an example of some code using an instance declaration:

@example
:- type coordinate
    --->    coordinate(
                float,  % X coordinate
                float   % Y coordinate
            ).

:- instance point(coordinate) where [
    pred(coords/3) is coordinate_coords,
    func(translate/3) is coordinate_translate
].

:- pred coordinate_coords(coordinate, float, float).
:- mode coordinate_coords(in, out, out) is det.

coordinate_coords(coordinate(X, Y), X, Y).

:- func coordinate_translate(coordinate, float, float) = coordinate.

coordinate_translate(coordinate(X, Y), Dx, Dy) = coordinate(X + Dx, Y + Dy).
@end example

We have now made the @code{coordinate} type
an instance of the @code{point} type class.
If we introduce a new type @code{coloured_coordinate}
which represents a point in two dimensional space
with a colour associated with it,
it can also become an instance of the type class:

@example
:- type rgb
    --->    rgb(
                int,
                int,
                int
            ).

:- type coloured_coordinate
    --->    coloured_coordinate(
                float,
                float,
                rgb
            ).

:- instance point(coloured_coordinate) where [
    pred(coords/3) is coloured_coordinate_coords,
    func(translate/3) is coloured_coordinate_translate
].

:- pred coloured_coordinate_coords(coloured_coordinate, float, float).
:- mode coloured_coordinate_coords(in, out, out) is det.

coloured_coordinate_coords(coloured_coordinate(X, Y, _), X, Y).

:- func coloured_coordinate_translate(coloured_coordinate, float, float)
    = coloured_coordinate.

coloured_coordinate_translate(coloured_coordinate(X, Y, Colour), Dx, Dy)
    = coloured_coordinate(X + Dx, Y + Dy, Colour).
@end example

If we call @samp{translate/3}
with the first argument having type @samp{coloured_coordinate},
this will invoke @samp{coloured_coordinate_translate}.
Likewise, if we call @samp{translate/3}
with the first argument having type @samp{coordinate},
this will invoke @samp{coordinate_translate}.

Further instances of the type class could be made,
e.g.@: a type that represents the point using polar coordinates.

Since methods may be defined using clauses,
and the interface sections of modules may @emph{not} include clauses,
instance declarations that specify method definitions
may appear only in the implementation section of a module.
If you want to export the knowledge that a type, or a sequence of types,
is a member of a given typeclass,
then put a version of the instance declaration
that omits all method definitions
(@pxref{Abstract instance declarations})
into the interface section of the module
that contains the full instance declaration in its implementation section.

@node Abstract typeclass declarations
@section Abstract typeclass declarations

Abstract typeclass declarations
are typeclass declarations whose definitions are hidden.
An abstract typeclass declaration has the same form as a typeclass declaration,
but without the @samp{where [@dots{}]} part.
An abstract typeclass declaration defines
a name for a set of (sequences of) types,
but does not define what methods must be implemented
for instances of the type class.

Like abstract type declarations,
abstract typeclass declarations are only useful
in the interface section of a module.
Each abstract typeclass declaration must be accompanied
by a corresponding non-abstract typeclass declaration
that defines the methods for that type class.

Non-abstract instance declarations can only be made
in scopes where the non-abstract typeclass declaration is visible.

@node Abstract instance declarations
@section Abstract instance declarations

Abstract instance declarations are
instance declarations whose implementations are hidden.
An abstract instance declaration has the same form as an instance declaration,
but without the @samp{where [@dots{}]} part.
An abstract instance declaration declares that
a sequence of types is an instance of a particular type class
without defining how the type class methods are implemented for those types.
Like abstract type declarations,
abstract instance declarations are only useful
in the interface section of a module.
Each abstract instance declaration must be accompanied
in the implementation section of the same module
by a corresponding non-abstract instance declaration
that defines how the type class methods are implemented.

Here is an example:

@example
:- module hashable.
:- interface.
:- import_module int, string.

:- typeclass hashable(T) where [func hash(T) = int].
:- instance hashable(int).
:- instance hashable(string).

:- implementation.

:- instance hashable(int) where [func(hash/1) is hash_int].
:- instance hashable(string) where [func(hash/1) is hash_string].

:- func hash_int(int) = int.
hash_int(X) = X.

:- func hash_string(string) = int.
hash_string(S) = H :-
    % Use the standard library predicate string.hash/2.
    string.hash(S, H).

:- end_module hashable.
@end example

@node Type class constraints on predicates and functions
@section Type class constraints on predicates and functions

Mercury allows a type class constraint
to appear as part of a predicate or function's type signature.
This constrains the values that can be taken by type variables
in the signature to belong to particular type classes.

A type class constraint has the form:

@example
<= @var{Typeclass}(@var{Type}, @dots{}), @dots{}
@end example

@noindent
where @var{Typeclass} is the name of a type class and @var{Type} is a type.
Any variable that appears in @var{Type} must be determined
by the predicate's or function's type signature.
A variable is determined by a type signature
if it appears in the type signature,
but if functional dependencies are present,
then it may also be determined from other variables
(@pxref{Functional dependencies}).
Each type class constraint in a predicate or function declaration
must contain at least one variable.

For example

@example
:- pred distance(P1, P2, float) <= (point(P1), point(P2)).
:- mode distance(in, in, out) is det.

distance(A, B, Distance) :-
    coords(A, Xa, Ya),
    coords(B, Xb, Yb),
    XDist = Xa - Xb,
    YDist = Ya - Yb,
    Distance = sqrt(XDist*XDist + YDist*YDist).
@end example

In the above example,
the @code{distance} predicate
is able to calculate the distance between any two points,
regardless of their representation,
as long as the @code{coords} operation has been defined.
These constraints are checked at compile time.
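
As a usage sketch,
assuming the @code{coordinate} and @code{coloured_coordinate} instances
of @code{point} shown earlier in this chapter are visible
(the predicate @code{demo_distance} is hypothetical),
the two arguments may have different types,
since @code{P1} and @code{P2} are constrained independently:

@example
:- pred demo_distance(float::out) is det.

demo_distance(Distance) :-
    % P1 = coordinate, P2 = coloured_coordinate.
    A = coordinate(0.0, 0.0),
    B = coloured_coordinate(3.0, 4.0, rgb(255, 0, 0)),
    distance(A, B, Distance).
@end example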

@node Type class constraints on type class declarations
@section Type class constraints on type class declarations

Type class constraints may also appear in @code{typeclass} declarations,
meaning that one type class is a ``superclass'' of another.

The arguments of a constraint on a type class declaration
must be either type variables or ground types.
Each constraint must contain at least one variable argument
and all variables that appear in the arguments
must also be arguments to the type class in question.

For example, the following declares the @samp{ring} type class,
which describes types with a particular set of numerical operations defined:

@example
:- typeclass ring(T) where [
    func zero = (T::out) is det,               % '+' identity
    func one = (T::out) is det,                % '*' identity
    func plus(T::in, T::in) = (T::out) is det, % '+'/2 (forward mode)
    func mult(T::in, T::in) = (T::out) is det, % '*'/2 (forward mode)
    func negative(T::in) = (T::out) is det     % '-'/1 (forward mode)
].
@end example

We can now add the following declaration:

@example
@group
:- typeclass euclidean(T) <= ring(T) where [
    func div(T::in, T::in) = (T::out) is det,
    func mod(T::in, T::in) = (T::out) is det
].
@end group
@end example

This introduces a new type class, @code{euclidean},
of which @code{ring} is a superclass.
The operations defined by the @code{euclidean} type class
are @code{div}, @code{mod},
as well as all those defined by the @code{ring} type class.
Any type declared to be an instance of @code{euclidean}
must also be declared to be an instance of @code{ring}.
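
The following sketch shows one way of making @code{int}
an instance of both classes.
It assumes that the @code{int} module has been imported;
the choice of @samp{//} and @samp{rem} for the @code{euclidean} methods,
and the function @code{gcd}, are illustrative only.
Because @code{ring} is a superclass of @code{euclidean},
code constrained only by @code{euclidean(T)}
may also use the methods of @code{ring}, such as @code{zero}:

@example
:- instance ring(int) where [
    zero = 0,
    one = 1,
    plus(X, Y) = X + Y,
    mult(X, Y) = X * Y,
    negative(X) = 0 - X
].

:- instance euclidean(int) where [
    div(X, Y) = X // Y,
    mod(X, Y) = X rem Y
].

:- func gcd(T, T) = T <= euclidean(T).

gcd(A, B) =
    ( if B = zero then
        A
    else
        gcd(B, mod(A, B))
    ).
@end example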

Type class constraints on type class declarations
give rise to a superclass relation.
This relation must be acyclic.
That is, it is an error
if a type class is its own (direct or indirect) superclass.

@node Type class constraints on instance declarations
@section Type class constraints on instance declarations

Type class constraints may also be placed upon instance declarations.
The arguments of such constraints
must be either type variables or ground types.
Each constraint must contain at least one variable argument
and all variables that appear in the arguments
must be type variables that appear in the types in the instance declaration.

For example, consider the following declaration
of a type class of types that may be printed:

@example
:- typeclass portrayable(T) where [
    pred portray(T::in, io.state::di, io.state::uo) is det
].
@end example

The programmer could declare instances such as

@example
:- instance portrayable(int) where [
    pred(portray/3) is io.write_int
].

:- instance portrayable(char) where [
    pred(portray/3) is io.write_char
].
@end example

However, when it comes to writing the instance declaration
for a type such as @code{list(T)},
we want to be able to print out the list elements
using the @code{portray/3} for the particular type of the list elements.
This can be achieved by
placing a type class constraint on the @code{instance} declaration,
as in the following example:

@example
:- instance portrayable(list(T)) <= portrayable(T) where [
    pred(portray/3) is portray_list
].

:- pred portray_list(list(T), io.state, io.state) <= portrayable(T).
:- mode portray_list(in, di, uo) is det.

portray_list([], !IO).
portray_list([X | Xs], !IO) :-
    portray(X, !IO),
    io.write_char(' ', !IO),
    portray_list(Xs, !IO).
@end example

The type class constraints on an abstract instance declaration
must exactly match the type class constraints
on the corresponding non-abstract instance declaration
that defines that instance.
@c XXX The current implementation does not enforce that rule.

The abstract version of the above instance declaration would be

@example
:- instance portrayable(list(T)) <= portrayable(T).
@end example
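
As a usage sketch of the @code{portrayable} instances declared in this section
(the predicate @code{demo_portray} is hypothetical),
the constrained instance for @code{list(T)}
applies to lists of any portrayable element type,
including lists of lists:

@example
:- pred demo_portray(io::di, io::uo) is det.

demo_portray(!IO) :-
    % Uses portrayable(list(T)) with T = int.
    portray([1, 2, 3], !IO),
    io.nl(!IO),
    % Uses portrayable(list(T)) with T = list(char),
    % which in turn relies on portrayable(list(char)).
    portray([['a', 'b'], ['c']], !IO),
    io.nl(!IO).
@end example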

@node Functional dependencies
@section Functional dependencies

Type class constraints may include any number of functional dependencies.
A @dfn{functional dependency} constraint
takes the form @code{(@var{Domain} -> @var{Range})}.
The @var{Domain} and @var{Range} arguments are either single type variables,
or conjunctions of distinct type variables separated by commas.

@example
        :- typeclass @var{Typeclass}(@var{Var}, @dots{})@
<= ((@var{D} -> @var{R}), @dots{}) @dots{}

        :- typeclass @var{Typeclass}(@var{Var}, @dots{})@
<= (@var{D1}, @var{D2}, @dots{} -> @var{R1}, @var{R2}, @dots{}), @dots{}
@end example

Each type variable must appear in the parameter list of the typeclass.
Abstract typeclass declarations must have
exactly the same functional dependencies as their concrete forms.

Mutually recursive functional dependencies are allowed,
so the following examples are legal:

@example
        :- typeclass foo(A, B) <= ((A -> B), (B -> A)).
        :- typeclass bar(A, B, C, D)@
<= ((A, B -> C), (B, C -> D), (D -> A, C)).
@end example

A functional dependency on a typeclass places an additional requirement
on the set of instances which are allowed for that type class.
The requirement is that
all types bound to variables in the range of the functional dependency
must be able to be uniquely determined
by the types bound to variables in the domain of the functional dependency.
If more than one functional dependency is present,
then the requirement for each one must be satisfied.

For example, given the typeclass declaration

@example
:- typeclass baz(A, B) <= (A -> B) where @dots{}
@end example

@noindent
it would be illegal to have both of the instances

@example
:- instance baz(int, int) where @dots{}
:- instance baz(int, string) where @dots{}
@end example

@noindent
although either one would be acceptable on its own.

The following instance would also be illegal

@example
:- instance baz(string, list(T)) where @dots{}
@end example

@noindent
since the variable @code{T} may not always be bound to the same type.
However, the instance

@example
:- instance baz(list(S), list(T)) <= baz(S, T) where @dots{}
@end example

@noindent
is legal because
the @samp{baz(S, T)} constraint ensures that
whatever @code{T} is bound to,
it is always uniquely determined from the binding of @code{S}.

The extra requirements that result from the use of functional dependencies
allow the bindings of some variables
to be determined from the bindings of others.
This in turn relaxes some of the requirements
of typeclass constraints on predicate and function signatures,
and on existentially typed data constructors.

Without any functional dependencies, all variables in constraints
must appear in the signature of the predicate or function being declared.
However, variables which are in the range of a functional dependency
need not appear in the signature,
since it is known that their bindings will be determined
from the bindings of the variables in the domain.
@c XXX What about a class with two fundeps: A -> B, and B -> A.
@c Both A and B are in the range of a fundep.
@c The above text seems to imply that it is ok for *neither* to appear
@c in the signature.
@c Should we replace "since it is known that their bindings will be determined"
@c with "provided that their bindings are determined"?

More formally, the constraints on a predicate or function signature
@emph{induce} a set of functional dependencies
on the variables appearing in those constraints.
A functional dependency @samp{(A1, @dots{} -> B1, @dots{})}
is induced from a constraint
@samp{@var{Typeclass}(@var{Type1}, @dots{})}
if and only if the typeclass @samp{@var{Typeclass}}
has a functional dependency @samp{(D1, @dots{} -> R1, @dots{})}
such that every type variable appearing in the argument type
that corresponds to one of the domain parameters @samp{Di}
is one of the @samp{Aj},
and each @samp{Bj} appears in the argument type
that corresponds to one of the range parameters @samp{Rk}.

For example, with the definition of @code{baz} above,
the constraint @code{baz(map(X, Y), list(Z))}
induces the functional dependency @code{(X, Y -> Z)},
since @var{X} and @var{Y} appear in the domain argument,
and @var{Z} appears in the range argument.

The set of type variables determined from a signature
is the @emph{closure} of the set appearing in the signature
under the functional dependencies induced from the constraints.
The closure is defined as the smallest set of variables
which includes all of the variables appearing in the signature,
and is such that, for each induced functional dependency
@samp{@var{Domain} -> @var{Range}},
if the closure includes all of the variables in @var{Domain}
then it includes all of the variables in @var{Range}.

For example, the declaration

@example
:- pred p(X, Y) <= baz(map(X, Y), list(Z)).
@end example

@noindent
is acceptable since the closure of @{@var{X},
@var{Y}@} under the induced functional dependency must include @var{Z}.
Moreover, the typeclass @code{baz/2} would be allowed
to have a method that only uses the first parameter, @var{A},
since the second parameter, @var{B}, would always be determined from the first.

Note that, since all instances must satisfy the superclass constraints,
the restrictions on instances obviously transfer from superclass to subclass.
Again, this allows the requirements of typeclass constraints to be relaxed.
Thus, the functional dependencies on the ancestors of constraints
also induce functional dependencies on the variables,
and the closure that we calculate takes these into account.

For example, in this code

@example
:- typeclass quux(P, Q, R) <= baz(R, P) where @dots{}

:- pred q(Q, R) <= quux(P, Q, R).
@end example

@noindent
the signature of @code{q/2} is acceptable
since the superclass constraint on @code{quux/3}
induces the dependency @samp{R -> P} on the type variables,
hence @var{P} is in the closure of @{@var{Q}, @var{R}@}.

The presence of functional dependencies
also allows ``improvement'' to occur during type inference.
This can occur in two ways.
First, if two constraints of a given class match
on all of the domain arguments of a functional dependency on that class,
then it can be inferred that they also match on the range arguments.
For example,
given the constraints @w{@code{baz(A, B1)}} and @w{@code{baz(A, B2)}},
it will be inferred that @code{B1 = B2}.

Second, if a constraint of a given class
is subsumed by a known instance of that class in the domain arguments,
then its range arguments can be unified
with the corresponding instance range arguments.
For example, given the instance:

@example
:- instance baz(list(T), string) where @dots{}
@end example

@noindent
then the constraint @code{baz(list(int), X)}
can be improved with the inference that @w{@code{X = string}}.
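
The following sketch puts these rules together.
The class @code{collection} and the names
@code{is_empty} and @code{abc_count} are hypothetical,
and the example assumes that the modules
@code{char}, @code{int}, @code{list} and @code{string} have been imported:

@example
:- typeclass collection(C, E) <= (C -> E) where [
    func elements(C) = list(E),
    % This method mentions only C; E is determined by C.
    func size(C) = int
].

:- instance collection(string, char) where [
    elements(S) = string.to_char_list(S),
    size(S) = string.length(S)
].

% E does not appear in the signature,
% but it is in the closure of @{C@} under the dependency (C -> E).
:- pred is_empty(C) <= collection(C, E).
:- mode is_empty(in) is semidet.

is_empty(Coll) :-
    size(Coll) = 0.

% The call to elements/1 gives rise to a constraint
% collection(string, E0); the instance above improves this to E0 = char.
:- func abc_count = int.

abc_count = list.length(elements("abc")).
@end example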

@node Existential types
@chapter Existential types

Existentially quantified type variables
(or simply ``existential types'' for short)
are useful tools for data abstraction.
In combination with type classes,
they allow you to write code in an ``object oriented'' style
that is similar to the use of interfaces in Java
or abstract base classes in C++.

Mercury supports existential type quantifiers
on predicate and function declarations, and in data type definitions.
You can put type class constraints on existentially quantified type variables.

@menu
* Existentially typed predicates and functions::
* Existential class constraints::
* Existentially typed data types::
* Some idioms using existentially quantified types::
@end menu

@node Existentially typed predicates and functions
@section Existentially typed predicates and functions

@menu
* Syntax for explicit type quantifiers::
* Semantics of type quantifiers::
* Examples of correct code using type quantifiers::
* Examples of incorrect code using type quantifiers::
@end menu

@node Syntax for explicit type quantifiers
@subsection Syntax for explicit type quantifiers

Type variables in type declarations for polymorphic predicates or functions
are normally universally quantified.
However, it is also possible to existentially quantify such type variables,
by using an explicit existential quantifier of the form @samp{some @var{Vars}}
before the @samp{pred} or @samp{func} declaration,
where @var{Vars} is a list of variables.

For example:

@example
% Here the type variable `T' is existentially quantified.
:- some [T] pred foo(T).

% Here the type variables `T1' and `T2' are existentially quantified.
:- some [T1, T2] func bar(int, list(T1), set(T2)) = pair(T1, T2).

% Here the type variable `T2' is existentially quantified,
% but the type variables `T1' and `T3' are universally quantified.
:- some [T2] pred foo(T1, T2, T3).
@end example

Explicit universal quantifiers, of the form @samp{all @var{Vars}},
are also permitted on @samp{pred} and @samp{func} declarations,
although they are not necessary, since universal quantification is the default.
(If both universal and existential quantifiers are present,
the universal quantifiers must precede the existential quantifiers.)
For example:

@example
% Here the type variable `T2' is existentially quantified,
% but the type variables `T1' and `T3' are universally quantified.
:- all [T3] some [T2] pred foo(T1, T2, T3).
@end example

@node Semantics of type quantifiers
@subsection Semantics of type quantifiers

If a type variable in the type declaration
for a polymorphic predicate or function is universally quantified,
this means the caller will determine the value of the type variable,
and the callee must be defined so that it will work
for @emph{all} types which are an instance of its declared type.

For an existentially quantified type variable,
the situation is the converse:
the @emph{callee} must determine the value of the type variable,
and all @emph{callers} must be defined so as to work
for all types which are an instance of the called procedure's declared type.

When type checking a predicate or function,
if a variable has a type that occurs
as a universally quantified type variable
in the predicate or function declaration,
or a type that occurs as an existentially quantified type variable
in the declaration of one of the predicates or functions that it calls,
then its type is treated as an opaque type.
This means that there are very few things
which it is legal to do with such a variable ---
basically you can only pass it to another procedure expecting the same type,
unify it with another value of the same type,
put it in a polymorphic data structure,
or pass it to a polymorphic procedure
whose argument type is universally quantified.
(Note, however, that the standard library includes some quite powerful
procedures such as @samp{io.write} which can be useful in this context.)
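
As an illustrative sketch of the restrictions above
(the names @samp{mystery} and @samp{print_mystery} are hypothetical,
and the @samp{io} module is assumed to be imported),
a caller of an existentially typed predicate
can still pass the opaque value to a polymorphic procedure:

@example
:- some [T] pred mystery(T::out) is det.
mystery(42).

:- pred print_mystery(io::di, io::uo) is det.
print_mystery(!IO) :-
    mystery(X),             % the type of X is opaque here
    io.write(X, !IO),       % ok: the argument type of io.write
                            % is universally quantified
    io.nl(!IO).
@end example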

A non-variable type (i.e.@: a type that is not a type variable)
is considered @emph{more general}
than an existentially quantified type variable.
Type inference will therefore never infer
an existentially quantified type for a predicate or function
unless that predicate or function calls (directly or indirectly)
a predicate or function which was explicitly declared
to have an existentially quantified type.

Note that an existentially typed procedure
is not allowed to have different types
for its existentially typed arguments in different clauses
(even mode-specific clauses)
or in different subgoals of a single clause;
however, the same effect can be achieved in other ways
(@pxref{Some idioms using existentially quantified types}).

For procedures involving calls to existentially-typed predicates or functions,
the compiler's mode analysis must take account
of the modes for type variables in all polymorphic calls.
Universally quantified type variables have mode @code{in},
whereas existentially quantified type variables have mode @code{out}.
As usual, the compiler's mode analysis
will attempt to reorder the elements of conjunctions
in order to satisfy the modes.

@node Examples of correct code using type quantifiers
@subsection Examples of correct code using type quantifiers

Here are some examples of type-correct code
using universal and existential types.

@example
/* simple examples */

:- pred foo(T).
foo(_).
        % ok

:- pred call_foo.
call_foo :- foo(42).
        % ok (T = int)

:- some [T] pred e_foo(T).
e_foo(X) :- X = 42.
        % ok (T = int)

:- pred call_e_foo.
call_e_foo :- e_foo(_).
        % ok

/* examples using higher-order functions */

:- func bar(T, T, func(T) = int) = int.
bar(X, Y, F) = F(X) + F(Y).
        % ok

:- func call_bar = int.
call_bar = bar(2, 3, (func(X) = X*X)).
        % ok (T = int)
        % returns 13 (= 2*2 + 3*3)

:- some [T] pred e_bar(T, T, func(T) = int).
:-          mode e_bar(out, out, out(func(in) = out is det)).
e_bar(2, 3, (func(X) = X * X)).
        % ok (T = int)

:- func call_e_bar = int.
call_e_bar = F(X) + F(Y) :- e_bar(X, Y, F).
        % ok
        % returns 13 (= 2*2 + 3*3)

@end example

@node Examples of incorrect code using type quantifiers
@subsection Examples of incorrect code using type quantifiers

Here are some examples of code using universal and existential types
that contains type errors.

@example
/* simple examples */

:- pred bad_foo(T).
bad_foo(42).
        % type error

:- some [T] pred e_foo(T).
e_foo(42).
        % ok

:- pred bad_call_e_foo.
bad_call_e_foo :- e_foo(42).
        % type error

:- some [T] pred e_bar1(T).
e_bar1(42).
e_bar1(42).
e_bar1(43).
        % ok (T = int)

:- some [T] pred bad_e_bar2(T).
bad_e_bar2(42).
bad_e_bar2("blah").
        % type error (cannot unify types `int' and `string')

:- some [T] pred bad_e_bar3(T).
bad_e_bar3(X) :- e_foo(X).
bad_e_bar3(X) :- e_foo(X).
        % type error (attempt to bind type variable `T' twice)

@end example

@node Existential class constraints
@section Existential class constraints

Existentially quantified type variables
are especially useful in combination with type class constraints.

Type class constraints can be either universal or existential.
Universal type class constraints are written using @samp{<=},
as described in @ref{Type class constraints on predicates and functions};
they signify a constraint that the @emph{caller} must satisfy.
Existential type class constraints are written in the same syntax
as universal constraints, but using @samp{=>} instead of @samp{<=};
they signify a constraint that the @emph{callee} must satisfy.
If a declaration has both universal and existential constraints,
then the existential constraints must precede the universal constraints.

For example:

@example
% Here `c1(T2)' and `c2(T2)' are existential constraints,
% and `c3(T1)' is a universal constraint.
:- all [T1] some [T2] ((pred p(T1, T2) => (c1(T2), c2(T2))) <= c3(T1)).
@end example

Existential constraints must only constrain type variables
that are explicitly existentially quantified.
Likewise, universal constraints must only constrain type variables
that are universally quantified,
although in this case the quantification does not have to be explicit
because universal quantification is the default
(see @ref{Syntax for explicit type quantifiers}).
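
The following sketch shows how an existential constraint
shifts the obligation from the caller to the callee.
The type class @samp{showable}, its @samp{int} instance,
and the names @samp{pick} and @samp{describe}
are assumptions made for illustration only;
the @samp{string} module is assumed to be imported.

@example
:- typeclass showable(T) where [ func show(T) = string ].
:- instance showable(int) where [
    (show(N) = string.int_to_string(N))
].

% The callee binds T to int, so the callee must ensure
% that the constraint showable(int) is satisfied.
:- some [T] pred pick(T) => showable(T).
:- mode pick(out) is det.
pick(42).

% The caller does not know which type was chosen,
% but it may rely on showable(T) and so may call show/1.
:- func describe = string.
describe = show(X) :- pick(X).
@end example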

@node Existentially typed data types
@section Existentially typed data types

Type variables occurring in the body of a discriminated union type
definition may be existentially quantified.
Constructor definitions within discriminated union type definitions
may be preceded by an existential type quantifier
and followed by one or more existential type class constraints.

For example:

@example
% A simple heterogeneous list type.
:- type list_of_any
    --->    nil_any
    ;       some [T] cons_any(T, list_of_any).

% A heterogeneous list type with a type class constraint.
:- typeclass showable(T) where [ func show(T) = string ].
:- type showable_list
    --->    nil
    ;       some [T] (cons(T, showable_list) => showable(T)).

% A different way of doing the same kind of thing, this
% time using the standard type list(T).
:- type showable
    --->    some [T] (s(T) => showable(T)).
:- type list_of_showable == list(showable).

% Here is an arbitrary example involving multiple type variables
% and multiple constraints.
:- typeclass foo(T1, T2) where [ /* @dots{} */ ].
:- type bar(T)
    --->    f1
    ;       f2(T)
    ;       some [T1] f3(T1)
    ;       some [T1, T2] f4(T1, T2, T) => (showable(T1), showable(T2))
    ;       some [T1, T2] f5(list(T1), T2) => foo(T1, T2).
@end example

Construction and deconstruction of existentially quantified data types
are inverses:
when constructing a value of an existentially quantified data type,
the ``existentially quantified'' functor acts
for purposes of type checking like a universally quantified function:
the caller will determine the values of the type variables.
Conversely, for deconstruction the functor acts
like an existentially quantified function:
the caller must be defined so as to work
for all possible values of the existentially quantified type variables
which satisfy the declared type class constraints.

In order to make this distinction clear to the compiler,
whenever you want to construct a value
using an existentially quantified functor,
you must prepend @samp{new } onto the functor name.
This tells the compiler to treat it as though it were universally quantified:
the caller can bind that functor's existentially quantified type variables
to any type which satisfies the declared type class constraints.
Conversely, any occurrence without the @samp{new } prefix
must be a deconstruction, and is therefore existentially quantified:
the caller must not bind the existentially quantified type variables,
but the caller is allowed to depend on those type variables
satisfying the declared type class constraints, if any.

For example, the function @samp{make_list} constructs
a value of type @samp{showable_list}
containing a sequence of values of different types,
all of which are instances of the @samp{showable} class,

@example
:- instance showable(int).
:- instance showable(float).
:- instance showable(string).

:- func make_list = showable_list.
make_list = List :-
        Int = 42,
        Float = 1.0,
        String = "blah",
        List =  'new cons'(Int,
                'new cons'(Float,
                'new cons'(String, nil))).
@end example

@noindent
while the function @samp{process_list} below
applies the @samp{show} method of the @samp{showable} class
to the values in such a list.

@example
:- func process_list(showable_list) = list(string).
process_list(nil) = [].
process_list(cons(Head, Tail)) = [show(Head) | process_list(Tail)].
@end example

@noindent
There are some restrictions on the forms that
existentially typed data constructors can take.

The first restriction is that no type variable may be quantified
both universally, by being listed as an argument of the type constructor,
and existentially, by being listed in the existential type quantifier
before the data constructor.
The type @samp{t12} violates this restriction:

@example
:- type t12(T)
    --->    f1(T)
    ;       some [T] f2(T).
@end example

@noindent
The reason for the restriction is simple:
a reference to @samp{T} in the @samp{f2} data constructor
that is simultaneously inside the scope of more than one quantification
can mislead readers who see one of the quantifications
and stop looking for the other.
The simplest way to avoid such confusion
is to require the programmer to avoid having one quantification shadow another.

The second restriction is that
type variables listed
in the existential type quantifier before the data constructor
cannot be repeated.
Type variables in the argument list of the type constructor
also cannot be repeated,
whether or not the data constructors of that type
have existential types.
The type @samp{t34} violates both these restrictions:

@example
:- type t34(A, B, A)
    --->    f3(A, B)
    ;       some [C, D, D] f4(C, D).
@end example

The third and final restriction is that
every existentially quantified type variable
must occur
@itemize
@item either in one of the argument types of the data constructor,
@item or in one of the type class constraints on the data constructor,
in the range of a functional dependency.
@end itemize

@noindent
This means that the type @samp{t5} in
@example
:- type t5
    --->    some [T1, T2] f5(T1) => xable(T1, T2).
@end example
@noindent
violates this restriction
@emph{unless} the type class @samp{xable} has a functional dependency
that determines the type bound to its second argument
from the type bound to its first.

The reason for this restriction is that
the identity of the type bound to the existential type variable
must somehow be decided at runtime.
It can either be given by the type of an argument,
or determined through a functional dependency
from the types bound to one or more other existential type variables.
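
As a sketch, assuming a hypothetical declaration of @samp{xable}
with a functional dependency from its first argument to its second,
the definition of @samp{t5} would satisfy the restriction,
because @samp{T1} occurs in an argument type
and @samp{T2} is determined by @samp{T1}:

@example
% The dependency (T1 -> T2) states that the type bound to T1
% determines the type bound to T2.
:- typeclass xable(T1, T2) <= (T1 -> T2) where [].

:- type t5
    --->    some [T1, T2] f5(T1) => xable(T1, T2).
@end example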

@node Some idioms using existentially quantified types
@section Some idioms using existentially quantified types

The standard library module @samp{univ}
provides an abstract type named @samp{univ} which can hold values of any type.
You can form heterogeneous containers
(containers that can hold values of different types at the same time)
by using data structures that contain @code{univ}s, e.g.@: @samp{list(univ)}.

The interface to @samp{univ} includes the following:

@example
% `univ' is a type which can hold any value.
:- type univ.

% The function univ/1 takes a value of any type and constructs
% a `univ' containing that value (the type will be stored along
% with the value)
:- func univ(T) = univ.

% The function univ_value/1 takes a `univ' argument and extracts
% the value contained in the `univ' (together with its type).
% This is the inverse of the function univ/1.
:- some [T] func univ_value(univ) = T.
@end example

The @samp{univ} type in the standard library
is in fact a simple example of an existentially typed data type.
It could be implemented as follows:

@example
:- implementation.
:- type univ
    --->    some [T] mkuniv(T).
univ(X) = 'new mkuniv'(X).
univ_value(mkuniv(X)) = X.
@end example

An existentially typed procedure
is not allowed to have different types for its existentially typed arguments
in different clauses or in different subgoals of a single clause.
For instance, both of the following examples are illegal:

@example
:- some [T] pred bad_example(string, T).

bad_example("foo", 42).
bad_example("bar", "blah").
    % type error (cannot unify `int' and `string')

:- some [T] pred bad_example2(string, T).

bad_example2(Name, Value) :-
    ( Name = "foo", Value = 42
    ; Name = "bar", Value = "blah"
    ).
    % type error (cannot unify `int' and `string')
@end example

However, using @samp{univ},
it is possible for an existentially typed function
to return values of different types at each invocation.

@example
:- some [T] pred good_example(string, T).

good_example(Name, univ_value(Univ)) :-
    ( Name = "foo", Univ = univ(42)
    ; Name = "bar", Univ = univ("blah")
    ).
@end example

Using @samp{univ} does not work if you also want to use type class constraints.
If you want to use type class constraints,
then you must define your own existentially typed data type,
analogous to @samp{univ}, and use that:

@example
:- type univ_showable
    --->    some [T] (mkshowable(T) => showable(T)).

:- some [T] pred harder_example(string, T) => showable(T).

harder_example(Name, Showable) :-
    ( Name = "foo", Univ = 'new mkshowable'(42)
    ; Name = "bar", Univ = 'new mkshowable'("blah")
    ),
    Univ = mkshowable(Showable).
@end example

The issue can also arise for mode-specific clauses
(@pxref{Different clauses for different modes}).
For instance, the following example is illegal:

@example
:- some [T] pred bad_example3(string, T).
:-          mode bad_example3(in(bound("foo")), out) is det.
:-          mode bad_example3(in(bound("bar")), out) is det.
:- pragma promise_pure(bad_example3/2).
bad_example3("foo"::in(bound("foo")), 42::out).
bad_example3("bar"::in(bound("bar")), "blah"::out).
    % type error (cannot unify `int' and `string')
@end example

The solution is similar,
although in this case an intermediate predicate is required:

@example
:- some [T] pred good_example3(string, T).
:-          mode good_example3(in(bound("foo")), out) is det.
:-          mode good_example3(in(bound("bar")), out) is det.
good_example3(Name, univ_value(Univ)) :-
        good_example3_univ(Name, Univ).

:- pred good_example3_univ(string, univ).
:- mode good_example3_univ(in(bound("foo")), out) is det.
:- mode good_example3_univ(in(bound("bar")), out) is det.
:- pragma promise_pure(good_example3_univ/2).
good_example3_univ("foo"::in(bound("foo")), univ(42)::out).
good_example3_univ("bar"::in(bound("bar")), univ("blah")::out).
@end example

@c -----------------------------------------------------------------------

@node Type conversions
@chapter Type conversions

(This is a new and experimental feature, subject to change.)

A term may be converted from one type @var{FromType}
to another type @var{ToType}
using a type conversion expression of the form:

@example
coerce(@var{Term})
@end example

The expression is type-correct if and only if
@var{FromType} and @var{ToType} are both discriminated union types,
and after replacing the principal type constructors with base types
(@pxref{Subtypes})
the two types have the same type constructor,
and the arguments of the common type constructor
satisfy the type parameter variance restrictions below.

Let @var{FromType} expand out to @samp{base(S1, ..., Sn)}
and @var{ToType} expand out to @samp{base(T1, ..., Tn)},
where @samp{base(B1, ..., Bn)} is the common base type,
and @var{Bi} is the i'th type parameter,
which is bound to @var{Si} in @var{FromType}
and @var{Ti} in @var{ToType}.

For each pair of corresponding type arguments,
one of the following must be true:
@itemize
@item
@samp{Si = Ti}
if the two types are the same

@item
@samp{Si < Ti}
if @var{Si} is a subtype of @var{Ti}
by the relation below

@item
@samp{Ti < Si}
if @var{Ti} is a subtype of @var{Si}
by the relation below
@end itemize

Otherwise, the @code{coerce} expression is not type-correct.
@c NOTE: we deliberately disallow coercion between arbitrary phantom types.

Furthermore,
@samp{Si = Ti} must be true
if @var{Bi} occurs in one or more of these locations
in the @samp{base/n} type definition:

@itemize
@item
in a higher-order type

@item
in a foreign type

@item
in an abstract type

@item
in a solver type

@item
in a discriminated union type,
other than a recursive type of the exact form @samp{base(B1, ..., Bn)}
@end itemize

The relation @samp{S < T} is true when @samp{S != T} and either:
@itemize
@item
@samp{S} and @samp{T} are both discriminated union types,
and @samp{S} is a subtype of @samp{T} by visible subtype definitions;
or

@item
@samp{S} and @samp{T} are both tuple types of the same arity,
and for each pair of corresponding argument types @samp{Si} and @samp{Ti},
@samp{Si < Ti} is true.
@end itemize

@heading Mode checking

Type conversion expressions must also be mode-correct.
Intuitively, conversion from a subtype to its supertype is safe,
but a conversion from a supertype to one of its subtypes is safe only if
the inst approximating the term to be converted
indicates that the result would also be valid in the subtype.

Mode checking proceeds by simultaneously traversing
the inst tree of the @code{coerce} argument,
the type tree of the @code{coerce} argument,
and the type tree of the result term,
and producing the inst tree of the result term
if the conversion is valid.
Let
@itemize
@item
@var{InstX} be the current node in the @code{coerce} argument's inst tree,
@item
@var{InstY} be the current node in the result inst tree,
@item
@var{TypeX} be the current node in the @code{coerce} argument's type tree,
@item
@var{TypeY} be the current node in the result type tree,
@item
@var{TypeCtorX} be the principal type constructor of @var{TypeX},
@item
@var{TypeCtorY} be the principal type constructor of @var{TypeY}.
@end itemize

In the following, @var{X} < @var{Y} means
@var{X} is a subtype of @var{Y}
by visible subtype definitions.

For each node @var{InstX}:
@itemize
@item
If @var{InstX} is a recursive node in the inst tree
(i.e.@: it is its own ancestor),
then we require @var{TypeX} =< @var{TypeY}.
Let @var{InstY} = @var{InstX}.

@item
Otherwise, if @var{InstX} is a @code{bound} node:

    @itemize
    @item
    If @var{TypeX} is an existentially quantified type variable,
    then @var{InstY} = @var{InstX}.

    @item
    If @var{TypeX} is not an existentially quantified type variable,
    then each of the function symbols listed in @var{InstX}
    must name a constructor in @var{TypeCtorY}.
    Let @var{InstY} be a @code{bound} inst containing those same
    function symbols;
    the insts for the arguments of each function symbol
    are then checked and constructed recursively.
    @end itemize

@item
Otherwise, if @var{InstX} is a @code{ground} node:

    @itemize
    @item
    If @var{TypeX} = @var{TypeY},
    or if @var{TypeX} is an existentially quantified type variable,
    then let @var{InstY} = @var{InstX}.
    @c This includes higher-order types.

    @item
    If @var{TypeX} < @var{TypeY}, then
    let @var{InstY} be the @code{bound} node constructed
    using the process below.
    @end itemize

@item
Otherwise,
the @code{coerce} expression is not mode-correct.
@end itemize

To construct a @samp{bound} node @var{InstY}
from a @samp{ground} node @var{InstX}:

@itemize
@item
If @var{TypeX} = @var{TypeY}
or if @var{TypeX} is a recursive node in the type tree
(i.e.@: it is its own ancestor),
then let @var{InstY} be @code{ground}.

@item
Otherwise, let @var{InstY} be a @code{bound} inst
containing all of the constructors in @var{TypeCtorX};
the insts for the arguments of each function symbol
are constructed recursively.
@end itemize

@heading Examples

Assume we have:

@example
:- type fruit
   --->    apple
   ;       lemon
   ;       orange.

:- type citrus =< fruit
   --->    lemon
   ;       orange.
@end example

This function is both type-correct and mode-correct:

@example
:- func f1(citrus) = fruit.

f1(X) = coerce(X).
@end example

This function is type-correct but not mode-correct
because some @code{fruit}s are not @code{citrus}:

@example
:- func f2(fruit) = citrus.

f2(X) = coerce(X).  % incorrect
@end example

This function is mode-correct
because the initial inst of the input argument
limits the range of @code{fruit} values
to those that would also be valid in @code{citrus}:

@example
:- inst citrus for fruit/0
    --->    lemon
    ;       orange.

:- func f3(fruit) = citrus.
:- mode f3(in(citrus)) = out is det.

f3(X) = coerce(X).
@end example

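Finally,
this function is type-incorrect because
in the coerce expression,
the type parameter @var{T} of @code{wrap/1}
is bound to @samp{func(fruit) = int} in the input type,
but to @samp{func(citrus) = int} in the result type;
these are different higher-order types,
and neither is a subtype of the other by the relation above.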

@example
:- type wrap(T)
    --->    wrap(T).

:- func f4(func(fruit) = int) = (func(citrus) = int).

f4(X) = Y :-
    wrap(Y) = coerce(wrap(X)).  % incorrect
@end example

@c -----------------------------------------------------------------------

@node Exception handling
@chapter Exception handling

Mercury procedures may throw exceptions.
Exceptions may be caught
using the predicates defined in the @samp{exception} library module,
or using try goals.

@noindent
A @samp{try} goal has the following form:

@example
    try @var{Params} @var{Goal}
    then @var{ThenGoal}
    else @var{ElseGoal}
    catch @var{Term} -> @var{CatchGoal}
    @dots{}
    catch_any @var{CatchAnyVar} -> @var{CatchAnyGoal}
@end example

@var{Goal}, @var{ThenGoal}, @var{ElseGoal}, @var{CatchGoal},
and @var{CatchAnyGoal} must all be valid goals.

@var{Goal} must have one of the following determinisms:
@code{det}, @code{semidet}, @code{cc_multi}, or @w{@code{cc_nondet}}.

The non-local variables of @var{Goal}
must not have an inst equivalent to
@code{unique}, @w{@code{mostly_unique}} or @code{any},
unless they have the type @samp{io.state}.
@c or (later) the store/1.)

@var{Params} must be a valid list of zero or more try parameters.

The ``then'' part is mandatory.
The ``else'' part is mandatory if @var{Goal} may fail;
otherwise it must be omitted.
There may be zero or more ``catch'' branches.
The ``catch_any'' part is optional.
@var{CatchAnyVar} must be a single variable.

The try parameter @samp{io} takes a single argument,
which must be the name of a state variable prefixed by @samp{!};
for example, @samp{io(!IO)}.
The state variable must have the type @samp{io.state},
and be in scope of the try goal.
The state variable is threaded through @var{Goal},
so it may perform I/O but cannot fail.
If no @samp{io} parameter exists, @var{Goal} may not perform I/O and may fail.

A try goal has determinism @code{cc_multi}.
@c Exception: if all of the then/else/catch/catch_any parts only succeed
@c without binding non-local variables then the determinism is det.
@c In the implementation we may still infer cc_multi though.

On entering a try goal, @var{Goal} is executed.
If it succeeds without throwing an exception, @var{ThenGoal} is executed.
Any variables bound by @var{Goal} are visible in @var{ThenGoal} only.
If @var{Goal} fails, then @var{ElseGoal} is executed.

If @var{Goal} throws an exception,
the exception value is unified with
each of the @var{Term}s in the ``catch'' branches in turn.
On the first successful unification,
the corresponding @var{CatchGoal} is executed
(and other ``catch'' and ``catch_any'' branches ignored).
Variables bound during the unification of the @var{Term}
are in scope of the corresponding @var{CatchGoal}.

If the exception value does not unify
with any of the terms in ``catch'' branches,
and a ``catch_any'' branch is present,
the exception is bound to @var{CatchAnyVar}
and the @var{CatchAnyGoal} executed.
@var{CatchAnyVar} is visible in the @var{CatchAnyGoal} only,
and is existentially typed, i.e.@: it has type @samp{some [T] T}.

Finally, if the thrown value did not unify with any ``catch'' term,
and there is no ``catch_any'' branch, the exception is rethrown.
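
As a minimal sketch of these rules,
the hypothetical predicate @samp{lookup_or_default} below
uses a try goal with no @samp{io} parameter.
It assumes that the @samp{map} module is imported
and relies on @samp{map.lookup/3} throwing an exception
when the key is absent.
Since the goal inside the try is @samp{det},
the ``else'' part is omitted.

@example
:- pred lookup_or_default(map(string, int)::in, string::in,
    int::out) is cc_multi.

lookup_or_default(Map, Key, Value) :-
    (try [] (
        map.lookup(Map, Key, Value0)
    )
    then
        % Value0 was bound by the goal, so it is visible here only.
        Value = Value0
    catch_any _Excp ->
        % Any exception is treated as "key not found".
        Value = -1
    ).
@end example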

@noindent
The declarative semantics of a try goal is:

@example
@group
    (
        try [] Goal
        then Then
        else Else
        catch CP1 -> CG1
        catch CP2 -> CG2
        @dots{}
        catch_any CAV -> CAG
    )
    <=>
    (
        Goal, Then
    ;
        not Goal, Else
    ;
        some [Excp]
        ( if Excp = CP1 then
            CG1
        else if Excp = CP2 then
            CG2
        else if @dots{}
            @dots{}
        else
            Excp = CAV,
            CAG
        )
    ).
@end group
@end example

If no @samp{else} branch is present, then @samp{Else = fail}.
If no @samp{catch_any} branch is present, then @samp{CAG = fail}.

@noindent
An example of a try goal that performs I/O is:

@example
:- pred p_carefully(io::di, io::uo) is cc_multi.

p_carefully(!IO) :-
    (try [io(!IO)] (
        io.write_string("Calling p\n", !IO),
        p(Output, !IO)
    )
    then
        io.write_string("p returned: ", !IO),
        io.write(Output, !IO),
        io.nl(!IO)
    catch S ->
        io.write_string("p threw a string: ", !IO),
        io.write_string(S, !IO),
        io.nl(!IO)
    catch 42 ->
        io.write_string("p threw 42\n", !IO)
    catch_any Other ->
        io.write_string("p threw something: ", !IO),
        io.write(Other, !IO),
        % Rethrow the value.
        throw(Other)
    ).
@end example

@noindent
One common use for exceptions is to check the input
and throw an exception if it is invalid.
It might be tempting to implement this
with a predicate such as the following:

@example
:- pred check_target(target::in) is det.

check_target(Target) :-
    ( if ... then
        true
    else
        throw("invalid target")
    ).
@end example

@noindent
This code warrants caution, however.
Consider the following usage:

@example
shoot(Target, !IO) :-
    check_target(Target),
    unsafe_shoot(Target, !IO).
@end example

@noindent
Mercury may reorder conjunctions,
so @samp{unsafe_shoot/3} could be called before @samp{check_target/1},
which is (probably) not what the user intended in this case.
Furthermore,
Mercury may optimize away the call to @samp{check_target/1} entirely,
since the mode-determinism assertion for
a @samp{det} predicate with no outputs
essentially states that it is equivalent to @samp{true}.

The strict sequential semantics can be used
to guarantee that these changes will not occur
(@pxref{Formal semantics}).
However,
we recommend implementing checks like these in the following way,
to avoid depending on the choice of operational semantics:

@example
shoot(Target0, !IO) :-
    check_target(Target0, Target),
    unsafe_shoot(Target, !IO).

:- pred check_target(target::in, target::out) is det.

check_target(Target0, Target) :-
    ( if ... then
        Target = Target0
    else
        throw("invalid target")
    ).
@end example

@node Formal semantics
@chapter Formal semantics

A legal Mercury program is one that complies with the syntax,
type, mode, determinism, and module system rules specified in earlier chapters.
If a program does not comply with those rules,
the compiler must report an error.

For each legal Mercury program,
there is an associated predicate calculus theory
whose language is specified by the type declarations in the program
and whose axioms are the completion of the clauses for all predicates
in the program,
plus the usual equality axioms extended with the completion of the
equations for all functions in the program,
plus axioms corresponding to the mode-determinism assertions
(@pxref{Determinism}),
plus axioms specifying the semantics of library predicates and functions.
The declarative semantics of a legal Mercury program
is specified by this theory.

Mercury implementations must be sound:
the answers they compute must be true in every model of the theory.
Mercury implementations are not required to be complete:
they may fail to compute an answer in finite time,
or they may exhaust the resource limitations of the execution
environment, even though an answer is provable in the theory.
However, there are certain minimum requirements that they
must satisfy with respect to completeness.

There is an operational semantics of Mercury programs called the
@dfn{strict sequential} semantics.
In this semantics,
the program is executed top-down using SLDNF resolution
(or something equivalent),
starting from @samp{main/2}
preceded by any module initialisation goals
(as per @ref{Module initialisation}),
and followed by any module finalisation goals
(as per @ref{Module finalisation}).
Function calls, conjunctions and disjunctions are all
executed in depth-first left-to-right order.
Conjunctions and function calls are
``minimally'' reordered as required by the modes:
the order is determined by selecting the first mode-correct sub-goal
(conjunct or function call),
executing that, then selecting the first of the remaining sub-goals
which is now mode-correct, executing that, and so on.
There is no interleaving of different individual conjuncts or function calls:
the sub-goals are reordered, not split and interleaved.
Function application is strict, not lazy.
Predicate calls are strict in the sense that
goals will be executed irrespective of any mode-determinism assertions,
even if they loop,
are @samp{erroneous},
or are @samp{det} and contain no outputs.
@c XXX should document the operational semantics of switches and if-then-elses
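
The ``minimal'' reordering rule can be illustrated with a small sketch
(the predicate @samp{sum_example} is hypothetical,
and the @samp{int} module is assumed to be imported).
The conjuncts are written in an order that is not mode-correct as it stands:

@example
:- pred sum_example(int::out) is det.

sum_example(Z) :-
    Z = X + Y,      % not mode-correct until X and Y are both bound
    X = 1,          % first mode-correct sub-goal: executed first
    Y = 2.          % executed second; then Z = X + Y is executed
@end example

@noindent
Under the strict sequential semantics the execution order
is therefore @samp{X = 1}, then @samp{Y = 2}, then @samp{Z = X + Y}.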

Mercury implementations are required to provide a method of processing
Mercury programs which is equivalent to the strict sequential
semantics.

There is another operational semantics of Mercury programs
called the @dfn{strict commutative} semantics.
This semantics is equivalent to the strict sequential semantics
except that there is no requirement that
function calls, conjunctions and disjunctions be executed left-to-right;
they may be executed in any order, and may even be interleaved.
Furthermore, the order may differ
each time a particular goal is entered.

As well as providing the strict sequential semantics,
Mercury implementations may provide
one or more implementation-defined operational semantics,
as long as any such implementation-defined operational semantics
is at least as complete as the strict commutative semantics.
An implementation-defined operational semantics
is ``at least as complete'' as the strict commutative semantics
if and only if the implementation-defined operational semantics
guarantees to compute an answer in finite time
for any program for which an answer would be computed in finite time
for all possible executions under the strict commutative semantics
(i.e.@: for all possible orderings of
function calls, conjunctions and disjunctions).

Thus, to summarize,
there are in fact a variety of different operational semantics for Mercury.
One of them, the strict sequential semantics,
is deterministic---the behaviour is always specified exactly.
Programs are executed top-down,
mode analysis does ``minimal'' reordering (in a precisely defined sense),
function calls, conjunctions and disjunctions
are executed depth-first left-to-right,
and function and predicate evaluation is strict.
All implementations are required to support the strict sequential semantics,
so that a program which works on one implementation using this semantics
will be guaranteed to work on any other implementation.
However, implementations are also allowed to support
other operational semantics
(which may be non-deterministic)
as long as they are sound with respect to the declarative semantics,
and meet a minimum level of completeness.

This compromise allows Mercury to be used in several different ways.
Programmers who care more about ease of programming and portability
than about efficiency can use the strict sequential semantics,
and can then be guaranteed that
if their program works on one correct implementation,
it will work on all correct implementations.
Compiler implementors who want to write optimizing implementations
that do lots of clever code reorderings and other high-level transformations,
or who want to offer parallelizing implementations
which take maximum advantage of parallelism,
can define different semantic models.
Programmers who care about efficiency more than portability
can write code for these implementation-defined semantic models.
Programmers who care about efficiency @emph{and} portability
can achieve this by writing code for the strict commutative semantics.
In some ways this is not as easy as using the strict sequential semantics,
since it is in general not sufficient
to test your program on just one implementation
if you are to be sure that it will be able to use
the maximally efficient operational semantics on any implementation.
On the other hand,
if you do write code which works for all possible executions
under the strict commutative semantics,
then you can be guaranteed that it will work correctly
on every implementation,
under every possible implementation-defined operational semantics.

The Melbourne Mercury implementation offers
eight different operational semantics,
which can be selected with different combinations
of the following options:

@table @code
@item --no-reorder-conj
Only do minimal reordering of conjunctions.
@item --no-reorder-disj
Do not reorder disjunctions.
@item --no-fully-strict
Predicate calls are not strict
(function application is always strict in the current implementation).
This option allows the compiler to improve completeness
by optimizing away infinite loops,
goals with determinism @code{erroneous},
and goals with determinism @code{det} and no outputs.
@end table

@noindent
The default semantics is the strict commutative semantics.
The strict sequential semantics can be selected with the
@samp{--no-reorder-conj} and @samp{--no-reorder-disj} options.

Future implementations of Mercury
may wish to offer other implementation-defined operational semantics.
For example, they may wish to provide semantics
in which function evaluation is lazy rather than strict,
semantics with a guaranteed fair search rule, and so forth.

@node Foreign language interface
@chapter Foreign language interface

@menu
* Calling foreign code from Mercury::  How to implement a Mercury predicate
                                       or function as a call to code
                                       written in a different
                                       programming language.
* Calling Mercury from foreign code::  How to call a Mercury predicate
                                       or function from a different
                                       programming language.
* Data passing conventions::           How Mercury types are passed to
                                       different languages.
* Using foreign types from Mercury::   How to use a type defined in
                                       a different programming language
                                       in Mercury code.
* Using foreign enumerations in Mercury code:: How to use an enumeration type
                                                defined in a foreign language
                                                in Mercury code.
* Using Mercury enumerations in foreign code:: How to use an enumeration type
                                               defined in Mercury in a
                                               different programming language.
* Adding foreign declarations::        How to add declarations of
                                       entities in other programming
                                       languages.
* Declaring Mercury exports to other modules::
                                       How to call Mercury procedures from a
                                       different programming language in
                                       another module.
* Adding foreign definitions::         How to add definitions of
                                       entities in other programming
                                       languages.
* Language specific bindings::         Information specific to each
                                       foreign language.

@end menu

This chapter documents the foreign language interface.

@node Calling foreign code from Mercury
@section Calling foreign code from Mercury

Mercury procedures can be implemented
as fragments of foreign language code
by means of @samp{pragma foreign_proc} declarations.

@menu
* pragma foreign_proc::         Defining Mercury procedures using foreign code.
* Foreign code attributes::     Describing properties of foreign
                                functions or code.
@end menu

@node pragma foreign_proc
@subsection pragma foreign_proc

A declaration of the form

@example
:- pragma foreign_proc("@var{Lang}",
    @var{Pred}(@var{Var1}::@var{Mode1}, @var{Var2}::@var{Mode2}, @dots{}),
    @var{Attributes}, @var{Foreign_Code}).
@end example

@noindent
or

@example
:- pragma foreign_proc("@var{Lang}",
    @var{Func}(@var{Var1}::@var{Mode1}, @var{Var2}::@var{Mode2}, @dots{}) = (@var{Var}::@var{Mode}),
    @var{Attributes}, @var{Foreign_Code}).
@end example

@noindent
means that any calls to the specified mode of @var{Pred} or @var{Func}
will result in execution of the foreign code given in @var{Foreign_Code}
written in language @var{Lang},
if the implementation selects @var{Lang}
as the foreign language to use for this procedure.
See the ``Foreign Language Interface'' chapter of the Mercury User's Guide
for more information about how the implementation selects
the appropriate @samp{foreign_proc} to use.

The foreign code fragment may refer to the specified variables
(@var{Var1}, @var{Var2}, @dots{}, and @var{Var}) directly by name.
It is an error for a variable to occur more than once in the argument list.
These variables will have foreign language types
corresponding to their Mercury types,
as determined by language and implementation specific rules.

All @samp{foreign_proc} implementations are assumed to be impure.
If they are actually pure or semipure,
they must be explicitly promised as such by the user
(either by using foreign language attributes specified below,
or a @samp{promise_pure} or @samp{promise_semipure} pragma
as specified in @ref{Impurity}).
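
As a minimal sketch of these purity rules,
the following declarations wrap a C global variable.
The names @samp{counter}, @samp{incr_counter} and @samp{get_counter}
are hypothetical and chosen only for illustration.
The first procedure modifies the global,
so it makes no promise and is declared @samp{impure};
the second only reads it,
so it carries the @samp{promise_semipure} attribute.

@example
:- pragma foreign_decl("C", "extern MR_Integer counter;").
:- pragma foreign_code("C", "MR_Integer counter = 0;").

:- impure pred incr_counter is det.
:- pragma foreign_proc("C",
    incr_counter,
    [will_not_call_mercury],
"
    counter++;
").

:- semipure pred get_counter(int::out) is det.
:- pragma foreign_proc("C",
    get_counter(N::out),
    [will_not_call_mercury, promise_semipure],
"
    N = counter;
").
@end example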

Additional restrictions on the foreign language interface code
depend on the foreign language and compilation options.
For more information, including the list of supported foreign languages
and the strings used to identify them,
see the language specific information
in the ``Foreign Language Interface'' chapter of the Mercury User's Guide.

If there is a @code{pragma foreign_proc} declaration
for any mode of a predicate or function,
then there must be either a clause or a @code{pragma foreign_proc} declaration
for every mode of that predicate or function.

Here is an example of code using @samp{pragma foreign_proc}.
The following code defines a Mercury function @samp{sin/1}
which calls the C function @samp{sin()} of the same name.

@example
@group
:- func sin(float) = float.
:- pragma foreign_proc("C",
    sin(X::in) = (Sin::out),
    [promise_pure, may_call_mercury],
"
    Sin = sin(X);
").
@end group
@end example

If the foreign language code does not recursively invoke Mercury code,
as in the above example, then you can use @samp{will_not_call_mercury}
in place of @samp{may_call_mercury} in the declarations above.
This allows the compiler to use a slightly more efficient calling convention.
(If you use this form, and the foreign code @emph{does} invoke Mercury code,
then the behaviour is undefined --- your program may misbehave or crash.)
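
For instance, the C function @samp{sin()} never calls back into Mercury,
so the example above could be written as in the following sketch.
The @samp{foreign_decl} line,
which includes the C header that declares @samp{sin()},
is an addition made here for illustration
(@pxref{Adding foreign declarations}).

@example
@group
:- pragma foreign_decl("C", "#include <math.h>").

:- func sin(float) = float.
:- pragma foreign_proc("C",
    sin(X::in) = (Sin::out),
    [promise_pure, will_not_call_mercury],
"
    Sin = sin(X);
").
@end group
@end example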

If there are both Mercury definitions and foreign_proc definitions
for a procedure and/or foreign_proc definitions for different languages,
it is implementation-defined which definition is used.

For pure and semipure procedures,
the declarative semantics of the foreign_proc definitions
must be the same as that of the Mercury code.
The only things that are allowed to differ are the efficiency
(including the possibility of non-termination)
and the order of solutions.

It is an error for a procedure with a @samp{pragma foreign_proc} declaration
to have a determinism of @code{multi} or @code{nondet}.

Since foreign_procs with the determinism @code{multi} or @code{nondet}
cannot be defined directly,
procedures with those determinisms
that require foreign code in their implementation
must be defined using a combination
of Mercury clauses and (semi)deterministic foreign_procs.
The following implementation for the standard library predicate
@samp{string.append/3} in the mode @samp{append(out, out, in) is multi}
illustrates this technique:

@example
:- pred append(string, string, string).
:- mode append(out, out, in) is multi.

append(S1, S2, S3) :-
    S3Len = string.length(S3),
    append_2(0, S3Len, S1, S2, S3).

:- pred append_2(int::in, int::in, string::out, string::out, string::in) is multi.

append_2(NextS1Len, S3Len, S1, S2, S3) :-
    ( if NextS1Len = S3Len then
        append_3(NextS1Len, S3Len, S1, S2, S3)
    else
        (
            append_3(NextS1Len, S3Len, S1, S2, S3)
        ;
            append_2(NextS1Len + 1, S3Len, S1, S2, S3)
        )
    ).

:- pred append_3(int::in, int::in, string::out, string::out, string::in) is det.

:- pragma foreign_proc("C",
    append_3(S1Len::in, S3Len::in, S1::out, S2::out, S3::in),
    [will_not_call_mercury, promise_pure],
"
    S1 = allocate_string(S1Len);   /* Allocate a new string of length S1Len */
    memcpy(S1, S3, S1Len);
    S1[S1Len] = '\\0';
    S2 = allocate_string(S3Len - S1Len);   /* Room for the rest of S3 */
    strcpy(S2, S3 + S1Len);
").

@end example

@node Foreign code attributes
@subsection Foreign code attributes

As described above,
@samp{pragma foreign_proc} declarations may include a list of attributes
describing properties of the given foreign function or code.
All Mercury implementations must support the attributes listed below.
They may also support additional attributes.

The attributes which must be supported by all implementations
are as follows:

@table @asis

@item @samp{may_call_mercury}/@samp{will_not_call_mercury}
This attribute declares whether execution inside this foreign language code may
call back into Mercury or not.
The default, in case neither is specified, is @samp{may_call_mercury}.
Specifying @samp{will_not_call_mercury}
may allow the compiler to generate more efficient code.
If you specify @samp{will_not_call_mercury},
but the foreign language code @emph{does} invoke Mercury code,
then the behaviour is undefined.

@item @samp{promise_pure}/@samp{promise_semipure}
This attribute promises that
the purity of the given predicate or function definition is pure or semipure.
It is equivalent to a corresponding @samp{pragma promise_pure}
or @samp{pragma promise_semipure} declaration (@pxref{Impurity}).
If omitted, the clause specified by the @samp{foreign_proc}
is assumed to be impure.

@item @samp{thread_safe}/@samp{not_thread_safe}/@samp{maybe_thread_safe}
This attribute declares whether or not it is safe
for multiple threads to execute this foreign language code concurrently.
The default, in case none is specified, is @samp{not_thread_safe}.
If the foreign language code is declared @samp{thread_safe},
then the Mercury implementation is permitted
to execute the code concurrently from multiple threads
without taking any special precautions.
If the foreign language code is declared @samp{not_thread_safe},
then the Mercury implementation
must not invoke the code concurrently from multiple threads.
If the Mercury implementation does use multithreading,
then it must take appropriate steps to prevent this.
(The multithreaded version of the Melbourne Mercury implementation
protects @samp{not_thread_safe} code using a mutex:
C code that is not thread-safe has code inserted around it
to obtain and release a mutex.
All non-thread-safe foreign language code shares a single mutex.)
@c XXX this can cause deadlocks if not_thread_safe foreign language code calls
@c     Mercury which calls foreign language code
If the foreign language code is declared @samp{maybe_thread_safe}
then whether the code is considered
@samp{thread_safe} or @samp{not_thread_safe}
depends upon a compiler flag.
This attribute is useful when the thread safety of the foreign code itself
is conditional.
The Melbourne Mercury compiler uses the @samp{--maybe-thread-safe} option
to set the value of the @samp{maybe_thread_safe} attribute.
@end table
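
The following sketch shows how these attributes
are typically combined in a single declaration.
The function name @samp{c_hypot} is hypothetical.
It is assumed here that the C library function @samp{hypot()}
keeps no state shared between threads,
which is why the code is declared @samp{thread_safe};
if that assumption did not hold,
the attribute would have to be omitted or replaced by @samp{not_thread_safe}.

@example
:- pragma foreign_decl("C", "#include <math.h>").

:- func c_hypot(float, float) = float.
:- pragma foreign_proc("C",
    c_hypot(X::in, Y::in) = (H::out),
    [will_not_call_mercury, promise_pure, thread_safe],
"
    H = hypot(X, Y);
").
@end example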

Additional attributes which are supported by the Melbourne Mercury compiler
are as follows:

@table @asis

@item @samp{tabled_for_io}
This attribute should be attached to foreign procedures that do I/O.
It tells the debugger to make calls to the foreign procedure idempotent.
This allows the debugger to safely retry across such calls
and also allows safe declarative debugging of code containing such calls.
For more information,
see the ``I/O tabling'' section of the Mercury User's Guide.
If the foreign procedure contains gotos or static variables then the
@samp{pragma no_inline} directive should also be given.
Note that currently I/O tabling will only be done
for foreign procedures that take a pair of I/O state arguments.
Impure foreign procedures that perform I/O will not be made idempotent,
even if the @samp{tabled_for_io} attribute is present.
Note also that the @samp{tabled_for_io} attribute
will likely be replaced in a future release with a more general solution.

@item @samp{terminates}/@samp{does_not_terminate}
This attribute specifies
the termination properties of the given predicate or function definition.
It is equivalent to the corresponding @samp{pragma terminates}
or @samp{pragma does_not_terminate} declaration.
If omitted, the termination property of the procedure is determined
by the value of the
@samp{may_call_mercury}/@samp{will_not_call_mercury} attribute.
See @ref{Termination analysis} for more details.

@item @samp{will_not_throw_exception}
This attribute promises that the given predicate or function
will not make calls back to Mercury
that may result in an exception being thrown.
It is an error to apply this attribute
to procedures that have determinism @code{erroneous}.
This attribute is ignored for code
that is declared as not making calls back to Mercury
via the @samp{will_not_call_mercury} attribute.
Note: predicates or functions that have polymorphic arguments
but do not explicitly throw an exception,
via a call to @samp{exception.throw/1} or @samp{require.error/1},
may still throw exceptions because they may be called
with arguments whose types have user-defined equality or comparison predicates.
If these user-defined equality or comparison predicates throw exceptions
then unifications or comparisons involving these types
may also throw exceptions.
As such, we recommend that only implementors of the Mercury system
use this annotation for polymorphic predicates and functions.

@c @item @samp{high_level_backend}
@c The foreign_proc will apply only on the high level backend.
@c @item @samp{low_level_backend}
@c The foreign_proc will apply only on the low level backend.

@item @samp{will_not_modify_trail}/@samp{may_modify_trail}
This attribute declares whether or not
a foreign procedure modifies the trail (see @ref{Trailing}).
Specifying that a foreign procedure will not modify the trail
may allow the compiler to generate more efficient code for that procedure.
In compilation grades that do not support trailing, this attribute is ignored.
The default, in case none is specified, is @samp{may_modify_trail}.

@item @samp{will_not_call_mm_tabled}/@samp{may_call_mm_tabled}
This attribute declares whether or not
a foreign procedure makes calls back to Mercury procedures
that are evaluated using minimal model tabling
(@pxref{Tabled evaluation}).
Specifying that a foreign procedure will not call
procedures evaluated using minimal model tabling
may allow the compiler to generate more efficient code.
In compilation grades that do not support minimal model tabling,
this attribute is ignored.
These attributes may not be used with procedures
that do not make calls back to Mercury,
i.e.@: that have the @samp{will_not_call_mercury} attribute.
The default for foreign procedures that @samp{may_call_mercury},
in case none is specified, is
@samp{may_call_mm_tabled}.

@item @samp{affects_liveness}/@samp{does_not_affect_liveness}
This attribute declares whether or not a foreign procedure
uses and/or modifies any part of the Mercury virtual machine
(registers, stack slots)
through means other than its arguments.
The @samp{affects_liveness} attribute says that it does;
the @samp{does_not_affect_liveness} attribute says that it does not.
In the absence of either attribute,
the compiler assumes @samp{affects_liveness},
unless the code of the foreign_proc in question is empty.

@item @samp{may_duplicate}/@samp{may_not_duplicate}
This attribute tells the compiler
whether it is allowed to duplicate the foreign code fragment
through optimizations such as inlining.
The @samp{may_duplicate} attribute says that it may;
the @samp{may_not_duplicate} attribute says that it may not.
In the absence of either attribute,
the compiler is allowed to make its own judgement in the matter,
based on factors such as the size of the code fragment.

@item @samp{may_export_body}/@samp{may_not_export_body}
This attribute tells the compiler
whether it is allowed to duplicate the foreign code fragment
outside of the target file for the module that
defines the foreign procedure.
The @samp{may_export_body} attribute says that it may;
the @samp{may_not_export_body} attribute says that it may not.
The default is @samp{may_export_body}.

@c @item
@c @samp{does_not_allocate_memory/allocates_bounded_memory/allocates_unbounded_memory}
@c This attribute declares whether a foreign procedure
@c allocates any memory on the Mercury heap,
@c and if it does, whether the amount allocated
@c is guaranteed to be smaller than the bound given
@c by the reserve space of the native garbage collector.

@c @item
@c @samp{registers_roots/does_not_register_roots/does_not_have_roots}
@c This attribute declares whether a foreign procedure
@c registers with the native garbage collector
@c all the root pointers it accesses.
@c This must always include
@c all global variables maintained by the foreign procedure.
@c If the foreign procedure may call Mercury,
@c it must also include any storage location in which
@c the foreign procedure stores roots before any call to Mercury
@c (since a gc may take place during such a call).

@c @item @samp{no_sharing/unknown_sharing/sharing(MaybeTypes, SharingList)}
@c This attribute declares whether or not a foreign procedure creates any
@c structure sharing @ref{Structure sharing analysis} between its input
@c and output arguments.
@c Specifying that a foreign
@c procedure generates no sharing (attribute @samp{no_sharing}) is a promise
@c to the compiler that the procedure does not create any sharing
@c between its arguments. The attribute @samp{unknown_sharing} specifies
@c that the
@c procedure may create any possible sharing between the arguments.
@c Finally, using
@c @samp{sharing(MaybeTypes, SharingList)} it is possible to specify a list of
@c sharing arguments, declaring that the foreign procedure creates at most
@c the specified sharing between the arguments. @samp{MaybeTypes} takes
@c the values
@c @samp{no/yes(Types)}, where @samp{Types} corresponds to the types used in
@c the predicate or function declaration for this foreign procedure.
@c @samp{SharingList} consists of a list
@c @samp{[SharingPairA, SharingPairB, ...]}, where each sharing pair
@c is represented by a pair @samp{cel(Vari, Seli) - cel(Varj, Selj)}.
@c @samp{Vari, Varj} must be variables that are part of the mode declaration
@c of the @samp{foreign_proc} definition. @samp{Seli, Selj} select
@c the subterms of the given arguments that actually share. Each selector
@c @samp{Seli} is written as a list of types @samp{[Type1, Type2, ...]}
@c representing a path in the term structure of the given argument. An
@c empty list designates the complete term to which the argument corresponds.
@c The types can make use of type variables as long as @samp{MaybeTypes} is
@c set to @samp{yes(Types)}, and the type variables occur in any of the types
@c used in @samp{Types}.
@c
@c @example
@c :- pred array.init_2(int::in, T::in, array(T)::array_uo) is det.
@c
@c :- pragma foreign_proc("C",
    @c array.init_2(Size::in, Item::in, Array::array_uo),
    @c [will_not_call_mercury, promise_pure, thread_safe, will_not_modify_trail,
    @c sharing(yes(int, T, array(T)), [cel(Item,[]) - cel(Array,[T])])],
@c "
    @c ML_alloc_array(Array, Size + 1, MR_ALLOC_ID);
    @c ML_init_array(Array, Size, Item);
@c ").
@c @end example
@c
@c This sharing declaration promises that a call
@c @code{init_2(Size, Item, Array)}, with types @code{int, T, array(T)}
@c may create sharing between any
@c subterms of type @code{T} of the resulting array @code{Array} and the
@c term @code{Item}. Reformulated: the elements of @code{Array} may refer
@c to the same memory locations as @code{Item}.

@end table
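
As a sketch of how several of these additional attributes might be used together,
consider a procedure that hands out successive integers
from a C static variable.
The name @samp{next_id} is hypothetical,
and the choice of attributes reflects assumptions about this particular fragment:
it does not touch the trail or the Mercury registers and stack,
and since it contains a static variable
each duplicated copy of the code would get its own counter,
hence @samp{may_not_duplicate}.
No purity promise is made, so the predicate is declared @samp{impure}.

@example
:- impure pred next_id(int::out) is det.
:- pragma foreign_proc("C",
    next_id(Id::out),
    [will_not_call_mercury, will_not_modify_trail,
        does_not_affect_liveness, may_not_duplicate],
"
    static MR_Integer last_id = 0;
    Id = ++last_id;
").
@end example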

@c -----------------------------------------------------------------------

@node Calling Mercury from foreign code
@section Calling Mercury from foreign code

Mercury procedures may be exported
so that they can be called by code written in a foreign language.

A declaration of the form:

@example
:- pragma foreign_export("@var{Lang}",
    @var{Pred}(@var{Mode1}, @var{Mode2}, @dots{}), "@var{ForeignName}").
@end example

@noindent
or

@example
:- pragma foreign_export("@var{Lang}",
    @var{Func}(@var{Mode1}, @var{Mode2}, @dots{}) = @var{Mode},
    "@var{ForeignName}").
@end example

@noindent
exports a procedure for use by foreign language @var{Lang}.
For each exported procedure,
the Mercury implementation will create an interface
to the named Mercury procedure in the foreign language
using the name @var{ForeignName}.
The form of this interface is dependent upon the specified foreign language.
For further details see the language specific information below.

It is an error to export a Mercury procedure
that has a determinism of @code{multi} or @code{nondet}.
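
For example, the following sketch exports a deterministic Mercury function to C.
The function @samp{add} and the foreign name @samp{mercury_add}
are hypothetical names chosen for illustration.

@example
:- import_module int.

:- func add(int, int) = int.
add(X, Y) = X + Y.

:- pragma foreign_export("C", add(in, in) = out, "mercury_add").
@end example

@noindent
C code can then refer to the exported function as @samp{mercury_add}.
The exact form of the C interface that the implementation generates
is determined by the language specific rules.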

@c -----------------------------------------------------------------------

@node Data passing conventions
@section Data passing conventions

For each supported foreign language,
we explain how to map a Mercury type to a type in that foreign language.
We also map the Mercury parameter passing convention
to the foreign language's parameter passing convention.

@menu
* C data passing conventions ::
* C# data passing conventions ::
* Java data passing conventions ::
@end menu

@node C data passing conventions
@subsection C data passing conventions

The Mercury primitive types are mapped to the following C types:

@multitable {Mercury_type} {MR_Unsigned}
@headitem Mercury type @tab C type
  @item @code{int}    @tab @code{MR_Integer}
  @item @code{int8}   @tab @code{int8_t}
  @item @code{int16}  @tab @code{int16_t}
  @item @code{int32}  @tab @code{int32_t}
  @item @code{int64}  @tab @code{int64_t}
  @item @code{uint}   @tab @code{MR_Unsigned}
  @item @code{uint8}  @tab @code{uint8_t}
  @item @code{uint16} @tab @code{uint16_t}
  @item @code{uint32} @tab @code{uint32_t}
  @item @code{uint64} @tab @code{uint64_t}
  @item @code{float}  @tab @code{MR_Float}
  @item @code{char}   @tab @code{MR_Char}
  @item @code{string} @tab @code{MR_String}
@end multitable

In the current implementation,
@code{MR_Integer} is a typedef for a signed integral type
which is the same size as a pointer of type @samp{void *};
@code{MR_Unsigned} is a typedef for an unsigned integral type
which is the same size as a pointer of type @samp{void *};
@code{MR_Float} is a typedef for @code{double}
(unless the program and the Mercury library
were compiled with @samp{--single-prec-float},
in which case it is a typedef for @code{float});
@code{MR_Char} is a typedef for a signed 32-bit integral type
and @code{MR_String} is a typedef for @samp{char *}.

Mercury variables of primitive types are passed to and from C
as C variables of the corresponding C type.
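
As a small illustration of this mapping,
the following sketch passes a Mercury @code{string} to C
and returns a Mercury @code{int}.
The function name @samp{c_strlen} is hypothetical.
Inside the C fragment, @samp{S} is a C variable of type @code{MR_String}
and @samp{Len} is a C variable of type @code{MR_Integer}.
Note that @samp{strlen()} counts bytes,
which need not equal the number of characters in the string.

@example
:- pragma foreign_decl("C", "#include <string.h>").

:- func c_strlen(string) = int.
:- pragma foreign_proc("C",
    c_strlen(S::in) = (Len::out),
    [will_not_call_mercury, promise_pure, thread_safe],
"
    Len = (MR_Integer) strlen(S);
").
@end example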

For the Mercury standard library type @samp{bool.bool},
there is a corresponding C type, @code{MR_Bool}.
C code can refer to the boolean data constructors @samp{yes} and @samp{no},
as @code{MR_YES} and @code{MR_NO} respectively.

For the Mercury standard library type @samp{builtin.comparison_result},
there is a corresponding C type, @code{MR_Comparison_Result}.
C code can refer to the data constructors of this type,
@samp{(<)}, @samp{(=)} and @samp{(>)},
as @code{MR_COMPARE_LESS}, @code{MR_COMPARE_EQUAL}
and @code{MR_COMPARE_GREATER} respectively.
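
The following sketch shows C code
producing values of these two types using the constants named above.
The function names @samp{is_even} and @samp{compare_ints}
are hypothetical.

@example
:- import_module bool.

:- func is_even(int) = bool.
:- pragma foreign_proc("C",
    is_even(N::in) = (B::out),
    [will_not_call_mercury, promise_pure, thread_safe],
"
    B = (N % 2 == 0) ? MR_YES : MR_NO;
").

:- func compare_ints(int, int) = comparison_result.
:- pragma foreign_proc("C",
    compare_ints(X::in, Y::in) = (Res::out),
    [will_not_call_mercury, promise_pure, thread_safe],
"
    if (X < Y) @{
        Res = MR_COMPARE_LESS;
    @} else if (X > Y) @{
        Res = MR_COMPARE_GREATER;
    @} else @{
        Res = MR_COMPARE_EQUAL;
    @}
").
@end example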

Mercury variables of a type
for which there is a C @samp{pragma foreign_type} declaration
(@pxref{Using foreign types from Mercury})
will be passed as the corresponding C type.

Mercury tuple types are passed as @code{MR_Tuple},
which in the current implementation is a typedef
for a pointer of type @samp{void *} if @samp{--high-level-code} is enabled,
and a typedef for @code{MR_Word} otherwise.

Mercury variables of any other type are passed as a @code{MR_Word},
which in the current implementation is a typedef
for an unsigned type whose size is the same size as a pointer.
(Note: it would in fact be better for each Mercury type
to map to a distinct abstract type in C,
since that would be more type-safe,
and thus we may change this in a future release.
We advise programmers who are manipulating Mercury types in C code
to use typedefs for each user-defined Mercury type,
and to treat each such type as an abstract data type.
This is good style
and it will also minimize any compatibility problems
if and when we do change this.)

Mercury lists can be manipulated by C code using the following macros,
which are defined by the Mercury implementation.

@example
MR_list_is_empty(list)     /* test if a list is empty */
MR_list_head(list)         /* get the head of a list */
MR_list_tail(list)         /* get the tail of a list */
MR_list_empty()            /* create an empty list */
MR_list_cons(head,tail)    /* construct a list with the given head and tail */
@end example

Note that the use of these macros is subject to some caveats
(@pxref{Memory management for C}).

The implementation provides the macro @code{MR_word_to_float}
for converting a value of type @code{MR_Word} to one of type @code{MR_Float},
and the macro @code{MR_float_to_word}
for converting a value of type @code{MR_Float} to one of type @code{MR_Word}.
These macros must be used to perform these conversions
since for some Mercury implementations
@samp{sizeof(MR_Float)} is greater than @samp{sizeof(MR_Word)}.

The following fragment of C code illustrates
the correct way to extract the head of a Mercury list of floats.

@example
MR_Float f;
f = MR_word_to_float(MR_list_head(list));
@end example

Omitting the call to @code{MR_word_to_float} in the above example
would yield incorrect results for implementations
where @samp{sizeof(MR_Float)} is greater than @samp{sizeof(MR_Word)}.
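
Putting the list macros and the conversion macro together,
a complete @samp{foreign_proc} that sums a list of floats
might look like the following sketch.
The function name @samp{sum_floats} is hypothetical.
The list argument is passed as an @code{MR_Word},
and the fragment only reads the list, so it allocates no memory.

@example
:- import_module list.

:- func sum_floats(list(float)) = float.
:- pragma foreign_proc("C",
    sum_floats(List::in) = (Sum::out),
    [will_not_call_mercury, promise_pure, thread_safe],
"
    MR_Word cur = List;

    Sum = 0.0;
    while (!MR_list_is_empty(cur)) @{
        /* Each element is stored as a word; convert it back to a float. */
        Sum += MR_word_to_float(MR_list_head(cur));
        cur = MR_list_tail(cur);
    @}
").
@end example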

Similarly, the implementation provides
the macros @code{MR_word_to_int64} and @code{MR_word_to_uint64}
for converting values of type @code{MR_Word}
to ones of type @code{int64_t} or @code{uint64_t} respectively,
and the macros @code{MR_int64_to_word} and @code{MR_uint64_to_word}
for converting values of type @code{int64_t} or @code{uint64_t} respectively
to ones of type @code{MR_Word}.
These macros must be used to perform these conversions
since for some Mercury implementations
@samp{sizeof(int64_t)} and @samp{sizeof(uint64_t)}
are greater than @samp{sizeof(MR_Word)}.

@node C# data passing conventions
@subsection C# data passing conventions

The Mercury primitive types are mapped
to the following Common Language Infrastructure (CLI) and C# types:

@multitable {Mercury_type} {System_String} {double}
  @headitem Mercury type @tab CLI type @tab C# type
  @item @code{int}    @tab @code{System.Int32}  @tab @code{int}
  @item @code{int8}   @tab @code{System.SByte}  @tab @code{sbyte}
  @item @code{int16}  @tab @code{System.Int16}  @tab @code{short}
  @item @code{int32}  @tab @code{System.Int32}  @tab @code{int}
  @item @code{int64}  @tab @code{System.Int64}  @tab @code{long}
  @item @code{uint}   @tab @code{System.UInt32} @tab @code{uint}
  @item @code{uint8}  @tab @code{System.Byte}   @tab @code{byte}
  @item @code{uint16} @tab @code{System.UInt16} @tab @code{ushort}
  @item @code{uint32} @tab @code{System.UInt32} @tab @code{uint}
  @item @code{uint64} @tab @code{System.UInt64} @tab @code{ulong}
  @item @code{float}  @tab @code{System.Double} @tab @code{double}
  @item @code{char}   @tab @code{System.Int32}  @tab @code{int}
  @item @code{string} @tab @code{System.String} @tab @code{string}
@end multitable

Note that the Mercury type @code{char} is mapped in the same way as @code{int},
and @emph{not} to the CLI type @code{System.Char},
because the latter only holds 16-bit numeric values.

For the Mercury standard library type @samp{bool.bool},
there is a corresponding C# type, @code{mr_bool.Bool_0}.
C# code can refer to the boolean data constructors @samp{yes} and @samp{no},
as @code{mr_bool.YES} and @code{mr_bool.NO} respectively.

For the Mercury standard library type @samp{builtin.comparison_result},
there is a corresponding C# type, @code{builtin.Comparison_result_0}.
C# code can refer to the data constructors of this type,
@samp{(<)}, @samp{(=)} and @samp{(>)},
as @code{builtin.COMPARE_LESS}, @code{builtin.COMPARE_EQUAL}
and @code{builtin.COMPARE_GREATER} respectively.

Mercury variables of a type
for which there is a C# @samp{pragma foreign_type} declaration
(@pxref{Using foreign types from Mercury})
will be passed as the corresponding C# type.
Both reference and value types are supported.

Mercury tuple types are passed as @samp{object[]}
where the length of the array is the number of elements in the tuple.

Mercury variables whose type is a type variable
will be passed as @code{System.Object}.

Mercury variables whose type is a Mercury discriminated union type
will be passed as a CLI type
whose type name is determined from the Mercury type name
(ignoring any type parameters) followed by an underscore
and then the type arity, expressed as a decimal integer.
The first character of the type name will have its case inverted,
and the name may be mangled to satisfy C# lexical rules.

@noindent
For example, the following Mercury type
corresponds to the C# class that follows (some implementation details elided):

@example
:- type maybe(T)
    --->    yes(yes_field :: T)
    ;       no.

public class Maybe_1 @{
    public class Yes_1 : Maybe_1 @{
        public object yes_field;
        public Yes_1(object x) @{ @dots{} @}
    @}
    public class No_0 : Maybe_1 @{
        public No_0() @{ @dots{} @}
    @}
@}
@end example

C# code generated by the Mercury compiler
is placed in the @samp{mercury} namespace.
Mercury module qualifiers are converted into a C# class name
by concatenating the components with double underscore separators (@samp{__}).
For example the Mercury type @samp{foo.bar.baz/1} will be passed
as the C# type @samp{mercury.foo__bar.Baz_1}.

Mercury array types are mapped to @code{System.Array}.

Mercury variables whose type is a Mercury equivalence type
will be passed as the representation
of the right hand side of the equivalence type.

This mapping is subject to change
and you should try to avoid writing code
that relies heavily upon a particular representation of Mercury terms.

Mercury arguments declared with input modes
are passed by value to the C# function.

Arguments of type @samp{io.state} or @samp{store.store(_)}
are not passed or returned at all.
(The reason for this is that these types represent mutable state,
and in C# modifications to mutable state are done via side effects,
rather than argument passing.)

The handling of multiple output arguments is as follows.

If the Mercury procedure is deterministic and has no output arguments,
then the return type of the C# function is @samp{void};
if it has one output argument,
then the return value of the function is that output argument.

If the Mercury procedure is deterministic and has two or more output arguments,
then the return type of the C# function is @samp{void}.
At the position of each output argument,
the C# function has an @samp{out} parameter.

If the Mercury procedure is semi-deterministic
then the C# function returns a @samp{bool}.
A @samp{true} return value denotes success and @samp{false} denotes failure.
Output arguments are handled in the same way
as multiple outputs for deterministic procedures,
using @samp{out} parameters.
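
As a sketch of these rules,
consider the following hypothetical predicates exported to C#
(the predicate names and the exported names are invented for illustration).
The comment above each declaration shows
the shape of the C# signature that the rules above imply,
assuming that the Mercury type @samp{string}
is passed as the C# type @samp{string}.

@example
% Deterministic, one output: the output becomes the return value.
%   string greeting(string name)
:- pred greeting(string::in, string::out) is det.
:- pragma foreign_export("C#", greeting(in, out), "greeting").

% Deterministic, two outputs: void, with one out parameter per output.
%   void split_name(string full, out string first, out string last)
:- pred split_name(string::in, string::out, string::out) is det.
:- pragma foreign_export("C#", split_name(in, out, out), "split_name").

% Semi-deterministic: returns bool; the output uses an out parameter.
%   bool lookup(string key, out string value)
:- pred lookup(string::in, string::out) is semidet.
:- pragma foreign_export("C#", lookup(in, out), "lookup").
@end example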

Mercury lists can be manipulated by C# code using the following methods,
which are defined by the Mercury implementation.

@example
bool      list.is_empty(List_1 list)     // test if a list is empty
object    list.det_head(List_1 list)     // get the head of a list
List_1    list.det_tail(List_1 list)     // get the tail of a list
List_1    list.empty_list()              // create an empty list
List_1    list.cons(object head, List_1 tail)
                                         // construct a list with
                                         //  the given head and tail
@end example

@node Java data passing conventions
@subsection Java data passing conventions

The Mercury primitive types are mapped to the following Java types:
@multitable {Mercury_type} {java_lang_String}
  @headitem Mercury type @tab Java type
  @item @code{int}     @tab @code{int}
  @item @code{int8}    @tab @code{byte}
  @item @code{int16}   @tab @code{short}
  @item @code{int32}   @tab @code{int}
  @item @code{int64}   @tab @code{long}
  @item @code{uint}    @tab @code{int}
  @item @code{uint8}   @tab @code{byte}
  @item @code{uint16}  @tab @code{short}
  @item @code{uint32}  @tab @code{int}
  @item @code{uint64}  @tab @code{long}
  @item @code{float}   @tab @code{double}
  @item @code{char}    @tab @code{int}
  @item @code{string}  @tab @code{java.lang.String}
@end multitable

Note that since Java lacks unsigned integer types,
Mercury's unsigned integer types correspond to signed integer types in Java.

Also note that the Mercury type @code{char} is mapped to the Java type @code{int},
@emph{not} to the Java type @code{char},
because the latter can hold only 16-bit values.

For the Mercury standard library type @samp{bool.bool},
there is a corresponding Java type, @code{bool.Bool_0}.
Java code can refer to the boolean data constructors @samp{yes} and @samp{no}
as @code{bool.YES} and @code{bool.NO} respectively.

For the Mercury standard library type @samp{builtin.comparison_result},
there is a corresponding Java type, @code{builtin.Comparison_result_0}.
Java code can refer to the data constructors of this type,
@samp{(<)}, @samp{(=)} and @samp{(>)},
as @code{builtin.COMPARE_LESS}, @code{builtin.COMPARE_EQUAL}
and @code{builtin.COMPARE_GREATER} respectively.

Mercury variables of a type
for which there is a Java @samp{pragma foreign_type} declaration
(@pxref{Using foreign types from Mercury})
will be passed as the corresponding Java type.

Mercury tuple types are passed as @code{java.lang.Object[]}
where the length of the array is the number of elements in the tuple.

Mercury variables whose types are universally quantified type variables
will have generic types.
Mercury variables whose types are existentially quantified type variables
will be passed as @code{java.lang.Object}.

Mercury variables whose type is a Mercury discriminated union type
will be passed as a Java type
whose type name is determined from the Mercury type name
(ignoring any type parameters) followed by an underscore
and then the type arity, expressed as a decimal integer.
The first character of the type name will have its case inverted,
and the name may be mangled to satisfy Java lexical rules.
Generics are used in the Java type for any type parameters.

@noindent
For example, the following Mercury type
corresponds to the Java class that follows
(some implementation details elided):

@example
:- type maybe(T)
    --->    yes(yes_field :: T)
    ;       no.

public static class Maybe_1<T> @{
    public static class Yes_1<T> extends Maybe_1<T> @{
        public T yes_field;
        public Yes_1(T x) @{ @dots{} @}
    @}
    public static class No_0<T> extends Maybe_1<T> @{
        public No_0() @{ @dots{} @}
    @}
@}
@end example

Java code generated by the Mercury compiler
is placed in the @samp{jmercury} package.
Mercury module qualifiers are converted into a Java class name
by concatenating the components with double underscore separators (@samp{__}).
For example, the Mercury type @samp{foo.bar.baz/1}
will be passed as the Java type @samp{jmercury.foo__bar.Baz_1}.

Mercury array types are mapped to Java array types.

Mercury variables whose type is a Mercury equivalence type will be passed
as the representation of the right hand side of the equivalence type.

This mapping is subject to change and you should try to avoid writing code
that relies heavily upon a particular representation of Mercury terms.

Mercury arguments declared with input modes
are passed by value to the corresponding Java function.
If the Mercury procedure is a function whose result has an input mode,
then the Mercury function result is appended to the list of input parameters,
so that the Mercury function result
becomes the last parameter to the corresponding Java function.

Arguments of type @samp{io.state} or @samp{store.store(_)}
are not passed or returned at all.
(The reason for this is that these types represent mutable state,
and in Java modifications to mutable state are done via side effects,
rather than argument passing.)

The handling of multiple output arguments is as follows.

If the Mercury procedure is deterministic and has no output arguments,
then the return type of the Java function is @code{void};
if it has one output argument,
then the return value of the function is that output argument.

If the Mercury procedure is deterministic and has two or more output arguments,
then the return type of the Java function is @code{void}.
At the position of each output argument,
the Java function takes a value of the type @samp{jmercury.runtime.Ref<T>}
where @samp{T} is the Java type
corresponding to the type of the output argument.
@samp{Ref} is a class with a single field @samp{val},
which is assigned the output value when the function returns.

If the Mercury procedure is semi-deterministic,
then the Java function returns a @samp{boolean}.
A @samp{true} return value denotes success and @samp{false} denotes failure.
Output arguments are handled in the same way
as multiple outputs for deterministic procedures, using the @samp{Ref} class.
On failure the values of the @samp{val} fields are undefined.
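
As a sketch of these rules,
consider the following hypothetical predicates exported to Java
(the predicate names and the exported names are invented for illustration).
The comment above each declaration shows
the shape of the Java signature that the rules above imply.

@example
% Deterministic, one output: the output becomes the return value.
%   String greeting(String name)
:- pred greeting(string::in, string::out) is det.
:- pragma foreign_export("Java", greeting(in, out), "greeting").

% Deterministic, two outputs: void, with one Ref parameter per output.
%   void split_name(String full, jmercury.runtime.Ref<String> first,
%       jmercury.runtime.Ref<String> last)
:- pred split_name(string::in, string::out, string::out) is det.
:- pragma foreign_export("Java", split_name(in, out, out), "split_name").

% Semi-deterministic: returns boolean; the output uses a Ref parameter.
%   boolean lookup(String key, jmercury.runtime.Ref<String> value)
:- pred lookup(string::in, string::out) is semidet.
:- pragma foreign_export("Java", lookup(in, out), "lookup").
@end example

After a call to @samp{lookup} that returns @samp{true},
the caller would read the result from the @samp{val} field
of the @samp{Ref} object that it passed.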

Mercury lists can be manipulated by Java code using the following methods,
which are defined by the Mercury implementation.

@example
boolean   list.is_empty(List_1<E> list)     // test if a list is empty
E         list.det_head(List_1<E> list)     // get the head of a list
List_1<E> list.det_tail(List_1<E> list)     // get the tail of a list
List_1<E> list.empty_list()                 // create an empty list
<E, F extends E> List_1<E> list.cons(F head, List_1<E> tail)
                                            // construct a list with
                                            // the given head and tail
@end example

@c -----------------------------------------------------------------------

@node Using foreign types from Mercury
@section Using foreign types from Mercury

Types defined in a foreign language can be accessed in Mercury
using a declaration of the form

@example
:- pragma foreign_type(@var{Lang}, @var{MercuryTypeName}, @var{ForeignTypeDescriptor}).
@end example

This defines @var{MercuryTypeName}
as a synonym for type @var{ForeignTypeDescriptor}
defined in the foreign language @var{Lang}.
@var{MercuryTypeName} must be the name of
either an abstract type or a discriminated union type.
In both cases,
@var{MercuryTypeName} must be declared with @samp{:- type} as usual.
The @samp{pragma foreign_type} must not have wider visibility
than the type declaration
(if the @samp{pragma foreign_type} declaration is in the interface,
the @samp{:- type} declaration must be there as well).

If @var{MercuryTypeName} names a discriminated union type,
that type cannot be the base type of any subtypes,
nor can it be a subtype itself (@pxref{Subtypes}).

@var{ForeignTypeDescriptor} defines
how the Mercury type is mapped for a particular foreign language.
Specific syntax is given in the language specific information below.

@var{MercuryTypeName} is treated as an abstract type
at all times in Mercury code.
However, if @var{MercuryTypeName}
is one of the parameters of a foreign_proc for @var{Lang},
and the @samp{pragma foreign_type} declaration is visible to the foreign_proc,
it will be passed to that foreign_proc
as specified by @var{ForeignTypeDescriptor}.

The same type may have a foreign language definition
for more than one foreign language.
The definition used in the generated code
will be the one for the foreign language
that is most appropriate for the target language of the compilation
(see the language specific information below for details).
All the foreign language definitions must have the same visibility.
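
For instance, a module might declare an abstract type in its interface
and give it one foreign definition per target language.
The following is a minimal sketch;
the type name @samp{handle} and the foreign type names
are hypothetical and chosen only for illustration.
All three @samp{pragma foreign_type} declarations
are in the implementation section, so they have the same visibility,
and none is more widely visible than the @samp{:- type} declaration.

@example
:- interface.

:- type handle.

:- implementation.

:- pragma foreign_type("C", handle, "struct handle *").
:- pragma foreign_type("C#", handle, "System.Object").
:- pragma foreign_type("Java", handle, "java.lang.Object").
@end example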

A type which has one or more foreign language definitions
may also have a Mercury definition,
which must define a discriminated union type.
The constructors for this Mercury type will only be visible in Mercury clauses
for predicates or functions with @samp{pragma foreign_proc} clauses
for all of the languages for which there are
@samp{foreign_type} declarations for the type.

You can also associate assertions about the properties of the foreign type
with the @samp{foreign_type} declaration, using the following syntax:

@example
:- pragma foreign_type(@var{Lang}, @var{MercuryTypeName}, @var{ForeignTypeDescriptor},
    [@var{ForeignTypeAssertion}, @dots{}]).
@end example

Currently, three kinds of assertions are supported:
@samp{can_pass_as_mercury_type}, @samp{word_aligned_pointer}
and @samp{stable}.

The @samp{can_pass_as_mercury_type} assertion
states that on the C backends, values of the given type
can be passed to and from Mercury code without boxing,
via simple casts, which is faster.
This requires the type to be either an integer type or a pointer type,
and requires it to be castable to @samp{MR_Word} and back
without loss of information
(which means that its size may not be greater than the size of @samp{MR_Word}).

The @samp{word_aligned_pointer} assertion implies
@samp{can_pass_as_mercury_type} and additionally states that values of the
given type are pointer values whose tag bits are clear.
It allows the Mercury implementation to avoid boxing values of the given type
when the type appears as the sole argument of a data constructor.

The @samp{stable} assertion is meaningful
only in the presence of the @samp{can_pass_as_mercury_type}
or @samp{word_aligned_pointer} assertions.
It states that either the C type is an integer type,
or it is a pointer type pointing to memory that will never change.
Together, these assertions are sufficient to allow
tabling (@pxref{Tabled evaluation})
and the @samp{compare_representation} primitive
to work on values of such types.

Violations of any of these assertions are very likely to result
in the generated executable silently doing the wrong thing,
giving no clue to where the problem might be.
Since deciding whether a C type satisfies the conditions of these assertions
requires knowledge of the internals of the Mercury implementation,
we do not recommend the use of any of these assertions
unless you are confident of your expertise in those internals.
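
With that caveat in mind,
the following sketch shows what such a declaration looks like.
The type name @samp{file_descriptor} is hypothetical,
and the assertions are justified only under the assumption
that the C type @samp{int} is no larger than @samp{MR_Word}
on the platform in question.

@example
:- type file_descriptor.

% "int" is an integer type, so if it fits in an MR_Word
% it satisfies the conditions of both assertions.
:- pragma foreign_type("C", file_descriptor, "int",
    [can_pass_as_mercury_type, stable]).
@end example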

As with discriminated union types,
programmers can specify the unification @w{and/or} comparison predicates
to use for values of the type using the following syntax
(@pxref{User-defined equality and comparison}):

@example
:- pragma foreign_type(@var{Lang}, @var{MercuryTypeName}, @var{ForeignTypeDescriptor})
        where equality is @var{EqualityPred}, comparison is @var{ComparePred}.
@end example
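
As an illustration, the following sketch attaches
user-defined equality and comparison predicates to a foreign type.
The names @samp{handle}, @samp{handle_equal} and @samp{handle_compare}
are hypothetical, and the definitions of the two predicates
(which would typically be given using @samp{pragma foreign_proc})
are omitted.

@example
:- type handle.
:- pragma foreign_type("C", handle, "struct handle *")
        where equality is handle_equal, comparison is handle_compare.

:- pred handle_equal(handle::in, handle::in) is semidet.
:- pred handle_compare(comparison_result::uo, handle::in, handle::in) is det.
@end example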

You can use Mercury foreign language interfacing declarations
which specify language @var{X} to interface to types
that are actually written in a different language @var{Y},
provided that @var{X} and @var{Y} have compatible interface conventions.
Support for this kind of compatibility
is described in the language specific information below.

@c -----------------------------------------------------------------------

@node Using foreign enumerations in Mercury code
@section Using foreign enumerations in Mercury code

While a @samp{pragma foreign_type} declaration
imports a foreign @emph{type} into Mercury,
a @samp{pragma foreign_enum} declaration
imports @emph{the values of the constants of an enumeration type} into Mercury.

While languages such as C have special syntax for defining enumeration types,
in Mercury, an enumeration type is simply an ordinary discriminated union type
whose function symbols all have arity zero.

Given an enumeration type such as
@example
:- type unix_file_permissions
    --->    user_read
    ;       user_write
    ;       user_executable
    ;       group_read
    ;       group_write
    ;       group_executable
    ;       other_read
    ;       other_write
    ;       other_executable.
@end example

the values used to represent each constant
are usually decided by the Mercury compiler.
However, the values assigned this way
may not match the values expected by foreign language code
that uses values of the enumeration,
and even if they happen to match,
programmers probably would not want to @emph{rely} on this coincidence.

This is why Mercury supports a mechanism that allows programmers
to specify the representation of each constant in an enumeration type
when generating code for a given target language.
This mechanism is the @samp{pragma foreign_enum} declaration,
which looks like this:

@example
@group
:- pragma foreign_enum("C", unix_file_permissions/0,
[
    user_read        - "S_IRUSR",
    user_write       - "S_IWUSR",
    user_executable  - "S_IXUSR",
    group_read       - "S_IRGRP",
    group_write      - "S_IWGRP",
    group_executable - "S_IXGRP",
    other_read       - "S_IROTH",
    other_write      - "S_IWOTH",
    other_executable - "S_IXOTH"
]).
@end group
@end example

(Unix systems have a standard header file
that defines each of @samp{S_IRUSR}, @dots{}, @samp{S_IXOTH}
as macros that each expand to an integer constant;
these constants happen @emph{not} to be the ones
that the Mercury compiler would assign to those constants.)

The general form of @samp{pragma foreign_enum} declarations is

@example
:- pragma foreign_enum("@var{Lang}", @var{MercuryType}, @var{CtorValues}).
@end example

where @var{CtorValues} is a list of pairs of the form:

@example
@group
[
    ctor_0 - "ForeignValue_0",
    ctor_1 - "ForeignValue_1",
    @dots{}
    ctor_N - "ForeignValue_N"
]
@end group
@end example

The first element of each pair
is a constant (function symbol of arity 0) of the type @var{MercuryType},
and the second is either a numeric or a symbolic name
for the integer value in the language @var{Lang}
that the programmer wants to be used to represent that constructor.

The mapping defined by this list of pairs must form a bijection,
i.e.@: the list must map distinct constructors to distinct values,
and vice versa.
The Mercury compiler is not required to check this, because it cannot;
even if two symbolic names (such as C macros) are distinct,
they may expand to the same integer in the target language.

Mercury implementations may impose
further foreign-language-specific restrictions
on the form that values used to represent enumeration constructors may take.
See the language specific information below for details.

It is an error for any given @var{MercuryType}
to be the subject of more than one @samp{pragma foreign_enum} declaration
for any given foreign language,
since that would amount to an attempt
to specify two or more (probably) conflicting representations
for each of the type's function symbols.

@c XXX we need to specify a behaviour when there are multiple supported
@c foreign languages.

A @samp{pragma foreign_enum} declaration must occur in the implementation
section of the module that defines the type @var{MercuryType}.
Because of this, the names of the constants
need not and must not be module qualified.

Note that the default comparison for types
that are the subject of a @samp{pragma foreign_enum} declaration
will be defined by the foreign values,
rather than the order of the constructors in the type declaration
(as would otherwise be the case).
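
As a small sketch of this effect,
consider the following hypothetical type,
whose constructors are assigned numeric values
in the reverse of their declaration order.

@example
:- type colour
    --->    red
    ;       green
    ;       blue.

:- pragma foreign_enum("C", colour/0,
[
    red   - "4",
    green - "2",
    blue  - "1"
]).
@end example

Without the @samp{pragma foreign_enum} declaration,
the default comparison would order @samp{red} before @samp{green}
and @samp{green} before @samp{blue}.
With it, the order follows the foreign values,
so @samp{blue} comes before @samp{green},
and @samp{green} comes before @samp{red}.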

@c -----------------------------------------------------------------------

@node Using Mercury enumerations in foreign code
@section Using Mercury enumerations in foreign code

A @samp{pragma foreign_enum} declaration
imports the values of the constants of an enumeration type into Mercury.
However, sometimes one needs the reverse:
the ability to @emph{export}
the values of the constants of an enumeration type
(whether those values were assigned by @samp{foreign_enum} pragmas or not)
from Mercury to foreign language code in
@samp{foreign_proc} and @samp{foreign_code} pragmas.
This is what @samp{pragma foreign_export_enum} declarations are for.

These pragmas have the following general form:

@example
:- pragma foreign_export_enum("@var{Lang}", @var{MercuryType},
    @var{Attributes}, @var{Overrides}).
@end example

When given such a pragma,
the compiler will define a symbolic name in language @var{Lang}
for each of the constructors of @var{MercuryType}
(which must be an enumeration type).
Each symbolic name allows code in that foreign language
to create a value corresponding to that of the constructor it represents.
(The exact mechanism used depends upon the foreign language;
see the language specific information below for further details.)

For each foreign language,
there is a default mapping between the name of a Mercury constructor
and its symbolic name in the language @var{Lang}.
This default mapping is not required to map
every valid constructor name to a valid name in language @var{Lang};
where it does not, the programmer must specify a valid symbolic name.
The programmer may also choose to map a constructor to a symbolic name
that differs from the one supplied
by the default mapping for language @var{Lang}.
@var{Overrides} is a list
whose elements are pairs of constructor names and strings.
The latter specify the name that the implementation should use
as the symbolic name in the foreign language.
@var{Overrides} has the following form:

@example
[cons_I - "symbol_I", @dots{}, cons_J - "symbol_J"]
@end example

This can be used either to provide
a valid symbolic name where the default mapping does not,
or to override a valid symbolic name generated by the default mapping.
This argument may be omitted if @var{Overrides} is empty.

The argument @var{Attributes} is a list of optional attributes.
If empty,
it may be omitted from the @samp{pragma foreign_export_enum} declaration
if the @var{Overrides} argument is also omitted.
The following attributes must be supported by all Mercury implementations.

@table @asis

@item @samp{prefix(@var{Prefix})}
Prefix each symbolic name, regardless of how it was generated,
with the string @var{Prefix}.
A @samp{pragma foreign_export_enum} declaration
may contain at most one @samp{prefix} attribute.

@item @samp{uppercase}
Convert any alphabetic characters in a Mercury constructor name to uppercase
when generating the symbolic name using the default mapping.
Symbolic names specified by the programmer using @var{Overrides}
are not affected by this attribute.
If the @samp{prefix} attribute is also specified,
then the prefix is added to the symbolic name
@emph{after} the conversion to uppercase has been performed,
i.e.@: the characters in the prefix
are not affected by the @samp{uppercase} attribute.

@end table

The implementation does not check
the validity of a symbolic name in the foreign language
until after the effects of any attributes have been applied.
This means that attributes may cause
an otherwise valid symbolic name to become invalid, or vice versa.
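
The following sketch combines both attributes with an override.
The type @samp{colour} and all the symbolic names are hypothetical.

@example
:- type colour
    --->    red
    ;       green
    ;       blue.

:- pragma foreign_export_enum("C", colour/0,
    [prefix("COLOUR_"), uppercase],
    [blue - "Azure"]).
@end example

Assuming that the default mapping for the language
leaves the names @samp{red} and @samp{green} unchanged,
the rules above would yield the symbolic names
@samp{COLOUR_RED}, @samp{COLOUR_GREEN} and @samp{COLOUR_Azure}:
the override for @samp{blue} is not converted to uppercase,
but it does receive the prefix.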

A Mercury module may contain @samp{pragma foreign_export_enum} declarations
that refer to imported types, subject to the usual visibility restrictions.

A Mercury module, or program, may contain
more than one @samp{pragma foreign_export_enum} declaration
for a given Mercury type for a given language.
This can be useful when a project is transitioning
from using one naming scheme for Mercury constants in foreign code
to another naming scheme.

It is an error if the mapping between constructors and symbolic names
in a @samp{pragma foreign_export_enum} declaration
does not form a bijection.
It is also an error
if two separate @samp{pragma foreign_export_enum} declarations
for a given foreign language, @emph{whether or not for the same type},
specify the same symbolic name,
since in that case, the Mercury compiler would generate
two conflicting definitions for that symbolic name.
However, the Mercury implementation is not required to check either condition.

A @samp{pragma foreign_export_enum} declaration
may occur only in the implementation section of a module.

@c -----------------------------------------------------------------------

@node Adding foreign declarations
@section Adding foreign declarations

Foreign language declarations
(such as type declarations, header file inclusions or macro definitions)
can be included in the Mercury source file
as part of a @samp{foreign_decl} declaration of the form

@example
:- pragma foreign_decl("@var{Lang}", @var{DeclCode}).
@end example

This declaration will have effects equivalent to
including the specified @var{DeclCode}
in an automatically generated source file
of the specified programming language,
in a place appropriate for declarations,
and linking that source file with the Mercury program
(after having compiled it with a compiler
for the specified programming language, if appropriate).

Entities declared in @samp{pragma foreign_decl} declarations
are visible in @samp{pragma foreign_code}, @samp{pragma foreign_type},
@samp{pragma foreign_proc}, and @samp{pragma foreign_enum} declarations
that specify the same foreign language and occur in the same Mercury module.

By default, the contents of @samp{pragma foreign_decl} declarations
are also visible in the same kinds of declarations in other modules
that import the module containing the @samp{pragma foreign_decl} declaration.
This is because they may be required to make sense
of types defined using @samp{pragma foreign_type}
and/or predicates defined using @samp{pragma foreign_proc}
in the containing module,
and these may be visible in other modules,
especially in the presence of intermodule optimization.

If you do not want the contents of a @samp{pragma foreign_decl} declaration
to be visible in foreign language code in other modules,
you can use the following variant of the declaration:

@example
:- pragma foreign_decl("@var{Lang}", local, @var{DeclCode}).
@end example

Note: currently only the C backend
supports this variant of the @samp{pragma foreign_decl} declaration.
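
As a sketch, a module might split its C declarations
between the two forms as follows.
The typedef and the macro are hypothetical
and serve only to show the difference in visibility.

@example
% Visible to C code in this module and, by default,
% to C code in modules that import this module.
:- pragma foreign_decl("C", "
#include <stdint.h>
typedef struct counter counter_t;
").

% Visible only to C code in this module.
:- pragma foreign_decl("C", local, "
#define COUNTER_INITIAL_VALUE 0
").
@end example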

The Melbourne Mercury implementation additionally supports the forms

@example
:- pragma foreign_decl("@var{Lang}", include_file("@var{Path}")).
:- pragma foreign_decl("@var{Lang}", local, include_file("@var{Path}")).
@end example

These have the same effects as the standard forms
except that the contents of the file referenced by @var{Path}
are included in place of the string literal in the last argument,
without further interpretation.
@var{Path} may be an absolute path to a file,
or a path to a file relative to the directory
that contains the source file of the module containing the declaration.
The interpretation of the path is platform-dependent.
If the filesystem uses a different character set or encoding
from the Mercury source file (which must be UTF-8),
the file may not be found.

@samp{mmc --make} and @samp{mmake} treat included files
as dependencies of the module.

@c -----------------------------------------------------------------------

@node Declaring Mercury exports to other modules
@section Declaring Mercury exports to other modules

The declarations for Mercury predicates or functions
exported to a foreign language using a @samp{pragma foreign_export} declaration
are visible to foreign code
in a @samp{pragma foreign_code} or @samp{pragma foreign_proc} declaration
of the same module, and also in those of any submodules.
By default, they are not visible to the foreign code
in @samp{pragma foreign_code} or @samp{pragma foreign_proc} declarations
in any other module,
but this default can be overridden (giving access to all other modules)
using a declaration of the form:

@example
:- pragma foreign_import_module("@var{Lang}", @var{ImportedModule}).
@end example

@noindent
where @var{ImportedModule} is the name of the module
containing the @samp{pragma foreign_export} declarations.

If @var{Lang} is @code{"C"}, this is equivalent to

@example
:- pragma foreign_decl("C", "#include ""@var{ImportedModule}.mh""").
@end example

@noindent
where @file{@var{ImportedModule}.mh} is the automatically generated header file
containing the C declarations for the predicates and functions exported to C.

@samp{pragma foreign_import_module} should be used
instead of the explicit @code{#include}
because @samp{pragma foreign_import_module} tells the implementation
that @file{@var{ImportedModule}.mh} must be built before the object file
for the module containing the @samp{pragma foreign_import_module} declaration.
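
The following sketch shows the two sides of such an arrangement.
The module names @samp{geometry} and @samp{client},
the function @samp{square}
and the exported name @samp{geometry_square} are all hypothetical.

@example
% In module geometry:
:- func square(int) = int.
:- pragma foreign_export("C", square(in) = out, "geometry_square").

% In module client:
:- pragma foreign_import_module("C", geometry).

:- pred square_twice(int::in, int::out) is det.
:- pragma foreign_proc("C",
    square_twice(X::in, Y::out),
    [promise_pure, may_call_mercury],
"
    Y = geometry_square(geometry_square(X));
").
@end example

The @samp{foreign_proc} is marked @samp{may_call_mercury}
because @samp{geometry_square} is implemented by Mercury code.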

@c This sentence used to be here:
@c
@c  A cycle of @samp{pragma foreign_import_module}, where the language is
@c  @samp{"C#"} or @samp{"Java"}, is not permitted.
@c
@c but seems to be obsolete now. Julien says (on m-dev, 2019 jun 20):
@c
@c  That line (minus the bit about the Java) grade was added by Peter Ross
@c  in commit e868b11d. At the time, it was with reference to the old IL
@c  backend, the foreign languages involved were IL, C# and MC++.
@c
@c  I suspect the restriction on foreign_import_module cycles may have
@c  arisen because C# and MC++ were secondary (i.e.@: non-target) foreign
@c  languages. Compilation of secondary foreign languages is handled by
@c  hoisting the code out to a separate source file and that requires
@c  the foreign_import_module graph to be acyclic.
@c
@c  None of that line is applicable to C# and Java when they are the target
@c  language.

Note that the Melbourne Mercury implementation often behaves
as if @samp{pragma foreign_import_module} declarations
were implicitly added to modules.
However, programmers should @emph{not} depend on this behaviour;
they should always write explicit
@samp{pragma foreign_import_module} declarations wherever they are needed.

@c -----------------------------------------------------------------------

@node Adding foreign definitions
@section Adding foreign definitions

Definitions of foreign language entities
(such as functions or global variables)
may be included using a declaration of the form

@example
:- pragma foreign_code("@var{Lang}", @var{Code}).
@end example

This declaration will have effects equivalent to
including the specified @var{Code} in an automatically generated source file
of the specified programming language,
in a place appropriate for definitions,
and linking that source file with the Mercury program
(after having compiled it with a compiler
for the specified programming language, if appropriate).

Entities declared in @samp{pragma foreign_code} declarations
are visible in @samp{pragma foreign_proc} declarations
that specify the same foreign language and occur in the same Mercury module.
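
As a minimal sketch,
the following fragment declares a C function with @samp{pragma foreign_decl},
defines it with @samp{pragma foreign_code},
and calls it from a @samp{pragma foreign_proc}.
The names @samp{counter_next} and @samp{next_id} are hypothetical,
and the fragment assumes that the module imports @samp{io}.

@example
:- pragma foreign_decl("C", "
extern int counter_next(void);
").

:- pragma foreign_code("C", "
static int counter = 0;

int counter_next(void)
@{
    return counter++;
@}
").

:- pred next_id(int::out, io::di, io::uo) is det.
:- pragma foreign_proc("C",
    next_id(Id::out, _IO0::di, _IO::uo),
    [promise_pure, will_not_call_mercury],
"
    Id = counter_next();
").
@end example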

The Melbourne Mercury implementation additionally supports the form

@example
:- pragma foreign_code("@var{Lang}", include_file("@var{Path}")).
@end example

This has the same effect as the standard form
except that the contents of the file referenced by @var{Path}
are included in place of the string literal in the last argument,
without further interpretation.
@var{Path} may be an absolute path to a file,
or a path to a file relative to the directory
that contains the source file of the module containing the declaration.
The interpretation of the path is platform-dependent.
If the filesystem uses a different character set or encoding
from the Mercury source file (which must be UTF-8),
the file may not be found.

@samp{mmc --make} and @samp{mmake} treat included files
as dependencies of the module.

@c -----------------------------------------------------------------------

@node Language specific bindings
@section Language specific bindings

@c Please keep this menu in alphabetical order

@menu
* Interfacing with C            :: How to write code to interface with C
* Interfacing with C#           :: How to write code to interface with C#
* Interfacing with Java         :: How to write code to interface with Java
@end menu

All Mercury implementations should support interfacing with C.
The set of other languages supported is implementation-defined.
A suitable compiler or assembler for the foreign language
must be available on the system.

The Melbourne Mercury implementation supports
interfacing with the following languages:

@table @asis

@c Please keep this table in alphabetical order

@item @samp{C}
Use the string @code{"C"} to set the foreign language to C.

@item @samp{C#}
Use the string @code{"C#"} to set the foreign language to C#.

@item @samp{Java}
Use the string @code{"Java"} to set the foreign language to Java.

@end table

@c -----------------------------------------------------------------------

@node Interfacing with C
@subsection Interfacing with C

@menu
* Using pragma foreign_type for C        :: Declaring C types in Mercury
* Using pragma foreign_enum for C        :: Assigning Mercury enumerations
                                            values in C
* Using pragma foreign_export_enum for C :: Using Mercury enumerations in C
* Using pragma foreign_proc for C        :: Calling C code from Mercury
* Using pragma foreign_export for C      :: Calling Mercury code from C
* Using pragma foreign_decl for C        :: Including C declarations in Mercury
* Using pragma foreign_code for C        :: Including C code in Mercury
* Memory management for C                :: Caveats about passing dynamically
                                            allocated memory to or from C.
* Linking with C object files            :: Linking with C object files and
                                            libraries.

@end menu

@node Using pragma foreign_type for C
@subsubsection Using pragma foreign_type for C

A C @samp{pragma foreign_type} declaration has the form:

@example
:- pragma foreign_type("C", @var{MercuryTypeName}, "@var{CForeignType}").
@end example

For example,

@example
:- pragma foreign_type("C", long_double, "long double").
@end example

The @var{CForeignType} can be any C type name
that obeys the following restrictions.
Function types, array types, and incomplete types are not allowed.
The type name must be such that when declaring a variable in C of that type,
no part of the type name is required after the variable name.
(This rule prohibits, for example,
function pointer types such as @samp{void (*)(void)};
however, it would be OK to use a typedef name
which was defined as a function pointer type.)

C preprocessor directives (such as @samp{#if})
may not be used in @var{CForeignType}.
(You can however use a typedef name that refers to a type
defined in a @samp{pragma foreign_decl} declaration,
and the @samp{pragma foreign_decl} declaration
may contain C preprocessor directives.)
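
For example, the following sketch uses a typedef name
to make a function pointer type usable as a foreign type.
The names @samp{callback} and @samp{callback_fn} are hypothetical.

@example
:- pragma foreign_decl("C", "
typedef void (*callback_fn)(void);
").

:- type callback.
:- pragma foreign_type("C", callback, "callback_fn").
@end example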

@c @strong{With @samp{--gc accurate}, foreign_types which are C pointer types
@c must not point to the Mercury heap.}

If the @var{MercuryTypeName} is the type of a parameter of a procedure
defined using @samp{pragma foreign_proc},
it will be passed to the foreign_proc's foreign language code
as @var{CForeignType}.

Furthermore, any Mercury procedure exported with @samp{pragma foreign_export}
will use @var{CForeignType} as the type
for any parameters whose Mercury type is @var{MercuryTypeName}.

The builtin Mercury type @code{c_pointer} may be used
to pass C pointers between C functions which are called from Mercury.
For example:

@example
:- module pointer_example.
:- interface.

:- type complicated_c_structure.

% Initialise the abstract C structure that we pass around in Mercury.
:- pred initialise_complicated_structure(complicated_c_structure::uo) is det.

% Perform a calculation on the C structure.
:- pred do_calculation(int::in, complicated_c_structure::di,
        complicated_c_structure::uo) is det.

:- implementation.

% Our C structure is implemented as a c_pointer.
:- type complicated_c_structure
    --->    complicated_c_structure(c_pointer).

:- pragma foreign_decl("C", "
    extern struct foo *init_struct(void);
    extern struct foo *perform_calculation(int, struct foo *);
").

:- pragma foreign_proc("C",
    initialise_complicated_structure(Structure::uo),
    [promise_pure, will_not_call_mercury],
"
    Structure = init_struct();
").

:- pragma foreign_proc("C",
    do_calculation(Value::in, Structure0::di, Structure::uo),
    [promise_pure, will_not_call_mercury],
"
    Structure = perform_calculation(Value, Structure0);
").
@end example

We strongly recommend the use of @samp{pragma foreign_type}
instead of @code{c_pointer}
as the use of @samp{pragma foreign_type} results in more type-safe code.

@node Using pragma foreign_enum for C
@subsubsection Using pragma foreign_enum for C

Foreign enumeration values in C must be constants of type @code{MR_Integer}.
A foreign enumeration value may be specified by one of the following:
@itemize
@item An integer literal.
@item An enumeration constant.
@item A preprocessor macro that expands to either an integer literal or
an enumeration constant.
@end itemize
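
The three forms can be sketched as follows.
The C names @code{colour_code}, @code{COLOUR_RED}, @code{COLOUR_GREEN}
and @code{COLOUR_BLUE}, and the Mercury type @code{colour},
are hypothetical and exist only for this illustration.

@example
:- pragma foreign_decl("C", "
    enum colour_code @{ COLOUR_RED = 1, COLOUR_GREEN = 2 @};
    #define COLOUR_BLUE 4
").

:- type colour
    --->    red
    ;       green
    ;       blue
    ;       black.

:- pragma foreign_enum("C", colour/0, [
    red     - "COLOUR_RED",     % an enumeration constant
    green   - "COLOUR_GREEN",   % an enumeration constant
    blue    - "COLOUR_BLUE",    % a macro expanding to an integer literal
    black   - "0"               % an integer literal
]).
@end example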

@node Using pragma foreign_export_enum for C
@subsubsection Using pragma foreign_export_enum for C

For C the symbolic names generated by a @samp{pragma foreign_export_enum}
must form valid C identifiers.
These identifiers are used as the names of preprocessor macros.
The body of each of these macros expands to a value
that is identical to that of the constructor
to which the symbolic name corresponds
in the mapping established
by the @w{@samp{pragma foreign_export_enum}} declaration.

As noted in the @ref{C data passing conventions},
the type of these values is @code{MR_Word}.

The default mapping
used by @samp{pragma foreign_export_enum} declarations for C
is to use the Mercury constructor name as the base of the symbolic name.
For example, the symbolic name for the Mercury constructor @samp{foo}
would be @code{foo}.
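
The following sketch shows one way the generated macros might be used.
It assumes the @samp{prefix} and @samp{uppercase} attributes
of @samp{pragma foreign_export_enum},
and assumes that the generated macros are visible
to C code in the same module;
the type @code{colour} and the function @code{default_colour}
are hypothetical.

@example
:- type colour
    --->    red
    ;       green.

% With these attributes the symbolic names would be
% COLOUR_RED and COLOUR_GREEN rather than red and green.
:- pragma foreign_export_enum("C", colour/0,
    [prefix("COLOUR_"), uppercase]).

:- func default_colour = colour.
:- pragma foreign_proc("C",
    default_colour = (C::out),
    [will_not_call_mercury, promise_pure],
"
    C = COLOUR_GREEN;
").
@end example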

@c It would be useful if there were some documented way of mapping
@c these things into [0, N - 1], e.g.@: for array lookups.

@node Using pragma foreign_proc for C
@subsubsection Using pragma foreign_proc for C

The input and output variables
will have C types corresponding to their Mercury types,
as determined by the rules specified in @ref{C data passing conventions}.

The C code fragment may declare local variables,
up to a total size of 10kB for the procedure.
@c The relevant parameter is LOCALS_SIZE, defined in runtime/mercury_engine.c.
If a procedure requires more than this for its local variables,
the code can be moved into a separate function
(defined in a @samp{pragma foreign_code} declaration, for example).

The C code fragment should not declare any labels or static variables
unless there is also a @samp{pragma no_inline} declaration
or a @samp{may_not_duplicate} foreign code attribute for the procedure.
The reason for this is that otherwise
the Mercury implementation may inline the procedure
by duplicating the C code fragment for each call.
If the C code fragment declared a static variable,
inlining it in this way could result
in the program having multiple instances of the static variable,
rather than a single shared instance.
If the C code fragment declared a label,
inlining it in this way could result in an error
due to the same label being defined twice inside a single C function.
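
As an illustration, the following hypothetical predicate
keeps a counter in a static variable.
The @samp{may_not_duplicate} attribute is what makes this acceptable,
since it asks the implementation
not to copy the C code fragment into each call site.
The predicate name @code{next_id} is invented for this sketch.

@example
:- impure pred next_id(int::out) is det.
:- pragma foreign_proc("C",
    next_id(Id::out),
    [will_not_call_mercury, may_not_duplicate],
"
    /* A single shared instance, because the fragment */
    /* is not duplicated by inlining.                 */
    static MR_Integer counter = 0;

    Id = counter;
    counter++;
").
@end example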

C code in a @code{pragma foreign_proc} declaration
for any procedure whose determinism indicates that it can fail
must assign a truth value to the macro @code{SUCCESS_INDICATOR}.
For example:

@example
:- pred string.contains_char(string, character).
:- mode string.contains_char(in, in) is semidet.

:- pragma foreign_proc("C",
    string.contains_char(Str::in, Ch::in),
    [will_not_call_mercury, promise_pure],
"
    SUCCESS_INDICATOR = (strchr(Str, Ch) != NULL);
").
@end example

@code{SUCCESS_INDICATOR} should not be used
other than as the target of an assignment.
(For example, it may be @code{#define}d to a register,
so you should not try to take its address.)
Procedures whose determinism indicates that they cannot fail
should not access @code{SUCCESS_INDICATOR}.

Arguments whose mode is input
will have their values set by the Mercury implementation
on entry to the C code.
If the procedure succeeds,
the C code must set the values of all output arguments.
If the procedure fails,
the C code need only set @code{SUCCESS_INDICATOR} to false (zero).
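
For example, a hypothetical semidet predicate with an output argument
might be sketched as follows.
The output is assigned only on the path that succeeds.

@example
:- pred half(int::in, int::out) is semidet.
:- pragma foreign_proc("C",
    half(N::in, H::out),
    [will_not_call_mercury, promise_pure],
"
    if (N % 2 == 0) @{
        H = N / 2;              /* output set on success */
        SUCCESS_INDICATOR = 1;
    @} else @{
        SUCCESS_INDICATOR = 0;  /* H need not be set on failure */
    @}
").
@end example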

The behaviour of a procedure
defined using a @samp{pragma foreign_proc} declaration
whose body contains a @code{return} statement is undefined.

@node Using pragma foreign_export for C
@subsubsection Using pragma foreign_export for C

A @samp{pragma foreign_export} declaration for C has the form:

@example
:- pragma foreign_export("C", @var{MercuryMode}, "@var{C_Name}").
@end example

For example,

@example
:- pragma foreign_export("C", foo(in, in, out), "FOO").
@end example

For each Mercury module
containing @samp{pragma foreign_export} declarations for C,
the Mercury implementation
will automatically create a header file for that module
which declares a C function @var{C_Name}()
for each of the @samp{pragma foreign_export} declarations.
Each such C function is the C interface to the specified Mercury procedure.

The type signature of the C interface to a Mercury procedure
is determined as follows.
Mercury types are converted to C types
according to the rules in @ref{C data passing conventions}.
Input arguments are passed by value.
For output arguments,
the caller must pass the address in which to store the result.
If the Mercury procedure can fail,
then its C interface function returns a truth value
indicating success or failure.
If the Mercury procedure is a Mercury function that cannot fail,
and the function result has an output mode,
then the C interface function will return the Mercury function result value.
Otherwise the function result is appended as an extra argument.
@c XXX We need to update this for dummy unit types.
Arguments of type @samp{io.state} or @samp{store.store(_)}
are not passed at all.
(The reason for this is that these types represent mutable state,
and in C modifications to mutable state are done via side effects,
rather than argument passing.)
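
The following sketch applies these rules to three hypothetical procedures.
The C prototypes shown in the comments
are only the approximate shape one would expect from the rules above;
the exact spelling of the C types
(@code{MR_bool}, @code{MR_Integer}, @code{MR_String}) is an assumption
and should be checked against the generated header file.

@example
:- pred lookup(int::in, int::out) is semidet.
:- pragma foreign_export("C", lookup(in, out), "lookup_c").
% Can fail, so the C function returns a truth value:
%   MR_bool lookup_c(MR_Integer, MR_Integer *);

:- func square(int) = int.
:- pragma foreign_export("C", square(in) = out, "square_c").
% A function that cannot fail returns its result directly:
%   MR_Integer square_c(MR_Integer);

:- pred log_line(string::in, io.state::di, io.state::uo) is det.
:- pragma foreign_export("C", log_line(in, di, uo), "log_line_c").
% The io.state arguments are not passed at all:
%   void log_line_c(MR_String);
@end example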

Calling polymorphically typed Mercury procedures from C
is a little bit more difficult
than calling ordinary (monomorphically typed) Mercury procedures.
The simplest method is to just create monomorphic forwarding procedures
that call the polymorphic procedures, and export them,
rather than exporting the polymorphic procedures.
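
A forwarding procedure of this kind might look like the following sketch,
which exports a version of the polymorphic predicate @code{list.length}
that is specialised to lists of integers.
The name @code{int_list_length} is hypothetical.

@example
:- import_module list.

:- pred int_list_length(list(int)::in, int::out) is det.
:- pragma foreign_export("C", int_list_length(in, out),
    "int_list_length").

int_list_length(List, Length) :-
    list.length(List, Length).
@end example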

If you do export a polymorphically typed Mercury procedure,
the compiler will prepend one @samp{type_info} argument
to the parameter list of the C interface function
for each distinct type variable in the Mercury procedure's type signature.
The caller must arrange to pass in appropriate @samp{type_info} values
corresponding to the types of the other arguments passed.
These @samp{type_info} arguments can be obtained
using the Mercury @samp{type_of} function
in the Mercury standard library module @samp{type_desc}.

To use the C declarations produced, see @ref{Using pragma foreign_decl for C}.

Throwing an exception across the C interface is not supported.
That is, if a Mercury procedure that is exported to C using
@samp{pragma foreign_export} throws an exception
which is not caught within that procedure,
then you will get undefined behaviour.

@node Using pragma foreign_decl for C
@subsubsection Using pragma foreign_decl for C

Any macros, function prototypes, or other C declarations
that are used in @samp{foreign_code}, @samp{foreign_type}
or @samp{foreign_proc} pragmas
must be included using a @samp{foreign_decl} declaration of the form

@example
:- pragma foreign_decl("C", @var{HeaderCode}).
@end example

@noindent
@var{HeaderCode} can be a C @samp{#include} line, for example

@example
:- pragma foreign_decl("C", "#include <math.h>").
@end example

@noindent
or

@example
:- pragma foreign_decl("C", "#include ""tcl.h""").
@end example

@noindent
or it may contain any C declarations, for example

@example
:- pragma foreign_decl("C", "
        extern int errno;
        #define SIZE 200
        struct Employee @{
                char name[SIZE];
        @};
        extern int bar;
        extern void foo(void);
").
@end example

Mercury automatically includes certain headers such as @code{<stdlib.h>},
but you should not rely on this,
as the set of headers which Mercury automatically includes
is subject to change.

If a Mercury predicate or function exported
using a @samp{pragma foreign_export} declaration
is to be used within a @samp{:- pragma foreign_code}
or @samp{:- pragma foreign_proc} declaration,
then the header file for the module containing
the @samp{pragma foreign_export} declaration
should be included using a @samp{pragma foreign_import_module} declaration,
for example

@example
:- pragma foreign_import_module("C", exporting_module).
@end example

@node Using pragma foreign_code for C
@subsubsection Using pragma foreign_code for C

Definitions of C functions or global variables
may be included using a declaration of the form

@example
:- pragma foreign_code("C", @var{Code}).
@end example

For example,

@example
:- pragma foreign_code("C", "
        int bar = 42;
        void foo(void) @{@}
").
@end example

Such code is copied verbatim into the generated C file.

@node Memory management for C
@subsubsection Memory management for C

Passing pointers to dynamically-allocated memory
from Mercury to code written in other languages, or vice versa,
is in general implementation-dependent.

The current Mercury implementation
supports two different methods of memory management:
conservative garbage collection, or no garbage collection.
The latter is suitable only for programs
with very short running times (less than a second),
which makes the former the standard method for almost all Mercury programs.

Conservative garbage collection makes inter-language calls simplest.
Mercury uses the Boehm-Demers-Weiser conservative garbage collector,
which we also call simply Boehm gc.
This has its own set of functions for allocating memory blocks,
such as @samp{MR_GC_NEW},
which are documented in @file{runtime/mercury_memory.h}.
Memory blocks allocated by these functions,
either in C code generated by the Mercury compiler
or in C code hand written by programmers,
are automatically reclaimed when they are no longer referred to
either from the stack, from global variables,
or from other memory blocks allocated by Boehm gc functions.
Note that these are the @emph{only} places
where Boehm gc looks for pointers to the blocks it has allocated.
If the only pointers to such a block occur in other parts of memory,
such as in memory blocks allocated by @samp{malloc},
the Boehm collector won't see them, and may collect the block prematurely.
Programmers can avoid this either
by not storing pointers to Boehm-allocated memory in malloc-allocated blocks,
or by storing them e.g.@: on the stack as well.

Boehm gc recognizes pointers to the blocks it has allocated
only if they point either to the start of the block,
or to a byte in the first word of the block;
pointers into the middle of a block beyond the first word
won't keep the block alive.

Pointers to Boehm-allocated memory blocks
can be passed freely between Mercury and C code
provided these restrictions are observed.
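
As a sketch, hand-written C code might allocate a block with Boehm gc
and hand the pointer to Mercury as follows.
It assumes that @samp{MR_GC_NEW} takes a C type name
and returns a pointer to a newly allocated block of that type;
the names @code{point_struct}, @code{point} and @code{make_point}
are hypothetical.

@example
:- pragma foreign_decl("C", "
    typedef struct @{
        MR_Integer x;
        MR_Integer y;
    @} point_struct;
").

:- type point.
:- pragma foreign_type("C", point, "point_struct *").

:- func make_point(int, int) = point.
:- pragma foreign_proc("C",
    make_point(X::in, Y::in) = (P::out),
    [will_not_call_mercury, promise_pure],
"
    /* P points to the start of the block, */
    /* so the collector will recognise it. */
    P = MR_GC_NEW(point_struct);
    P->x = X;
    P->y = Y;
").
@end example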

Note that the Boehm collector cannot and does not recover
memory allocated by other methods, such as @samp{malloc}.

When using no garbage collection,
heap storage is reclaimed only on backtracking.
This requires programmers to be careful
not to retain pointers to memory on the Mercury heap
after Mercury has backtracked
to before the point where that memory was allocated.
They must also avoid the use of the macros
@code{MR_list_empty()} and @code{MR_list_cons()}.
(The reason for this is that they may access Mercury's @samp{MR_hp} register,
which might not be valid in C code.
Using them in the bodies of procedures
defined using @samp{pragma foreign_proc} with
@samp{will_not_call_mercury} would probably work,
but we don't advise it.)
Instead, you can write Mercury functions to perform these actions
and use @samp{pragma foreign_export} to access them from C.
This alternative method also works with conservative garbage collection.

Future Mercury implementations
may use non-conservative methods of garbage collection.
For such implementations,
it will be necessary to explicitly register pointers passed to C
with the garbage collector.
The mechanism for doing this has not yet been decided on.
It would be desirable to provide a single memory management interface
for use when interfacing with other languages
that can work for all methods of memory management,
but more implementation experience is needed
before we can formulate such an interface.

@node Linking with C object files
@subsubsection Linking with C object files

A Mercury implementation should allow you
to link with object files or libraries that were produced by compiling C code.
The exact mechanism for linking with C object files
is implementation-dependent.
The following text describes how it is done
for the Melbourne Mercury implementation.

To link an existing object file or archive of object files
into your Mercury code,
use the command line option @samp{--link-object}.
For example, the following will link the object file @file{my_functions.o}
from the current directory when compiling the program @samp{prog}:

@example
mmc --link-object my_functions.o prog
@end example

The command line option @samp{--library} (or @samp{-l} for short)
can be used to link an existing library into your Mercury code.
For example, the following will link the library file @file{libfancy_library.a},
or perhaps the shared version @file{libfancy_library.so},
from the directory @file{/usr/local/contrib/lib},
when compiling the program @samp{prog}:

@example
mmc -R/usr/local/contrib/lib -L/usr/local/contrib/lib -lfancy_library prog
@end example

As illustrated by the example,
the command line options @samp{-R}, @samp{-L} and @samp{-l}
have the same meaning as they do with the Unix linker.

For more information,
see the ``Libraries'' chapter of the Mercury User's Guide.

@c ----------------------------------------------------------------------------

@node Interfacing with C#
@subsection Interfacing with C#

@menu
* Using pragma foreign_type for C#      :: Declaring C# types in Mercury
* Using pragma foreign_enum for C#      :: Assigning Mercury enumeration
                                           values in C#
* Using pragma foreign_export_enum for C# :: Using Mercury enumerations in C#
* Using pragma foreign_proc for C#      :: Calling C# code from Mercury
* Using pragma foreign_export for C#    :: Calling Mercury code from C#
* Using pragma foreign_decl for C#      :: Including C# declarations in Mercury
* Using pragma foreign_code for C#      :: Including C# code in Mercury
@end menu

@node Using pragma foreign_type for C#
@subsubsection Using pragma foreign_type for C#

A C# @samp{pragma foreign_type} declaration has the form:

@example
:- pragma foreign_type("C#", @var{MercuryTypeName}, "@var{C#-Type}").
@end example

The @var{C#-Type} can be any accessible C# type.

The effect of this declaration is that
Mercury values of type @var{MercuryTypeName}
will be passed to and from C# foreign_procs as having type @var{C#-Type}.

Furthermore, any Mercury procedure exported with @samp{pragma foreign_export}
will use @var{C#-Type} as the type
for any parameters whose Mercury type is @var{MercuryTypeName}.

@node Using pragma foreign_enum for C#
@subsubsection Using pragma foreign_enum for C#

Each foreign enumeration value in C# must be a constant value expression
which is a valid initializer
within an enumeration of underlying type @code{int}.

@node Using pragma foreign_export_enum for C#
@subsubsection Using pragma foreign_export_enum for C#

For C# the symbolic names generated by a @samp{pragma foreign_export_enum}
must form valid C# identifiers.
These identifiers are used as the names of static class members.

The default mapping
used by @samp{pragma foreign_export_enum} declarations for C#
is to use the Mercury constructor name as the base of the symbolic name.
For example, the symbolic name for the Mercury constructor @samp{foo}
would be @code{foo}.

@node Using pragma foreign_proc for C#
@subsubsection Using pragma foreign_proc for C#

The C# code from C# @samp{pragma foreign_proc} declarations
will be placed in the bodies of static member functions
of an automatically generated C# class.
Since such C# code will become part of a static member function,
it must not refer to the @code{this} keyword.
It may however refer to static member variables or static member functions
declared with @samp{pragma foreign_code}.

The input and output variables for a C# @samp{pragma foreign_proc}
will have C# types corresponding to their Mercury types.
The exact rules for mapping Mercury types to C# types
are described in @ref{C# data passing conventions}.

C# code in a @code{pragma foreign_proc} declaration
for any procedure whose determinism indicates that it can fail
must assign a value of type @code{bool}
to the variable @code{SUCCESS_INDICATOR}.
For example:

@example
:- pred string.contains_char(string, character).
:- mode string.contains_char(in, in) is semidet.

:- pragma foreign_proc("C#",
    string.contains_char(Str::in, Ch::in),
    [will_not_call_mercury, promise_pure],
"
    SUCCESS_INDICATOR = (Str.IndexOf(Ch) != -1);
").
@end example

@noindent
C# code for procedures whose determinism indicates that they cannot fail
should not access @code{SUCCESS_INDICATOR}.

Arguments whose mode is input
will have their values set by the Mercury implementation
on entry to the C# code.
If the procedure succeeds,
the C# code must set the values of all output arguments.
If the procedure fails,
the C# code need only set @code{SUCCESS_INDICATOR} to false.

@node Using pragma foreign_export for C#
@subsubsection Using pragma foreign_export for C#

A @samp{pragma foreign_export} declaration for C# has the form:

@example
:- pragma foreign_export("C#", @var{MercuryMode}, "@var{C#_Name}").
@end example

For example,

@example
:- pragma foreign_export("C#", foo(in, in, out), "FOO").
@end example

The type signature of the C# interface to a Mercury procedure
is as described in @ref{C# data passing conventions}.

Calling polymorphically typed Mercury procedures from C#
is a little bit more difficult
than calling ordinary (monomorphically typed) Mercury procedures.
The simplest method is to just create monomorphic forwarding procedures
that call the polymorphic procedures, and export them,
rather than exporting the polymorphic procedures.

If you do export a polymorphically typed Mercury procedure,
the compiler will prepend one @samp{type_info} argument
to the parameter list of the C# interface function
for each distinct type variable in the Mercury procedure's type signature.
The caller must arrange to pass in appropriate @samp{type_info} values
corresponding to the types of the other arguments passed.
These @samp{type_info} arguments can be obtained
using the Mercury @samp{type_of} function
in the Mercury standard library module @samp{type_desc}.

@node Using pragma foreign_decl for C#
@subsubsection Using pragma foreign_decl for C#

@samp{pragma foreign_decl} declarations for C#
can be used to provide any top-level C# declarations
(e.g.@: @samp{using} declarations or auxiliary class definitions)
which are needed by C# code in @samp{pragma foreign_proc} declarations
in that module.

For example:

@example
:- pragma foreign_decl("C#", "
        using System;
").
:- pred hello(io.state::di, io.state::uo) is det.
:- pragma foreign_proc("C#",
     hello(_IO0::di, _IO::uo),
     [will_not_call_mercury, promise_pure],
"
    // here we can refer directly to Console rather than System.Console
    Console.WriteLine(""hello world"");
").
@end example

@node Using pragma foreign_code for C#
@subsubsection Using pragma foreign_code for C#

The C# code from @samp{pragma foreign_proc} declarations for C#
will be placed in the bodies of static member functions
of an automatically generated C# class.
@samp{pragma foreign_code} can be used to define
additional members of this automatically generated class,
which can then be referenced by @samp{pragma foreign_proc} declarations
for C# from that module.

For example:

@example
:- pragma foreign_code("C#", "
    static int counter = 0;
").

:- impure pred incr_counter is det.
:- pragma foreign_proc("C#",
    incr_counter,
    [will_not_call_mercury], "
    counter++;
").

:- semipure func get_counter = int.
:- pragma foreign_proc("C#",
    get_counter = (Result::out),
    [will_not_call_mercury, promise_semipure],
"
    Result = counter;
").
@end example

@c ----------------------------------------------------------------------------

@node Interfacing with Java
@subsection Interfacing with Java

@menu
* Using pragma foreign_type for Java :: Declaring Java types in Mercury
* Using pragma foreign_enum for Java :: Assigning Mercury enumeration
                                        values in Java
* Using pragma foreign_export_enum for Java :: Using Mercury enumerations in
                                               Java
* Using pragma foreign_proc for Java :: Calling Java code from Mercury
* Using pragma foreign_export for Java :: Calling Mercury from Java code
* Using pragma foreign_decl for Java :: Including Java declarations in Mercury
* Using pragma foreign_code for Java :: Including Java code in Mercury
@end menu

@node Using pragma foreign_type for Java
@subsubsection Using pragma foreign_type for Java

A Java @samp{pragma foreign_type} declaration has the form:

@example
:- pragma foreign_type("Java", @var{MercuryTypeName}, "@var{JavaType}").
@end example

The @var{JavaType} can be any accessible Java type.

The effect of this declaration is that
Mercury values of type @var{MercuryTypeName}
will be passed to and from Java foreign_procs as having type @var{JavaType}.

Furthermore, any Mercury procedure exported with @samp{pragma foreign_export}
will use @var{JavaType} as the type
for any parameters whose Mercury type is @var{MercuryTypeName}.
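
For example, the following sketch maps a hypothetical Mercury type
@code{date} to the Java class @code{java.util.Date}
and uses it in a foreign_proc.
The predicate name @code{date_before} is invented for illustration.

@example
:- type date.
:- pragma foreign_type("Java", date, "java.util.Date").

:- pred date_before(date::in, date::in) is semidet.
:- pragma foreign_proc("Java",
    date_before(A::in, B::in),
    [will_not_call_mercury, promise_pure],
"
    // A and B have the Java type java.util.Date here.
    SUCCESS_INDICATOR = A.before(B);
").
@end example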

@node Using pragma foreign_enum for Java
@subsubsection Using pragma foreign_enum for Java

@samp{pragma foreign_enum} is currently not supported for Java.

@node Using pragma foreign_export_enum for Java
@subsubsection Using pragma foreign_export_enum for Java

For Java the symbolic names generated by a @samp{pragma foreign_export_enum}
must form valid Java identifiers.
These identifiers are used as the names of static class members
which are assigned instances of the enumeration class.

The @code{equals} method should be used
for equality testing of enumeration values in Java code.

The default mapping
used by @samp{pragma foreign_export_enum} declarations for Java
is to use the Mercury constructor name as the base of the symbolic name.
For example, the symbolic name for the Mercury constructor @samp{foo}
would be @code{foo}.

@node Using pragma foreign_proc for Java
@subsubsection Using pragma foreign_proc for Java

The Java code from Java @samp{pragma foreign_proc} declarations
will be placed in the bodies of static member functions
of an automatically generated Java class.
Since such Java code will become part of a static member function,
it must not refer to the @code{this} keyword.
It may however refer to static member variables or static member functions
declared with @samp{pragma foreign_code}.

The input and output variables for a Java @samp{pragma foreign_proc}
will have Java types corresponding to their Mercury types.
The exact rules for mapping Mercury types to Java types
are described in @ref{Java data passing conventions}.

The Java code in a @code{pragma foreign_proc} declaration
for a procedure whose determinism indicates that it can fail
must assign a value of type @code{boolean}
to the variable @code{SUCCESS_INDICATOR}.
For example:

@example
:- pred string.contains_char(string, character).
:- mode string.contains_char(in, in) is semidet.

:- pragma foreign_proc("Java",
    string.contains_char(Str::in, Ch::in),
    [will_not_call_mercury, promise_pure],
"
    SUCCESS_INDICATOR = (Str.indexOf(Ch) != -1);
").
@end example

@noindent
Java code for procedures whose determinism indicates that they cannot fail
should not refer to the @code{SUCCESS_INDICATOR} variable.

Arguments whose mode is input
will have their values set by the Mercury implementation
on entry to the Java code.
With our current implementation,
the Java code must set the values of all output variables,
even if the procedure fails
(i.e.@: sets the @code{SUCCESS_INDICATOR} variable to @code{false}).
@c If the procedure
@c succeeds, the Java code must set the values of all output arguments
@c If the procedure fails, the Java code need only
@c set the @code{SUCCESS_INDICATOR} variable to false.

@node Using pragma foreign_export for Java
@subsubsection Using pragma foreign_export for Java

A @samp{pragma foreign_export} declaration for Java has the form:

@example
:- pragma foreign_export("Java", @var{MercuryMode}, "@var{Java_Name}").
@end example

For example,

@example
:- pragma foreign_export("Java", foo(in, in, out), "FOO").
@end example

The type signature of the Java interface to a Mercury procedure
is as described in @ref{Java data passing conventions}.

Calling polymorphically typed Mercury procedures from Java
is a little bit more difficult
than calling ordinary (monomorphically typed) Mercury procedures.
The simplest method is to just create monomorphic forwarding procedures
that call the polymorphic procedures, and export them,
rather than exporting the polymorphic procedures.

If you do export a polymorphically typed Mercury procedure,
the compiler will prepend one @samp{type_info} argument
to the parameter list of the Java interface function
for each distinct type variable in the Mercury procedure's type signature.
The caller must arrange to pass in appropriate @samp{type_info} values
corresponding to the types of the other arguments passed.
These @samp{type_info} arguments can be obtained
using the Mercury @samp{type_of} function
in the Mercury standard library module @samp{type_desc}.

@node Using pragma foreign_decl for Java
@subsubsection Using pragma foreign_decl for Java

@samp{pragma foreign_decl} declarations for Java
can be used to provide any top-level Java declarations
(e.g.@: @samp{import} declarations or auxiliary class definitions)
which are needed by Java code in @samp{pragma foreign_proc} declarations
in that module.

For example:

@example
:- pragma foreign_decl("Java", "
import javax.swing.*;
import java.awt.*;

class MyApplet extends JApplet @{
    public void init() @{
        JLabel label = new JLabel(""Hello, world"");
        label.setHorizontalAlignment(JLabel.CENTER);
        getContentPane().add(label);
    @}
@}
").
:- pred hello(io.state::di, io.state::uo) is det.
:- pragma foreign_proc("Java",
    hello(_IO0::di, _IO::uo),
    [will_not_call_mercury, promise_pure],
"
    MyApplet app = new MyApplet();
    // @dots{}
").
@end example

@node Using pragma foreign_code for Java
@subsubsection Using pragma foreign_code for Java

The Java code from @samp{pragma foreign_proc} declarations for Java
will be placed in the bodies of static member functions
of an automatically generated Java class.
@samp{pragma foreign_code} can be used to define additional members
of this automatically generated class,
which can then be referenced by @samp{pragma foreign_proc} declarations
for Java from that module.

For example:

@example
:- pragma foreign_code("Java", "
    static int counter = 0;
").

:- impure pred incr_counter is det.
:- pragma foreign_proc("Java",
    incr_counter,
    [will_not_call_mercury],
"
    counter++;
").

:- semipure func get_counter = int.
:- pragma foreign_proc("Java",
    get_counter = (Result::out),
    [will_not_call_mercury, promise_semipure],
"
    Result = counter;
").
@end example

@c ----------------------------------------------------------------------------

@node Impurity
@chapter Impurity declarations

In order to efficiently implement certain predicates,
it is occasionally necessary to venture outside pure logic programming.
Other predicates cannot be implemented at all
within the paradigm of logic programming,
for example, all solutions predicates.
Such predicates are often written using the foreign language interface.
Sometimes, however, it would be more convenient, or more efficient,
to write such predicates using the facilities of Mercury.
For example, it is much more convenient to access
arguments of compound Mercury terms in Mercury than in C,
and the ability of the Mercury compiler to specialize code
can make higher-order predicates written in Mercury
significantly more efficient than similar C code.

One important aim of Mercury's impurity system
is to make the distinction between the pure and impure code very clear.
This is done by requiring every impure predicate or function to be so declared,
and by requiring every call to an impure predicate or function
to be flagged as such.
Predicates or functions that are implemented
in terms of impure predicates or functions
are assumed to be impure themselves
unless they are explicitly promised to be pure.
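
As a brief sketch of what this looks like in practice,
the hypothetical predicate below calls an impure predicate
and a semipure function,
so each call carries its purity annotation
and the caller is itself declared impure.
The names @code{incr_counter}, @code{get_counter}
and @code{bump_and_read} are assumed for illustration,
and the definitions of the first two are omitted.

@example
:- impure pred incr_counter is det.
:- semipure func get_counter = int.

:- impure pred bump_and_read(int::out) is det.

bump_and_read(N) :-
    impure incr_counter,
    semipure N = get_counter.
@end example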

Please note that the facilities described here are needed only very rarely.
The main intent is for implementing language primitives
such as the all solutions predicates,
or for implementing interfaces to foreign language libraries
using the foreign language interface.
Any other use of @samp{impure} or @samp{semipure} probably indicates
either a weakness in the Mercury standard library,
or the programmer's lack of familiarity with the standard library.
Newcomers to Mercury are hence encouraged to @strong{skip this section}.

@menu
* Purity levels::               Choosing the right level of purity.
* Purity ordering::             How purity levels are ordered
* Impurity semantics::          What impure code means.
* Declaring impurity::          Declaring predicates impure.
* Impure goals::                Marking a goal as impure.
* Promising purity::            Promising that a predicate is pure.
* Impurity example::            A simple example using impurity.
* Higher-order impurity::       Using impurity with higher-order code.
@end menu

@node Purity levels
@section Choosing the right level of purity

Mercury distinguishes three ``levels'' of purity:

@table @dfn
@item pure
For pure procedures,
the set of solutions depends only on the values of the input arguments.
They do not interact with the ``real'' world (i.e.@: do any input/output)
without taking an @code{io.state} (@pxref{Types}) as input
and returning one as output,
and do not change the value of any data structure
that will not be undone on backtracking
(unless the data structure would be unreachable on backtracking).
Note that equality axioms are important
when considering the value of data structures.
The declarative semantics of pure predicates
is never affected by the invocation of other predicates.
It is not possible for the invocation of pure predicates
to affect the operational behaviour of non-pure predicates and vice versa.

By default, Mercury predicates and functions are pure.
Without using the foreign language interface,
writing mode-specific clauses
or calling other impure predicates and functions,
it is impossible to write impure code in Mercury.

@item semipure
Semipure predicates are just like pure predicates,
except that their declarative semantics may be affected
by the invocation of impure predicates.
That is, they are sensitive to the state of the computation
other than as reflected by their input arguments,
though they do not affect the state themselves.

@item impure
Impure predicates may perform I/O or modify hidden state,
even if these side effects alter the operational semantics of other code.
However, impure predicates may not change
the declarative semantics of pure code.
They must be type-, mode-, determinism- and uniqueness-correct.

@end table

@node Purity ordering
@section Purity ordering

The three levels of purity (which we will simply call the purity)
have a total ordering defined upon them: @code{pure > semipure > impure}.

@node Impurity semantics
@section Impurity semantics

It is important to the proper operation of impure and semipure code,
to the flexibility of the compiler to optimize pure code,
and to the semantics of the Mercury language,
that a clear distinction be drawn
between ordinary Mercury code and imperative code written with Mercury syntax.
How Mercury draws this distinction will be explained below;
the purpose of this section
is to explain the semantics of programs with impure predicates.

A @emph{declarative} semantics of impure Mercury code
would be largely useless,
because the declarative semantics cannot capture the intent of the programmer.
Impure predicates are executed for their side-effects,
which by definition are not part of their declarative semantics.
Thus it is the @emph{operational} semantics of impure predicates
that Mercury must specify, and Mercury implementations must respect.

The operational semantics
of a Mercury predicate which invokes @emph{impure} code
is a modified form of the @emph{strict sequential} semantics
(@pxref{Formal semantics}).
@emph{Impure} goals may not be reordered relative to any other goals;
not even ``minimal'' reordering as implied by the modes is permitted.
If any such reordering is needed, this is a mode error.
However, @emph{pure} and @emph{semipure} goals
may be reordered as the compiler desires
(within the bounds of the semantics the user has specified for the program)
as long as they are not moved across an impure goal.
Execution of impure goals is strict: they must be executed if they are reached,
even if it can be determined that
the computation cannot lead to successful termination.

Semipure goals can be given a ``contextual'' declarative semantics.
They cannot have any side-effects,
so it is expected that,
given the context in which they are called
(relative to any impure goals in the program),
their declarative semantics fully captures the intent of the programmer.
Thus a semipure goal has a perfectly consistent declarative semantics,
until an impure goal is reached.
After that, it has another (possibly different) declarative semantics,
until the next impure goal is executed, and so on.
Mercury implementations must respect
this contextual nature of the semantics of semipure goals;
within a single context,
an implementation may treat a semipure goal as if it were pure.
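
As a minimal sketch of this contextual semantics,
assume a module-local mutable named @code{hits}
(a hypothetical name; @pxref{Module-local mutable variables}),
for which the compiler generates
the semipure predicate @code{get_hits/1}
and the impure predicate @code{set_hits/1}.
The predicate @code{record_hit/2} is likewise invented for illustration.
The two semipure calls below are separated by an impure goal,
so they belong to different contexts and may return different values.

@example
% Assumes `:- import_module int.' for the addition.
:- mutable(hits, int, 0, ground, [untrailed]).

:- impure pred record_hit(int::out, int::out) is det.

record_hit(Before, After) :-
    semipure get_hits(Before),      % first context
    impure set_hits(Before + 1),    % impure goal: a new context starts
    semipure get_hits(After).       % second context: After = Before + 1
@end example

@noindent
Neither call to @code{get_hits/1}
may be moved across the call to @code{set_hits/1}.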

@node Declaring impurity
@section Declaring impure functions and predicates

Every Mercury predicate or function
has exactly two purity values associated with it.
One is the @emph{declared} purity of the predicate or function,
which is given by the programmer.
The other value is the @emph{inferred} purity,
which is calculated from the purity of goals
in the body of the predicate or function.

A predicate is declared to be impure or semipure
by preceding the word @code{pred} in its @code{pred} declaration
with @code{impure} or @code{semipure}, respectively.
Similarly, a function is declared impure or semipure
by preceding the word @code{func} in its @code{func} declaration
with @code{impure} or @code{semipure}.
That is, the declarations

@example
:- impure pred @var{Pred}(@var{Arguments}@dots{}).
:- semipure pred @var{Pred}(@var{Arguments}@dots{}).
:- impure func @var{Func}(@var{Arguments}@dots{}) = @var{Result}.
:- semipure func @var{Func}(@var{Arguments}@dots{}) = @var{Result}.
@end example

@noindent
declare the predicate @var{Pred} and the function @var{Func}
to be impure and semipure, respectively.

Type class methods may also be declared as @code{impure} or @code{semipure}
by preceding the word @code{pred} or @code{func}
with the appropriate purity level.
An instance of the type class must provide method implementations
that are at least as pure as the method declaration.
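
The following sketch shows what such a declaration might look like.
All of the names in it
(@code{event_sink}, @code{record_event}, @code{counting_sink},
@code{event_count} and @code{count_event})
are hypothetical and chosen only for illustration.
The instance implements the impure method
with an impure predicate that updates a module-local mutable.

@example
% Assumes `:- import_module int.' for the addition.
:- typeclass event_sink(T) where [
    impure pred record_event(T::in, string::in) is det
].

:- type counting_sink
    --->    counting_sink.

:- mutable(event_count, int, 0, ground, [untrailed]).

:- instance event_sink(counting_sink) where [
    pred(record_event/2) is count_event
].

:- impure pred count_event(counting_sink::in, string::in) is det.

count_event(_Sink, _Event) :-
    semipure get_event_count(N),
    impure set_event_count(N + 1).
@end example

@noindent
A call to the method would then be written
as @samp{impure record_event(Sink, Event)}.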

@node Impure goals
@section Marking a goal as impure

If predicate @code{p/N} is declared to be @code{impure} (@code{semipure})
then all calls to @code{p/N} must be annotated
with @code{impure} (@code{semipure}):

@example
impure p(X1, X2, @dots{}, XN)
@end example

If function @code{f/N} is declared to be @code{impure} (@code{semipure})
then all applications of @code{f/N}
must be obtained by unification with a variable,
and the unification goal as a whole must be annotated
with @code{impure} (@code{semipure}):

@example
impure X = f(X1, X2, @dots{}, XN)
@end example

Any call or unification goal containing a non-local variable
with inst @code{any} that appears in a negated context
(i.e., in a negation or in the condition of an if-then-else goal)
must be given an @code{impure} annotation
because it may violate referential transparency.

Compound goals should not have purity annotations.

The compiler will report an error
if a required purity annotation is omitted from a call or unification goal
or if a @code{semipure} annotation is used
where an @code{impure} annotation is required.
The compiler will report a warning
if a semipure goal is annotated with @code{impure}
or a pure goal is annotated with @code{semipure}.

The requirement that impure or semipure calls
be marked with @code{impure} or @code{semipure}
allows someone reading the code to tell which goals are not pure,
making code which relies on side effects somewhat less mysterious.
Furthermore, it means that
if a call is @emph{not} preceded by @code{impure} or @code{semipure},
then the reader can rely on the call having a proper declarative semantics,
without hidden side-effects.
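
As an illustration of these rules,
the sketch below uses a hypothetical module-local mutable named @code{counter}
(@pxref{Module-local mutable variables})
together with the invented names @code{bump}, @code{current}
and @code{bump_and_read}.
Every call to a non-pure predicate,
and every unification that applies a non-pure function,
carries the annotation matching the callee's declared purity.

@example
% Assumes `:- import_module int.' for the addition.
:- mutable(counter, int, 0, ground, [untrailed]).

:- impure pred bump is det.

bump :-
    semipure get_counter(N),
    impure set_counter(N + 1).

:- semipure func current = int.

current = N :-
    semipure get_counter(N).

:- impure pred bump_and_read(int::out) is det.

bump_and_read(N) :-
    impure bump,                % call to an impure predicate
    semipure N = current.       % application of a semipure function
@end example

@noindent
Omitting either annotation in the body of @code{bump_and_read/1}
would be reported as an error.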

@node Promising purity
@section Promising that a predicate is pure

Predicates that are implemented in terms of impure or semipure predicates
are assumed to have the least of the purity of the goals in their body.
The declared purity of a predicate
must not be more pure than the inferred purity;
if it is, the compiler must generate an error.
If the declared purity is less pure than the inferred purity,
the compiler should issue a warning
(this is similar to the above case for goals).
Because the inferred purity of the predicate
is calculated from the declared purity of the calls it executes,
the lowest purity bound is propagated up
from callee to caller through the program.

In some cases,
the impurity of a predicate's body is an implementation detail
which should not be exposed to callers.
These predicates are pure or semipure
even though they call impure or semipure predicates.
The only way for the programmer to stop the propagation of impurity
is to explicitly promise that the predicate or function is pure or semipure.

Of course, the Mercury compiler cannot verify
that the predicate's purity matches the promise,
so it is the programmer's responsibility to ensure this.
If a predicate is promised pure or semipure and is not,
the behaviour of the program is undefined.

The programmer may promise that a predicate or function is pure or semipure
using the @code{promise_pure} and @code{promise_semipure} pragmas:

@example
:- pragma promise_pure(@var{Name}/@var{Arity}).
:- pragma promise_semipure(@var{Name}/@var{Arity}).
@end example

Programmers should be very careful
about mixing code that is promised pure
with impure predicates or functions that may manipulate the same hidden state
(for example, the impure predicates
used to implement a predicate that is promised pure);
the @samp{promise_pure} declaration is supposed to promise that
impure code cannot change the declarative semantics of pure code.
The module system can be used to minimize the possibility
of making errors with such code,
by keeping impure predicates or functions
behind the interface where code is promised pure.

Note that the @samp{promise_pure}, @samp{promise_semipure},
and @samp{promise_impure} scopes described in @ref{Goals}
may be used to promise purity at the finer level of goals within clauses.

@node Impurity example
@section An example using impurity

The following example illustrates
how a pure predicate may be implemented using impure code.
Note that this code is not reentrant, and so is not useful as is.
It is meant only as an example.

@example
:- pragma foreign_decl("C", "#include <limits.h>").
:- pragma foreign_decl("C", "extern MR_Integer max;").

:- pragma foreign_code("C", "MR_Integer max;").

:- impure pred init_max is det.
:- pragma foreign_proc("C",
    init_max,
    [will_not_call_mercury],
"
    max = INT_MIN;
").

:- impure pred set_max(int::in) is det.
:- pragma foreign_proc("C",
    set_max(X::in),
   [will_not_call_mercury],
"
    if (X > max) max = X;
").

:- semipure func get_max = (int::out) is det.
:- pragma foreign_proc("C",
    get_max = (X::out),
    [promise_semipure, will_not_call_mercury],
"
    X = max;
").

:- pragma promise_pure(max_solution/2).
:- pred max_solution(pred(int), int).
:- mode max_solution(pred(out) is multi, out) is det.

max_solution(Generator, Max) :-
    impure init_max,
    (
        Generator(X),
        impure set_max(X),
        fail
    ;
        semipure Max = get_max
    ).
@end example
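
A possible use of @code{max_solution/2} is sketched below.
The generator @code{candidate/1} is a hypothetical predicate
introduced only for this illustration,
and the code assumes that the @code{io} module has been imported.
Callers see @code{max_solution/2} as an ordinary pure predicate,
so the call needs no purity annotation.

@example
:- pred candidate(int::out) is multi.

candidate(3).
candidate(7).
candidate(5).

:- pred main(io::di, io::uo) is det.

main(!IO) :-
    % No `impure' annotation: max_solution/2 is promised pure.
    max_solution(candidate, Max),
    io.write_int(Max, !IO),
    io.nl(!IO).
@end example

@noindent
With the definitions above,
this program is expected to print @samp{7}.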

@node Higher-order impurity
@section Using impurity with higher-order code

Higher-order code can manipulate impure or semipure predicates and functions,
provided that explicit purity annotations are used in three places:
on the higher-order types, on lambda expressions, and on higher-order calls.
(There are no purity annotations on higher-order insts and modes, however.)

@menu
* Purity annotations on higher-order types::
* Purity annotations on lambda expressions::
* Purity annotations on higher-order calls::
@end menu

@node Purity annotations on higher-order types
@subsection Purity annotations on higher-order types

Ordinary higher-order types,
such as @samp{pred(T1, T2)} and @samp{func(T1, T2) = T},
represent only pure predicates or pure functions.
But for each ordinary higher-order type @var{Foo},
there are two corresponding types
@samp{semipure @var{Foo}} and @samp{impure @var{Foo}}.
These types can be used for higher-order code
that needs to manipulate impure or semipure procedures.
For example the type @samp{impure func(int) = int}
represents impure functions from @code{int} to @code{int}.

There are no implicit conversions and no subtyping relationship
between ordinary higher-order types
and the corresponding impure or semipure higher-order types.
However, a value of an ordinary higher-order type
can be explicitly ``converted''
to a value of an impure (or semipure) higher-order type
by wrapping it in an impure (or semipure) lambda expression
that just calls the pure higher-order term.
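
Such a wrapper might look like the following sketch.
The function name @code{as_impure} is hypothetical,
and the mode declaration assumes that
the wrapped predicate has a single output argument and is @samp{det}.

@example
:- func as_impure(pred(int)) = (impure pred(int)).
:- mode as_impure(in(pred(out) is det)) = out(pred(out) is det) is det.

as_impure(PurePred) =
    % The lambda is annotated `impure', so it has the impure type,
    % although its body only calls the pure higher-order term.
    (impure pred(X::out) is det :- PurePred(X)).
@end example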

@node Purity annotations on lambda expressions
@subsection Purity annotations on lambda expressions

Purity annotations are required on lambda expressions
that call semipure or impure code.
Lambda expressions can be declared as @samp{semipure} or @samp{impure}
by including such an annotation
before the @samp{pred} or @samp{func} identifier in the lambda expression.
Such lambda expressions have
the corresponding @samp{semipure} or @samp{impure} higher-order type.
For example, the expression

@example
(impure func(X) = Y :- semipure Y = get_max, impure set_max(X))
@end example

@noindent
is an example of an impure function lambda expression
with type @samp{(impure func(int) = int)},
and the expression

@example
(impure pred(X::in, Y::out) is det :-
    semipure Y = get_max,
    impure set_max(X))
@end example

@noindent
is an example of an impure predicate lambda expression
with type @samp{impure pred(int, int)}.

@node Purity annotations on higher-order calls
@subsection Purity annotations on higher-order calls

Any calls to impure or semipure higher-order terms
must be explicitly annotated as such.
For impure or semipure higher-order predicates, the annotation is indicated
by putting @samp{impure} or @samp{semipure} before the call.
For example:

@example
:- impure func foo(impure pred(int)) = int.
:- mode foo(in(pred(out) is det)) = out is det.

foo(ImpurePred) = X1 + X2 :-
    % Using higher-order syntax.
    impure ImpurePred(X1),
    % Using the call/N syntax.
    impure call(ImpurePred, X2).
@end example

For calling impure or semipure higher-order functions,
the notation is different from what you might expect.
In addition to using an @samp{impure} or @samp{semipure} operator
on the unification which invokes the higher-order function application,
you must also use @samp{impure_apply} or @samp{semipure_apply}
rather than using @samp{apply} or higher-order syntax.
@c XXX it would be nicer to change the implementation to support
@c     nice syntax, rather than documenting this hack
For example:

@example
:- impure func map(impure func(T1) = T2, list(T1)) = list(T2).

map(_ImpureFunc, []) = [].
map(ImpureFunc, [X | Xs]) = [Y | Ys] :-
    impure Y = impure_apply(ImpureFunc, X),
    impure Ys = map(ImpureFunc, Xs).
@end example

@node Solver types
@chapter Solver types

@menu
* The any inst::
* Abstract solver type declarations::
* Solver type definitions::
* Implementing solver types::
* Solver types and negated contexts::
@end menu

Solver types are an experimental addition to the language
supporting the implementation of constraint solvers.
A program may place constraints on and between variables of a solver type,
limiting the values those variables may take on before they are actually bound.
For example, if @code{X} and @code{Y} are variables
belonging to a constrained integer solver type,
we might place constraints upon them such that
@w{@code{X > 3 + Y}} and @w{@code{Y =< 7}}.
A later attempt to unify @code{Y} with @code{10} will fail
(it would violate the second constraint);
similarly an attempt to unify @code{X} with @code{5} and @code{Y} with @code{4}
would fail (it would violate the first constraint).

@node The any inst
@section The @samp{any} inst

Variables with solver types can have one of three possible insts:
@code{free}, @code{ground} or @code{any}.
A variable with a solver type with inst @code{any}
may not (yet) be semantically ground, in the following sense:
if a variable is semantically ground,
then the set of values it unifies with forms an equivalence class;
if a variable is non-ground,
then the set of values it unifies with does not form an equivalence class.

More formally, @code{X} is ground
if, for all values @code{Y} and @code{Z} that unify with @code{X},
it is the case that @code{Y} and @code{Z} also unify with each other.
@code{X} is non-ground
if there are values @code{Y} and @code{Z} that unify with @code{X},
but which do not unify with each other.

A non-solver type value will have inst @code{any}
if it is constructed using one or more inst @code{any} values.

The builtin modes @code{ia} and @code{oa}
are equivalent to @code{in(any)} and @code{out(any)} respectively.
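
For example, the interface of a hypothetical constrained integer solver
might use these modes as follows.
The type @code{cint} and the predicates @code{new_cint/1} and @code{leq/2}
are invented names, not part of any library.

@example
:- solver type cint.

    % Create a fresh, unconstrained variable.
:- pred new_cint(cint::oa) is det.

    % Constrain X to be less than or equal to Y.
    % Writing `in(any)' instead of `ia' would mean the same thing.
:- pred leq(cint::ia, cint::ia) is semidet.
@end example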

@node Abstract solver type declarations
@section Abstract solver type declarations

The type declarations

@example
:- solver type t1.
:- solver type t2(T1, T2).
@end example

@noindent
declare types @code{t1/0} and @code{t2/2} to be abstract solver types.
Abstract solver type declarations are
identical to ordinary abstract type declarations
except for the @code{solver} keyword.

@node Solver type definitions
@section Solver type definitions

A solver type definition takes the following form:

@example
@group
:- solver type @var{solver_type}
    where   representation   is @var{representation_type},
            ground           is @var{ground_inst},
            any              is @var{any_inst},
            constraint_store is @var{mutable_decls},
            equality         is @var{equality_pred},
            comparison       is @var{comparison_pred}.
@end group
@end example

The @code{representation} attribute is mandatory.
The @var{ground_inst} and @var{any_inst} attributes are optional
and default to @code{ground}.
The @code{constraint_store} attribute is mandatory:
@var{mutable_decls} must be either a single mutable declaration
(@pxref{Module-local mutable variables}),
or a comma separated list of mutable declarations in brackets.
The @code{equality} and @code{comparison} attributes are optional,
although a solver type without equality would not be very useful.
The attributes that are not omitted must appear in the order shown above.

The @var{representation_type}
is the type used to implement the @var{solver_type}.
A two-tier scheme of this kind is necessary for a number of reasons,
including
@itemize @bullet
@item
a semantic gap is necessary to accommodate the fact that
non-ground @var{solver_type} values
may be represented by ground @var{representation_type} values
(in the context of the corresponding constraint solver state);
@item
this approach greatly simplifies
the definition of equality and comparison predicates for the @var{solver_type}.
@end itemize
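
For instance, a hypothetical solver type @code{cint}
whose variables are represented by integer identifiers
might be defined as in the sketch below.
The names @code{cint} and @code{next_cint_id} are invented for illustration.
The optional @code{ground}, @code{any}, @code{equality}
and @code{comparison} attributes are omitted,
and the attributes that remain appear in the order shown above.

@example
:- solver type cint
    where   representation   is int,
            constraint_store is mutable(next_cint_id, int, 0, ground, []).
@end example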

@c XXX The following paragraph describes the old support for automatic
@c     initialisation of solver types.  We no longer officially support
@c     automatic initialisation as part of the language but the compiler
@c     still contains the code necessary to implement it.
@c
@c The @code{initialisation_pred} is the name of a predicate defined in the
@c same module as the solver type, with the following signature:
@c
@c @example
@c :- pred initialisation_pred(solver_type::out(any)) is det.
@c @end example
@c
@c Calls to this predicate are inserted automatically by the compiler when a
@c @code{free} @code{solver_type} variable has to be given inst @code{any}.
@c (The initialisation predicate is responsible for registering the new,
@c unbound variable with the corresponding constraint solver state.)

The @var{ground_inst} is the inst associated with
@var{representation_type} values denoting @code{ground}
@var{solver_type} values.

The @var{any_inst} is the inst associated with @var{representation_type} values
denoting @code{any} @var{solver_type} values.

The compiler constructs four impure functions
for converting between @var{solver_type} values
and @var{representation_type} values
(@var{name} is the function symbol used to name @var{solver_type}
and @var{arity} is the number of type parameters it takes):

@example
:- impure func 'representation of ground @var{name}/@var{arity}'(@var{solver_type}) =
                        @var{representation_type}.
:-        mode 'representation of ground @var{name}/@var{arity}'(in) =
                        out(@var{ground_inst}) is det.

:- impure func 'representation of any @var{name}/@var{arity}'(@var{solver_type}) =
                        @var{representation_type}.
:-        mode 'representation of any @var{name}/@var{arity}'(in(any)) =
                        out(@var{any_inst}) is det.

:- impure func 'representation to ground @var{name}/@var{arity}'(@var{representation_type}) =
                        @var{solver_type}.
:-        mode 'representation to ground @var{name}/@var{arity}'(in(@var{ground_inst})) =
                        out is det.

:- impure func 'representation to any @var{name}/@var{arity}'(@var{representation_type}) =
                        @var{solver_type}.
:-        mode 'representation to any @var{name}/@var{arity}'(in(@var{any_inst})) =
                        out(any) is det.
@end example

These functions are impure because of the semantic gap issue mentioned above.
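
The sketch below shows how one of these functions might be applied.
It assumes a solver type named @code{cint} with no type parameters,
whose representation type is @code{int} with the default insts,
and a module-local mutable named @code{next_cint_id}
that supplies fresh identifiers;
all of these names, and @code{new_cint/1} itself, are hypothetical.

@example
% Assumes `:- import_module int.' for the addition.
:- impure pred new_cint(cint::oa) is det.

new_cint(Var) :-
    semipure get_next_cint_id(Id),
    impure set_next_cint_id(Id + 1),
    % Convert the ground representation value
    % into a solver type value with inst `any'.
    impure Var = 'representation to any cint/0'(Id).
@end example

@noindent
Because the conversion function is impure,
the unification that applies it must be annotated with @code{impure}.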

Solver types may be exported from their defining module,
but only in an abstract form.
This requires the full definition
to appear in the implementation section of the module,
and an abstract declaration like the following in its interface:

@example
:- solver type @var{solver_type}.
@end example

If a solver type is exported,
then its representation type,
and, if specified, its equality and/or comparison predicates
must also be exported from the same module.

If a solver type has no equality predicate specified,
then the compiler will generate an equality predicate
that throws an exception of type @samp{exception.software_error/0} when called.

Likewise, if a solver type has no comparison predicate specified,
then the compiler will generate a comparison predicate
that throws an exception of type @samp{exception.software_error/0} when called.

If provided,
any mutable declarations given for the @code{constraint_store} attribute
are equivalent to separate mutable declarations;
their association with the solver type is for the purposes of documentation.
That is,

@example
:- solver type t
    where @dots{},
          constraint_store is [ mutable(a, int, 42, ground, []),
                                mutable(b, string, "Hi", ground, [])
                               ],
          @dots{}
@end example

@noindent
is equivalent to

@example
:- solver type t
    where @dots{}
:- mutable(a, int, 42, ground, []).
:- mutable(b, string, "Hi", ground, []).
@end example

@node Implementing solver types
@section Implementing solver types

A solver type is an abstraction,
implemented using a combination of a private representation type
and a constraint store.

The constraint store is an (impure) piece of state
used to keep track of the extant constraints on variables of the solver type.
This will typically be implemented using foreign code.

It is important that changes to the constraint store
are properly trailed (see @ref{Trailing})
so that changes can be undone on backtracking.

The solver type implementation should provide functions and predicates
@itemize @bullet
@item
to construct and deconstruct solver type values,
@item
to place constraints on solver type variables,
@item
to convert @code{any} solver type variables to @code{ground} if possible
(this is obviously an impure operation --- see @ref{Impurity}),
@item
to convert solver type values to non-solver type values
(again, this is impure
and requires the argument solver type values be sufficiently ground),
@item
to ask questions about the extant constraints on solver type variables
without constraining them further
(this too is impure because the set of constraints on a variable
may change during execution of the program).
@end itemize
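
The interface of such an implementation
might look something like the following sketch.
It covers only some of the operations listed above,
and every name in it
(@code{cint}, @code{new_cint}, @code{leq}, @code{label},
@code{to_int} and @code{upper_bound})
is hypothetical.

@example
:- module cint.
:- interface.

:- solver type cint.

    % Construct a fresh, unconstrained variable.
:- pred new_cint(cint::oa) is det.

    % Place a constraint on two variables.
:- pred leq(cint::ia, cint::ia) is semidet.

    % Try to bind an `any' variable to a ground value.
:- impure pred label(cint::(any >> ground)) is semidet.

    % Convert a sufficiently ground variable to a non-solver value.
:- impure pred to_int(cint::ia, int::out) is semidet.

    % Inspect the extant constraints without adding new ones.
:- impure pred upper_bound(cint::ia, int::out) is semidet.
@end example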

@node Solver types and negated contexts
@section Solver types and negated contexts

Mercury's negation and if-then-else goals
(and hence also inequalities and universal quantifications)
are implemented using @emph{negation as failure},
meaning that the failure to find a proof of one statement
is regarded as a proof of its negation.
Negation as failure is sound
provided that no non-local variable becomes further bound
during the execution of a goal which may be negated.
This includes negated goals themselves,
as well as the conditions of if-then-elses,
which are negated if and only if they fail without producing any solution,
and the bodies of pred or func expressions,
which may be called or applied in one of the other contexts,
or indeed in another pred or func expression.

Mercury checks that any solver variables that are used in the above contexts
are used in such a way that negation as failure remains sound.
In the case of negation and if-then-else goals,
if any non-local solver type variable or higher-order variable
with inst @code{any} is used in a negated context,
the goal must be placed inside
a @code{promise_pure}, @code{promise_semipure}, or @code{promise_impure} scope.
The first two promises assert that (among other things)
no solver variable becomes further bound in the negated context.
The third promise makes the weaker assertion that the goal satisfies
the requirements of all impure goals
(namely, that it does not interfere with the semantics of other pure goals).

In the case of pred and func expressions, Mercury allows three possibilities.
The higher-order value may be considered @code{ground},
which means that all non-local variables used in the body of the expression
(including those with other higher-order values) must themselves be ground.
Higher-order values that are ground
can be safely called or applied in any context, including negated contexts,
since none of their (ground) non-local variables
can become further bound by doing so.
Alternatively, the higher-order value
may be considered to have inst @code{any},
which allows non-local variables used in the body of the expression
to have inst @code{any}.
Calling or applying these values
@emph{may} further bind non-local variables,
so if this occurs in a negated context
then, as in the case of solver variables,
a promise will be required around the negation or if-then-else.

Pred and func expressions with inst @code{any}
are written using @code{any_pred} and @code{any_func}
in place of @code{pred} and @code{func}, respectively.
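
As a sketch, suppose @code{X} is a variable with inst @code{any}
of a hypothetical solver type,
and @code{leq/2} is a hypothetical constraint predicate on that type.
A pred expression that refers to @code{X} could then be written as follows.

@example
    % X has inst `any' here, so the expression uses any_pred.
Bounded = (any_pred(Z::ia) is semidet :- leq(Z, X))
@end example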

The third possibility is that
the higher-order value can be given an impure type
(@pxref{Higher-order impurity}).

@node Trace goals
@chapter Trace goals

Sometimes, programmers find themselves
needing to perform some side-effects in the middle of declarative code.
One example is an operation that takes so long that
users may think the program has gone into an infinite loop:
periodically printing a progress message can give them reassurance.
Another example is a program that is
too long-running for its behaviour to be analyzed via debuggers
and too complex for analysis via profilers;
a programmable logging facility generating data
for analysis by a specially-written program may be the best option.
However, inserting arbitrary side effects into declarative code
is against the spirit of Mercury.
Trace goals exist to provide a mechanism
to code these side effects in a disciplined fashion.

The format of trace goals is @code{trace @var{Params} @var{Goal}}.
@var{Goal} must be a valid goal;
@var{Params} must be a valid list of one or more trace parameters.
The following example shows all four of the available kinds of parameters:
@samp{compile_time}, @samp{run_time}, @samp{io} and @samp{state}.
(In practice, it is far more typical to have just one parameter, @samp{io}.)

@example
:- mutable(logging_level, int, 0, ground, []).

:- pred time_consuming_task(data::in, result::out) is det.

time_consuming_task(In, Out) :-
    trace [
        compile_time(flag("do_logging") or grade(debug)),
        run_time(env("VERBOSE")),
        io(!IO),
        state(logging_level, !LoggingLevel)
    ] (
        io.write_string("Time_consuming_task start\n", !IO),
        ( if !.LoggingLevel > 1 then
            io.write_string("Input is ", !IO),
            io.write(In, !IO),
            io.nl(!IO)
        else
            true
        )
    ),
    @dots{}
    % perform the actual task
@end example

The @samp{compile_time} parameter says under what circumstances
the trace goal should be included in the executable program.
In the example, at least one of two conditions has to be true:
either this module has to be compiled
with the option @samp{--trace-flag=do_logging},
or it has to be compiled in a debugging grade.

In general, the single argument of the @samp{compile_time} function symbol
is a boolean expression of primitive compile-time conditions.
Valid boolean operators in these expressions
are @samp{and}, @samp{or} and @samp{not}.
There are three kinds of primitive compile-time conditions.
The first has the form @samp{flag(@var{FlagName})},
where @var{FlagName} is an arbitrary name picked by the programmer;
this condition is true
if the module is compiled with the option @samp{--trace-flag=@var{FlagName}}.
The second has the form @samp{tracelevel(shallow)}, or @samp{tracelevel(deep)};
this condition is true (irrespective of grade)
if the module is compiled with at least the specified trace level.
The third has the form @samp{grade(@var{GradeTest})}.
The supported @var{GradeTest}s and their meanings are as follows.

@table @asis
@item @samp{debug}
True if the module is compiled with execution tracing enabled.
@item @samp{ssdebug}
True if the module is compiled with source-to-source debugging enabled.
@item @samp{prof}
True if the module is compiled with non-deep profiling enabled.
@item @samp{profdeep}
True if the module is compiled with deep profiling enabled.
@item @samp{par}
True if the module is compiled with parallel execution enabled.
@item @samp{trail}
True if the module is compiled with trailing enabled.
@c The next one is commented because the rbmm grades are not yet publicly
@c documented.
@c @item @samp{rbmm}
@c True if the module is compiled with region based memory management enabled.
@item @samp{llds}
True if the module is compiled with @samp{--highlevel-code} disabled.
@item @samp{mlds}
True if the module is compiled with @samp{--highlevel-code} enabled.
@item @samp{c}
True if the target language of the compilation is C.
@item @samp{csharp}
True if the target language of the compilation is C#.
@item @samp{java}
True if the target language of the compilation is Java.
@end table
@c (We may support the specification of other kinds of grades in the future.)

The @samp{run_time} parameter says under what circumstances the trace goal,
if included in the executable program, should actually be executed.
In this case, the environment variable @samp{VERBOSE} has to be set
when the program starts execution.
(It does not matter what value it is set to.)

In general, the single argument of the @samp{run_time} function symbol
is a boolean expression of primitive run-time conditions.
Valid boolean operators in these expressions
are @samp{and}, @samp{or} and @samp{not}.
There is just one primitive run-time condition.
It has the form @samp{env(@var{EnvVarName})};
this condition is true
if the environment variable @var{EnvVarName} exists
when the program starts execution.

The @samp{compile_time} and @samp{run_time} parameters
may not appear in the parameter list more than once;
programmers who want more than one condition
have to specify how (with what boolean operators)
their values should be combined.
However, it is acceptable for them not to appear in the parameter list at all.
If there is no @samp{compile_time} parameter,
the trace goal is always compiled into the executable;
if there is no @samp{run_time} parameter,
the trace goal is always executed (if it is compiled into the executable).
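
For example, the following sketch combines several primitive conditions
within a single @samp{compile_time} parameter
and a single @samp{run_time} parameter.
The predicate @code{checkpoint/2}, the flag name @samp{do_logging}
and the environment variable names are chosen only for illustration,
and the code assumes that
the @code{int} and @code{io} modules have been imported.
The negation is written in functional notation, @samp{not(@dots{})},
to avoid relying on operator priorities.

@example
:- pred checkpoint(int::in, int::out) is det.

checkpoint(X, Y) :-
    trace [
        % Included only if compiled with --trace-flag=do_logging
        % in a grade without deep profiling.
        compile_time(flag("do_logging") and not(grade(profdeep))),
        % Executed only if at least one of these variables is set.
        run_time(env("VERBOSE") or env("TRACE_ALL")),
        io(!IO)
    ] (
        io.write_string("checkpoint reached\n", !IO)
    ),
    Y = X + 1.
@end example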

Since the trace goal may end up
either not compiled into the executable or just not executed,
it cannot bind any variables that occur in the surrounding code.
(If it were allowed to bind such variables,
then those variables would stay unbound
if either the compile time or the run time condition were false.)
This greatly restricts what trace goals can do.

The usual reason for including a trace goal
in a procedure body is to perform some I/O,
which requires access to the I/O state.
The @samp{io} parameter supplies this access.
Its argument must be the name of a state variable prefixed by @samp{!};
by convention, it is usually @samp{!IO}.
The language implementation supplies
the initial unique value of the I/O state
as the value of @samp{!.IO} at the start of the trace goal;
it requires the trace goal to give back
the final unique value of the I/O state
as the value of @samp{!.IO} current at the end of the trace goal.

Note that trace goals that wish to do I/O
must include this parameter in their parameter list
@emph{even if} the surrounding code already has access to an I/O state.
This is because otherwise,
doing any I/O inside the trace goal
would destroy the value of the current I/O state,
changing the instantiation state of the variable holding it,
and trace goals are not allowed to do that.
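
The following sketch illustrates this.
The predicate @code{emit_line/3} is a hypothetical example,
and the code assumes that the @code{io} module has been imported.
The surrounding code threads its own I/O state through @samp{!IO},
while the trace goal obtains a separate state variable, @samp{!TraceIO},
from its @samp{io} parameter.

@example
:- pred emit_line(string::in, io::di, io::uo) is det.

emit_line(Line, !IO) :-
    trace [io(!TraceIO)] (
        % I/O inside the trace goal uses !TraceIO, not !IO.
        io.write_string(io.stderr_stream, "emitting a line\n", !TraceIO)
    ),
    io.write_string(Line, !IO),
    io.nl(!IO).
@end example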

The @samp{io} parameter may appear in the parameter list at most once,
since it does not make sense to have
two copies of the I/O state available to the trace goal.

Besides doing I/O, trace goals may read and possibly write
the values of mutable variables.
Each mutable the trace goal wants access to should be listed
in its own @samp{state} parameter
(which may therefore appear in the parameter list more than once).
Each @samp{state} parameter has two arguments:
the first gives the name of the mutable,
while the second must be the name of a state variable prefixed by @samp{!},
e.g.@: @samp{!LoggingLevel}.
The language implementation supplies
the initial value of the mutable
as the value of (in this case) @samp{!.LoggingLevel}
at the start of the trace goal;
at the end of the trace goal,
it puts the value of @samp{!.LoggingLevel} current then
back into the mutable.

The intention here is that trace goals
should be able to access mutables that give them information
about the parameters within which they should operate.
The ability of trace goals to actually @emph{update} the values of mutables
is intended to allow the implementation of trace goals
whose actions depend on the actions executed by previous trace goals.
For example, a trace goal could test
whether the current input value is the same as the previous input value,
and if it is, then it can say so instead of printing the value out again.
Another possibility is a progress message
which is printed not for every item processed, but for every 1000th item,
reassuring users without overwhelming them with a deluge of output.

This kind of code is the @emph{only} intended use of this ability.
Any program in which the value of a mutable set by a trace goal
is inspected by code that is not itself within a trace goal
is explicitly violating the intended uses of trace goals.
Only the difficulty of implementing the required global program analysis
prevents the language design from outlawing such programs in the first place.
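
As a minimal sketch of the progress-message use described above,
a trace goal could keep a count in a mutable and report every 1000th item.
The mutable @samp{items_done}, the types @samp{item} and @samp{result},
and the predicate @samp{do_process_item} are hypothetical names
chosen for this illustration,
and the enclosing module is assumed to import
@samp{int}, @samp{io}, @samp{list} and @samp{string}.

@example
:- mutable(items_done, int, 0, ground, [untrailed]).

:- pred process_item(item::in, result::out) is det.

process_item(Item, Result) :-
    trace [io(!IO), state(items_done, !ItemsDone)] (
        % The value of !.ItemsDone current at the end of the trace goal
        % is put back into the mutable.
        !:ItemsDone = !.ItemsDone + 1,
        ( if !.ItemsDone mod 1000 = 0 then
            io.format("processed %d items\n", [i(!.ItemsDone)], !IO)
        else
            true
        )
    ),
    do_process_item(Item, Result).
@end example

In this sketch the count is read and updated only inside the trace goal,
so it stays within the intended use described above.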

The compiler will not delete trace goals
from the bodies of the procedures containing them,
even though they are @samp{det} and have no outputs.
In their effect on program optimizations,
trace goals function as a kind of impure code,
but one with an implicit @samp{promise_pure} around the clause in which they occur.
@c zs: I think saying the following is more likely to confuse than enlighten:
@c The trace goal scope itself acts
@c as a promise_pure scope for any impure code inside it.
Note that trace goals inside a procedure
do not prevent calls to that procedure from being optimized away.
For example,
if a predicate definition contains a single trace goal
in order to factor out the details of that goal,
calls to it may be optimized away.
This will render them ineffective;
the strict sequential semantics can be used
to inhibit such optimizations (@pxref{Formal semantics}).

@node Pragmas
@chapter Pragmas

The pragma declarations described below
are a standard part of the Mercury language,
as are the pragmas for controlling the foreign language interface
(@pxref{Foreign language interface}) and impurity (@pxref{Impurity}).
As an extension,
implementations may also choose to support additional pragmas
with implementation-dependent semantics
(@pxref{Implementation-dependent extensions}).

@menu
* Inlining::                    Pragmas can be used to suggest or prevent
                                procedure inlining.
* Type specialization::         Pragmas can be used to produce specialized
                                versions of polymorphic procedures.
* Obsolescence::                Library developers can declare old versions
                                of predicates or functions to be obsolete.
* No determinism warnings::     Pragmas can be used to suppress warnings
                                about too loose determinism declarations.
* No dead predicate warnings::  Pragmas can be used to suppress warnings
                                about unused predicates.
* Format calls::                Pragmas for requesting extra checking of calls
                                to formatting predicates and functions.
* Source file name::            The @samp{source_file} pragma and
                                @samp{#@var{line}} directives provide support
                                for preprocessors and other tools that
                                generate Mercury code.
* Old pragma syntax::           Old and obsolete syntax for some pragmas.
@end menu

@node Inlining
@section Inlining

Declarations of these forms

@example
:- pragma inline(pred(@var{Name}/@var{Arity})).
:- pragma inline(func(@var{Name}/@var{Arity})).
@end example

@noindent
are a hint to the compiler that
all calls to the predicate or function
with name @var{Name} and arity @var{Arity} should be inlined.

The current Mercury implementation is smart enough
to inline simple predicates even without this hint.

Declarations of these forms

@example
:- pragma no_inline(pred(@var{Name}/@var{Arity})).
:- pragma no_inline(func(@var{Name}/@var{Arity})).
@end example

@noindent
tell the compiler not to inline the named predicate or function.
This may be used simply for performance concerns
(inlining can cause unwanted code bloat in some cases)
or to prevent possibly dangerous inlining when using low-level C code.
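
As an illustration, the following sketch asks the compiler
to inline a small test and to keep a larger predicate out of line.
The predicates @samp{is_even} and @samp{report_error}
are hypothetical names used only for this example,
and the enclosing module is assumed to import @samp{int} and @samp{io}.

@example
:- pred is_even(int::in) is semidet.
:- pragma inline(pred(is_even/1)).

is_even(N) :- N mod 2 = 0.

    % The clauses of report_error are omitted here.
:- pred report_error(string::in, io::di, io::uo) is det.
:- pragma no_inline(pred(report_error/3)).
@end example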

@node Type specialization
@section Type specialization

The overhead of polymorphism can in some cases be significant,
especially where polymorphic predicates make heavy use
of class method calls or the builtin unification and comparison routines.
To avoid this, the programmer can suggest to the compiler
that a specialized version of a procedure should be created
for a specific set of argument types.

@menu
* Syntax and semantics of type specialization pragmas::
* When to use type specialization::
* Implementation specific details::
@end menu

@node Syntax and semantics of type specialization pragmas
@subsection Syntax and semantics of type specialization pragmas

A declaration of the form

@example
:- pragma type_spec(pred(@var{Name}/@var{Arity}), @var{Subst}).
:- pragma type_spec(func(@var{Name}/@var{Arity}), @var{Subst}).
@end example

@noindent
suggests to the compiler that it should create a specialized version
of the predicate or function with name @var{Name} and arity @var{Arity}
with the type substitution given by @var{Subst} applied to the argument types.
The substitution is written as a conjunction of bindings
of the form @w{@samp{@var{TypeVar} = @var{Type}}},
for example @w{@samp{K = int}} or @w{@samp{(K = int, V = list(int))}}.
(The conjunction must have parentheses around it
if it contains two or more bindings.)

For example, the declarations

@example
:- pred map.lookup(map(K, V), K, V).
:- pragma type_spec(pred(map.lookup/3), K = int).
@end example

@noindent
give a hint to the compiler that
a version of @samp{map.lookup/3} should be created for integer keys.

Implementations are free to ignore @samp{pragma type_spec} declarations.
Implementations are also free to perform type specialization
even in the absence of any @samp{pragma type_spec} declarations.

The pragma also has a form that suggests specialization
of only one mode of the predicate or function, instead of all of them:

@example
:- pragma type_spec(@var{Name}(@var{m1}, ..., @var{mn}), @var{Subst}).
:- pragma type_spec(@var{Name}(@var{m1}, ..., @var{mn}) = @var{mr}, @var{Subst}).
@end example

@noindent
where @var{m1}, ..., @var{mn} are the modes of the arguments.
If the @samp{= @var{mr}} part is present,
it gives the mode of the function result;
if it is absent, this indicates that @var{Name} is a predicate.
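
For instance, assuming a hypothetical predicate @samp{count_matches}
with the two modes declared below,
the following sketch requests a specialized version
of only its first mode, for string elements:

@example
:- pred count_matches(list(T), T, int).
:- mode count_matches(in, in, out) is det.
:- mode count_matches(in, out, in) is nondet.

    % Only the (in, in, out) mode is named, so only that mode
    % is a candidate for specialization.
:- pragma type_spec(count_matches(in, in, out), T = string).
@end example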

@node When to use type specialization
@subsection When to use type specialization

The set of types for which a predicate or function should be specialized
is best determined by profiling your application.
Overuse of type specialization will result in code bloat.

Type specialization of predicates or functions
which unify or compare polymorphic variables
is most effective when the specialized types
are builtin types such as @code{int}, @code{float} and @code{string},
or enumeration types,
since their unification and comparison procedures
are simple and can be inlined.

Predicates or functions which make use of type class method calls
may also be candidates for specialization.
Again, this is most effective
when the called type class methods are simple enough to be inlined.

@node Implementation specific details
@subsection Implementation specific details

The Melbourne Mercury compiler
performs user-requested type specializations
when invoked with @samp{--user-guided-type-specialization},
which is enabled at optimization level @samp{-O2} or higher.
However, for the Java back-end,
user-requested type specializations are ignored.

@node Obsolescence
@section Obsolescence

Declarations of the forms

@example
:- pragma obsolete(pred(@var{Name}/@var{Arity})).
:- pragma obsolete(func(@var{Name}/@var{Arity})).
:- pragma obsolete(pred(@var{Name}/@var{Arity}),
    [@var{ReplName1}/@var{ReplArity1}, ..., @var{ReplNameN}/@var{ReplArityN}]).
:- pragma obsolete(func(@var{Name}/@var{Arity}),
    [@var{ReplName1}/@var{ReplArity1}, ..., @var{ReplNameN}/@var{ReplArityN}]).
@end example

@noindent
declare that the predicate or function
with name @var{Name} and arity @var{Arity} is ``obsolete'':
they instruct the compiler to issue a warning
whenever the named predicate or function is used.
The forms with a second argument tell the compiler
to suggest the use of one of the listed possible replacements.

Declarations of the forms

@example
:- pragma obsolete_proc(@var{PredName}(@var{ArgMode1}, ..., @var{ArgModeN})).
:- pragma obsolete_proc(@var{PredName}(@var{ArgMode1}, ..., @var{ArgModeN}),
    [@var{ReplName1}/@var{ReplArity1}, ..., @var{ReplNameN}/@var{ReplArityN}]).
:- pragma obsolete_proc(@var{FuncName}(@var{ArgMode1}, ..., @var{ArgModeN}) = @var{RetMode}).
:- pragma obsolete_proc(@var{FuncName}(@var{ArgMode1}, ..., @var{ArgModeN}) = @var{RetMode},
    [@var{ReplName1}/@var{ReplArity1}, ..., @var{ReplNameN}/@var{ReplArityN}]).
@end example

@noindent
similarly declare that the predicate named @var{PredName} with arity @var{N},
or the function named @var{FuncName} with arity @var{N},
is obsolete when called in the specified mode.
These forms also allow the specification
of an optional list of possible replacements.

These declarations are intended for use by library developers,
to allow gradual (rather than abrupt) evolution of library interfaces.
If a library developer changes
the interface of a library predicate or procedure,
they should leave its old version in the library,
but mark it as obsolete using one of these declarations,
and, if possible, use the optional list of replacements to steer users
to its successors (either partial or total) in the new interface.
The users of the library will then get a warning if they use obsolete features,
and can consult the library documentation to determine how to fix their code.
Eventually, when the library developer
believes that users have had sufficient warning,
they can remove the old version entirely.
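
The following sketch shows how such declarations might look in a library.
All of the names in it
(@samp{old_lookup}, @samp{search}, @samp{det_lookup},
@samp{convert}, @samp{int_to_string},
and the types @samp{table}, @samp{key} and @samp{value})
are hypothetical and serve only as an illustration.

@example
    % The whole predicate is obsolete; suggest two possible replacements.
:- pred old_lookup(table::in, key::in, value::out) is semidet.
:- pragma obsolete(pred(old_lookup/3), [search/3, det_lookup/3]).

    % Only the (out, in) mode is obsolete; the (in, out) mode is not.
:- pred convert(string, int).
:- mode convert(in, out) is semidet.
:- mode convert(out, in) is det.
:- pragma obsolete_proc(convert(out, in), [int_to_string/2]).
@end example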

@node No determinism warnings
@section No determinism warnings

Declarations of the forms

@example
:- pragma no_determinism_warning(pred(@var{Name}/@var{Arity})).
:- pragma no_determinism_warning(func(@var{Name}/@var{Arity})).
@end example

@noindent
tell the compiler not to generate any warnings
about the determinism declarations of procedures of the predicate or function
with name @var{Name} and arity @var{Arity} not being as tight as they could be.

@samp{pragma no_determinism_warning} declarations are intended for use
in situations in which the code of a predicate has one determinism,
but the declared determinism of the procedure must be looser
due to some outside requirement.
One such situation is when a set of procedures are all possible values
of the same higher-order variable,
which requires them to have the same argument types, modes, and determinisms.
If (say) most of the procedures are @code{det}
but some are @code{erroneous} (that is, they always throw an exception),
the procedures that are declared @code{det}
but whose bodies have determinism @code{erroneous}
will get a warning saying their determinism declaration could be tighter,
unless the programmer specifies this pragma for them.
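
A minimal sketch of this situation follows.
The types @samp{request} and @samp{response}
and the two handler predicates are hypothetical,
and the module is assumed to import @samp{require},
whose @samp{unexpected/2} predicate always throws an exception.

@example
    % Both handlers must be det, so that either of them can be
    % the value of the same higher-order variable.
:- pred handle_get(request::in, response::out) is det.
:- pred handle_unsupported(request::in, response::out) is det.

    % The body below is erroneous, which is tighter than det.
:- pragma no_determinism_warning(pred(handle_unsupported/2)).

handle_unsupported(_Request, _Response) :-
    unexpected($pred, "unsupported request").
@end example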

@node No dead predicate warnings
@section No dead predicate warnings

Declarations of the forms

@example
:- pragma consider_used(pred(@var{Name}/@var{Arity})).
:- pragma consider_used(func(@var{Name}/@var{Arity})).
@end example

@noindent
tell the compiler to consider the predicate or function
with name @var{Name} and arity @var{Arity} to be used,
and not generate any dead procedure/predicate/function warnings
either for the named predicate or function,
@emph{or} for the other predicates and functions that it calls,
either directly or indirectly.

@samp{pragma consider_used} declarations are intended for use in situations
in which the code that was intended to call such a predicate or function
is not yet written.
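
For example, a debugging aid whose callers do not exist yet
might be declared as in the following sketch,
in which @samp{dump_cache} and the type @samp{cache} are hypothetical names.

@example
    % Nothing calls dump_cache yet; without the pragma, the compiler
    % could warn about it and about any helpers that only it calls.
:- pred dump_cache(cache::in, io::di, io::uo) is det.
:- pragma consider_used(pred(dump_cache/3)).
@end example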

@node Format calls
@section Format calls

The @samp{format_call} pragma asks the compiler
to perform the same checks on calls to the named predicate or function
as it performs for calls to the following
formatting predicates and function in the Mercury standard library.

The Mercury standard library has a function and a predicate
to format a given sequence of values according to a format string,
like this:

@example
    string.format("Count = %d, Total = %5.2f, Average = %5.2f\n",
        [i(Count), f(Total), f(Total/float(Count))], FormatStr)

    FormatStr = string.format("Count = %d, Total = %5.2f, Average = %5.2f\n",
        [i(Count), f(Total), f(Total/float(Count))])
@end example

@noindent
and predicates that immediately output the resulting formatted string,
like this:

@example
    io.format(
        "Count = %d, Total = %5.2f, Average = %5.2f\n",
        [i(Count), f(Total), f(Total/float(Count))], !IO)

    io.format(OutputStream,
        "Count = %d, Total = %5.2f, Average = %5.2f\n",
        [i(Count), f(Total), f(Total/float(Count))], !IO)

    stream.string_writer.format(StringWriterStream,
        "Count = %d, Total = %5.2f, Average = %5.2f\n",
        [i(Count), f(Total), f(Total/float(Count))], !StringWriterState)
@end example

@noindent
These four predicates and one function
all take a format string argument, and a values_to_format argument.
The format string argument in all the examples above is
@example
        "Count = %d, Total = %5.2f, Average = %5.2f\n"
@end example
@noindent
while the values_to_format argument is
@example
        [i(Count), f(Total), f(Total/float(Count))]
@end example

In this example, @code{%d} means that the corresponding value
should be output as a signed integer, using as many characters as needed,
and @code{%5.2f} means that the corresponding value
should be formatted as a floating point number,
using at least five characters, with two digits after the decimal point.

Mercury does not allow
values of different types to be stored in the same list,
so the elements of that list are actually values of the @code{poly_type} type,
whose definition in the @code{string} module of the standard library is
@example
:- type poly_type
    --->    f(float)
    ;       i(int)
    ;       i8(int8)
    ;       i16(int16)
    ;       i32(int32)
    ;       i64(int64)
    ;       u(uint)
    ;       u8(uint8)
    ;       u16(uint16)
    ;       u32(uint32)
    ;       u64(uint64)
    ;       s(string)
    ;       c(char).
@end example
@noindent
As you can see, each function symbol in this type
wraps up a value of a builtin type,
and thus converts it to the same @code{poly_type} type.

With one exception,
every occurrence of a percent sign in the format string
is intended to start a @emph{conversion specifier},
which specifies how the corresponding entry in the value list
should be converted to a string.
(The exception is that the double percent sign @code{%%}
simply says that the output should be a single percent sign;
the doubling up escapes
the special conversion-introduction role of the character.)
Each conversion specifier consists of a percent sign,
zero or more non-alphabetic characters such as @samp{5},
and an alphabetic @emph{conversion character} such as @code{d} or @code{f}.

The format string and the values list must match.
There should be exactly one element in the value list
for each conversion specifier,
and the Nth value in the value list must satisfy
the requirements of the Nth conversion specifier in the format string.

For example, the call
@example
    string.format("Count = %d, Total = %5.2f, Average = %5.2f\n",
        [i(Count), f(Total/float(Count))], FormatStr)
@end example
@noindent
would not work, because the format string contains three conversion specifiers,
but the value list contains only two values.

The call
@example
    string.format("Count = %f, Total = %d, Average = %5.2f\n",
        [i(Count), f(Total), f(Total/float(Count))], FormatStr)
@end example
@noindent
would not work either,
because the first specifier's conversion character, @code{f},
requires the corresponding value to have the form @code{f(Float)},
but the corresponding value is @code{i(Count)}.
The conversion character in the second specifier, @code{d},
accepts any signed integer as the value,
so the second value could be any of @code{i(Int)}, @code{i8(Int8)},
@code{i16(Int16)}, @code{i32(Int32)}, or @code{i64(Int64)},
but not @code{f(Float)}.

For the full set of the rules, please see the documentation
of @code{string.format} in the Mercury Library Reference Manual;
the same rules apply to @code{io.format}
and @code{stream.string_writer.format} as well.

The Mercury compiler checks calls
to the formatting predicates and function in the Mercury standard library,
specifically

@table @asis
@item predicate @code{io.format/4}
@item predicate @code{io.format/5}
@item predicate @code{stream.string_writer.format/5}
@item function @code{string.format/2}
@item predicate @code{string.format/3}
@end table

for inconsistencies between the format string and the value list,
generating a report for each such inconsistency.
This improves both program reliability
(because programmers get the problem brought to their attention
even if the bad call is never executed)
and programmer productivity
(because programmers don't have to write test cases
just to force the execution of all calls to these predicates).
However, the compiler can perform this check
only if the predicate or function
containing the call to the formatting predicate or function
also constructs the format string and the value list.
Most calls to formatting predicates satisfy this condition, but not all.

The main exceptions are predicates intended for logging,
whose logic follows this general pattern:

@example
maybe_log_message(LogInfo, FormatString, Values, !IO) :-
    log_info_should_log(LogInfo, ShouldLog),
    (
        ShouldLog = no
    ;
        ShouldLog = yes,
        io.format(FormatString, Values, !IO),
        io.flush_output(!IO)
    ).
@end example

In this case, the compiler cannot check
the call to @code{io.format} in this predicate,
because the format string and the values list both come from the caller.
If some caller supplies format string and values list arguments
that are inconsistent with one another,
the call to @code{io.format} will throw an exception.

The purpose of the @samp{format_call} pragma is to remedy this.
The pragma
@example
:- pragma format_call(pred(maybe_log_message/5),
    format_string_values(2, 3)).
@end example
@noindent
tells the compiler that calls to this predicate
should be checked the same way
as calls to the four formatting predicates and one formatting function
listed above,
with the format string in the second argument,
and the values in the list in the third.
This way, while @code{maybe_log_message} cannot ensure
that @code{FormatString} and @code{Values} will match,
its @emph{callers} can do so.
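
As a sketch of the effect of the pragma above,
consider two calls in the body of some caller.
The variables @samp{LogInfo}, @samp{Name}, @samp{Count} and @samp{CountStr}
are hypothetical;
@samp{Name} and @samp{CountStr} are assumed to be strings,
and @samp{Count} an @samp{int}.

@example
    % Consistent: two conversion specifiers, two matching values.
    maybe_log_message(LogInfo, "%s: %d items\n",
        [s(Name), i(Count)], !IO),

    % Inconsistent: %d requires a signed integer value,
    % but the second value is a string.
    maybe_log_message(LogInfo, "%s: %d items\n",
        [s(Name), s(CountStr)], !IO)
@end example

With the pragma in place, the second call should be reported
when the caller is compiled,
rather than causing an exception when the call is executed.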

In general, this pragma may take these four forms.

@example
:- pragma format_call(pred(@var{Name}/@var{Arity}),
    format_string_values(@var{FormatStringArgNum}, @var{ValuesArgNum})).
:- pragma format_call(func(@var{Name}/@var{Arity}),
    format_string_values(@var{FormatStringArgNum}, @var{ValuesArgNum})).
:- pragma format_call(pred(@var{Name}/@var{Arity}),
    [format_string_values(@var{FS1}, @var{V1}), ...]).
:- pragma format_call(func(@var{Name}/@var{Arity}),
    [format_string_values(@var{FS1}, @var{V1}), ...]).
@end example

The first form is the one in the example above,
while the second is the function version.
The third and fourth forms differ from the first and second respectively
in that their second argument is not a single
@code{format_string_values(@var{FormatStringArgNum}, @var{ValuesArgNum})} term,
but a list of two or more such terms.
(One such term is also allowed,
but in that case, there is no need for the list.)
This latter capability is intended for use in situations
where the predicate or function concerned
contains several (possibly conditionally executed) calls
to formatting predicates or functions
whose format strings and values come from the caller,
like this:

@example
maybe_quad_log_message(LogInfo,
        FmtA, ValuesA1, ValuesA2,
        FmtB, ValuesB1, ValuesB2, !IO) :-
    ...
    io.format(FmtA, ValuesA1, !IO),
    ...
    io.format(FmtA, ValuesA2, !IO),
    ...
    io.format(FmtB, ValuesB1, !IO),
    ...
    io.format(FmtB, ValuesB2, !IO),
    ...
@end example

In such cases, programmers can include a @code{format_string_values} entry
describing the argument numbers that act as the sources
for each such call, like this:

@example
:- pragma format_call(pred(maybe_quad_log_message/9),
    [format_string_values(2, 3), format_string_values(2, 4),
    format_string_values(5, 6), format_string_values(5, 7)]).
@end example

This will ask the compiler to check the compatibility
of all four listed argument pairs at all call sites.

@node Source file name
@section Source file name

The @samp{source_file} pragma and @samp{#@var{line}} directives
provide support for preprocessors and other tools that generate Mercury code.
The tool can insert these directives into the generated Mercury code
to allow the Mercury compiler to report diagnostics
(error and warning messages) at the original source code location,
rather than at the location in the automatically generated Mercury code.

A @samp{source_file} pragma is a declaration of the form

@example
:- pragma source_file(@var{Name}).
@end example

@noindent
where @var{Name} is a string that specifies the name of the source file.

For example, if a preprocessor generated a file @file{foo.m}
based on an input file @file{foo.m.in},
and it copied lines 20, 30, and 31 from @file{foo.m.in},
the following directives would ensure that
any errors or warnings for those lines copied from @file{foo.m.in}
were reported at their original source locations in @file{foo.m.in}.

@example
:- module foo.
:- pragma source_file("foo.m.in").
#20
% this line comes from line 20 of foo.m.in
#30
% this line comes from line 30 of foo.m.in
% this line comes from line 31 of foo.m.in
:- pragma source_file("foo.m").
#10
% this automatically generated line is line 10 of foo.m
@end example

Note that if a generated file contains some text
which is copied from a source file,
and some which is automatically generated,
it is a good idea
to use @samp{pragma source_file} and @samp{#@var{line}} directives
to reset the source file name and line number
to point back to the generated file for the automatically generated text,
as in the above example.

@node Old pragma syntax
@section Old pragma syntax

Many of the pragmas above specify the identity of a predicate or a function
as the entity to which the pragma applies.
Their documentation shows these pragmas to have this syntax:
@example
:- pragma @var{pragma_name}(pred(@var{Name}/@var{Arity})).
:- pragma @var{pragma_name}(func(@var{Name}/@var{Arity})).
@end example

New code should use this syntax.
However, old versions of the Mercury compiler
supported only a simpler version of this syntax, like this:
@example
:- pragma @var{pragma_name}(@var{Name}/@var{Arity}).
@end example

Since this syntax does not specify
whether the pragma is supposed to apply to a predicate or to a function,
it is ambiguous in the event that the program contains
both a predicate and a function with the given name and arity.

For backwards compatibility, the Mercury compiler still supports this syntax,
but it will now generate a warning when the program contains
both a predicate and a function with the given name and arity.
It can also be asked to generate a warning for all pragmas
that should specify whether they apply to a predicate or to a function
but do not do so.

A later version of Mercury will deprecate this syntax,
and a still later version will stop supporting it.

@c which each apply their hint to both a predicate @var{Name}/@var{Arity}
@c and a function @var{Name}/@var{Arity} if both exist.

@node Implementation-dependent extensions
@chapter Implementation-dependent extensions

The Melbourne Mercury implementation supports
the following extensions to the Mercury language:

@menu
* Fact tables::                 Support for very large tables of facts.
@c XXX STM
@c The documentation of STM is commented out because its support is
@c not yet complete. All such lines are preceded by XXX STM.
@c * Software Transactional Memory::
@c                              Support for synchronisation of threads without
@c                              explicit locking.
* Tabled evaluation::           Support for automatically recording previously
                                calculated results and detecting or avoiding
                                certain kinds of infinite loops.
* Termination analysis::        Support for automatic proofs of termination.
@c * Tail recursion check::        Require that a predicate is tail recursive.
* Feature sets::                Support for checking that optional features of
                                the implementation are supported at compile
                                time.
* Trailing::                    Undoing side-effects on backtracking.

@end menu

@node Fact tables
@section Fact tables

Large tables of facts can be compiled using a different algorithm
that is more efficient and can produce more efficient code.

Declarations of the forms

@example
:- pragma fact_table(pred(@var{Name}/@var{Arity}), @var{FileName}).
:- pragma fact_table(func(@var{Name}/@var{Arity}), @var{FileName}).
@end example

@noindent
tell the compiler that the predicate or function
with name @var{Name} and arity @var{Arity}
is defined by a set of facts in an external file @var{FileName}.
Defining large tables of facts in this way allows the compiler
to use a more efficient algorithm for compiling them.
This algorithm uses less memory
than would normally be required to compile the facts,
so much larger tables are possible.

Each mode is indexed on all its input arguments
so the compiler can produce very efficient code using this technique.

In the current implementation, the table of facts
is compiled into a separate C file named @file{@var{FileName}.c}.
The compiler will automatically generate
the correct dependencies for this file
when the command @samp{mmake @var{main_module}.depend} is invoked.
This ensures that the C file will be compiled to @file{@var{FileName}.o}
and then linked with the other object files
when @samp{mmake @var{main_module}} is invoked.

The main limitation of the @samp{fact_table} pragma
is that in the current implementation,
predicates or functions defined as fact tables
can only have arguments of types @code{string}, @code{int} or @code{float}.

Another limitation is that the @samp{--high-level-code} back-end
does not support @samp{pragma fact_table}
for procedures with determinism @code{nondet} or @code{multi}.
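
As a sketch, a two-way lookup table might be declared as follows.
The predicate @samp{word_code} and the file name @file{word_code_facts}
are hypothetical.

@example
:- pred word_code(string, int).
:- mode word_code(in, out) is semidet.
:- mode word_code(out, in) is semidet.
:- pragma fact_table(pred(word_code/2), "word_code_facts").
@end example

@noindent
The file @file{word_code_facts} would then hold the facts themselves.
Assuming that they are written as ordinary Mercury facts
for the named predicate,
with illustrative values, it might start like this:

@example
word_code("alpha", 1).
word_code("beta", 2).
@end example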

@c XXX STM
@c @node Software Transactional Memory
@c @section Software Transactional Memory
@c
@c (Note: Software Transactional Memory is still in development and many
@c aspects c documented here might change without notice.
@c Please use with caution.)
@c
@c Software Transactional Memory or STM
@c is an method of synchronising access to shared data
@c between concurrently running threads.
@c It is an alternative to the use of explicit locking.
@c
@c The way to synchronise threads using Software Transactional Memory
@c is through the use of the @samp{atomic} scope.
@c The syntax of an atomic scope is @code{atomic @var{Params} @var{Goal}}.
@c @var{Goal} must be a valid goal;
@c @var{Params} must be a list of atomic parameters
@c which must include the @samp{outer} and @samp{inner} parameters.
@c The following example shows the use of the atomic scope:
@c
@c @example
@c :- pred add_2_atomically(stm_var(int)::in, io::di, io::uo) is cc_multi.
@c
@c add_2_atomically(TVar, IO0, IO) :-
@c         atomic [ outer(IO0, IO1), inner(STM0, STM) ] (
@c                 read_stm_var(TVar, X, STM0, STM1),
@c                 Y = X + 2,
@c                 write_stm_var(TVar, Y, STM1, STM)
@c         ),
@c         io.write_string("Value of Y is ", IO1, IO2),
@c         io.write(Y, IO2, IO3),
@c         io.nl(IO3, IO).
@c @end example
@c
@c
@c The @samp{outer} parameter takes a pair of variables of type @samp{io.io}.
@c As the atomic scope can be seen as an operation which changes the I/O state,
@c the modes of these variables must be @code{di} and @code{uo} respectively.
@c
@c The @samp{inner} parameter takes a pair of variables of type @samp{stm}.
@c When the atomic scope is executed,
@c these variables supply and consume the @samp{stm} state
@c which can be used by the Software Transactional Memory primitives.
@c Calling these primitives requires threading the @samp{stm} state
@c in a way similar to I/O operations and,
@c as such, the modes of these variables must also be @code{di} and @code{uo}.
@c
@c The code within the atomic scope is restricted
@c in the same way as code which takes the I/O state.
@c The code within the atomic scope
@c must be either @code{det} or @code{cc_multi}.
@c Due to the way Software Transactional Memory provides synchronous behaviour,
@c it is likely that the goal will be executed more than once.
@c As it is unknown how many times (if any) the inner goal will be repeated,
@c only pure code or code which makes use of the @samp{stm} state
@c should be placed inside an atomic scope.
@c (Trace goals are permitted but should not be used for any action
@c that depends on the number of times the goal is executed).
@c
@c Using the atomic scope requires the program to explicitly import the modules
@c @samp{stm_builtin}, @samp{exception} and @samp{univ}.
@c This restriction will soon be dropped,
@c as the compiler itself will do the required imports.
@c
@c In STM systems, data shared between threads
@c is stored in @samp{Transaction Variables}.
@c This is the only form of shared data
@c which the atomic scope will synchronise.
@c @samp{Transaction Variables} can be operated on
@c using the following predicates:
@c
@c @example
@c :- pred new_stm_var(T::in, stm_var(T)::out, io::di, io::uo) is det.
@c
@c :- pred read_stm_var(stm_var(T)::in, T::out, stm::di, stm::uo) is det.
@c
@c :- pred write_stm_var(stm_var(T)::in, T::in, stm::di, stm::uo) is det.
@c @end example
@c
@c The @samp{new_stm_var} creates a new transaction variable
@c whose the type and initial value are given by the first argument,
@c and returns a reference to it.
@c Only one copy of the transaction variable exists in memory,
@c but references to it can be duplicated.
@c Unifications and tests of references
@c affect only the references themselves,
@c and do not affect the underlying transaction variables.
@c
@c To get or set the value of the actual transaction variable,
@c programs must call
@c the builtins @samp{read_stm_var} and @samp{write_stm_var}.
@c These calls take a reference to a transaction variable
@c and either set or return the value of the transaction variable.
@c @footnote{In actual fact, write_stm_var does not update the variable.
@c The update is instead written to a log,
@c and the real transaction variable is changed
@c only when the atomic goal has completed
@c and the whole log has been validated.}
@c As the calls to @samp{read_stm_var} and @samp{write_stm_var}
@c take a pair of @samp{stm} states,
@c they can only appear within an atomic scope.

@node Tabled evaluation
@section Tabled evaluation

(Note: ``Tabled evaluation'' has no relation
to the ``fact tables'' described above.)

Ordinarily, the results of each procedure call are not recorded;
if the same procedure is called with the same arguments,
then the answer(s) must be recomputed.
For some procedures, this recomputation can be very wasteful.

With tabled evaluation, the implementation keeps a table
containing the previously computed results of the specified procedure;
this table is sometimes called the memo table
(since it ``remembers'' previous answers).
At each procedure call, the implementation will search the memo table
to check whether the answer(s) have already been computed,
and if so, the answers will be returned directly from the memo table
rather than being recomputed.
This can result in much faster execution,
at the cost of additional space to record answers in the table.

The implementation can also check at runtime for the situation
where a procedure calls itself recursively with the same arguments,
which would normally result in an infinite loop;
if this situation is encountered, it can (at the programmer's option)
either throw an exception,
or avoid the infinite loop
by computing solutions using a ``minimal model'' semantics.
(Specifically, the minimal model computed by our implementation
is the perfect model.)

When targeting the generation of C code,
the current Mercury implementation supports
three different pragmas for tabling, to cover these three cases:
@samp{loop_check}, @samp{memo}, and @samp{minimal_model}.
(None of these are supported
when targeting the generation of C# or Java code.)

@itemize @bullet
@item
The @samp{loop_check} pragma asks only for loop checking.
With this pragma, the memo table will map each distinct set of input arguments
only to a single boolean saying whether
a call with those arguments is currently active or not;
the pragma's only effect is to cause the predicate to throw an exception
if this boolean says that the current call has the same arguments
as one of its ancestors, which indicates an infinite recursive loop.

Note that loop checking for nondet and multi predicates assumes that
calls to these predicates generate all their solutions and then fail.
If a caller asks them only for some solutions
and then cuts away all later solutions
(e.g.@: via a quantification
that only asks whether a solution satisfying a particular test exists),
then the cut-away call never gets a chance
to record the fact that it is no longer active.
The next call to that predicate with the same arguments
will therefore think that the previous call is still active,
and will consider this call to be an infinite loop.
@item
The @samp{memo} pragma asks for both loop checking and memoization.
With this pragma, the memo table will map each distinct set of input arguments
either to the set of results computed previously for those arguments,
or to an indication that the call is still active
and thus those results are still being computed.
This predicate will thus look for infinite recursive loops
(and throw an exception if and when it finds one)
but it will also record all its solutions in the memo table,
and will avoid recomputing solutions
that are already available in the memo table.
@item
The @samp{minimal_model} pragma asks
for the computation of a ``minimal model'' semantics.
This differs from the @samp{memo} pragma in that
the detection of what appears to be an infinite recursive loop is not fatal.
The implementation will consider
the apparently infinitely recursive calls to fail
if the call concerned has no way of computing
any solutions it has not already computed and recorded,
and if it does have such a way,
then it switches the execution to explore those ways
before coming back to the apparently infinitely recursive call.

Minimal model evaluation is applicable
only to procedures that can succeed more than once,
and only in grades that explicitly support it.
@end itemize

The syntax for each of these declarations is

@example
:- pragma memo(pred(@var{Name}/@var{Arity})).
:- pragma memo(pred(@var{Name}/@var{Arity}), [@var{list of tabling attributes}]).
:- pragma loop_check(pred(@var{Name}/@var{Arity})).
:- pragma loop_check(pred(@var{Name}/@var{Arity}), [@var{list of tabling attributes}]).
:- pragma minimal_model(pred(@var{Name}/@var{Arity})).
:- pragma minimal_model(pred(@var{Name}/@var{Arity}), [@var{list of tabling attributes}]).
@end example

@noindent
and the corresponding versions in which
@samp{pred} is replaced with @samp{func}.
The @samp{pred(@var{Name}/@var{Arity})} or @samp{func(@var{Name}/@var{Arity})}
part specifies the predicate or function to which the declaration applies.
At most one of these declarations may be specified
for any given predicate or function.
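
As an illustration, the following sketch memoizes a naive Fibonacci function.
The function @samp{fib} is a hypothetical example, not a library function;
the sketch assumes that the enclosing module imports @samp{int}
and is compiled in a grade that targets C.

@example
:- func fib(int) = int.
:- pragma memo(func(fib/1)).

fib(N) =
    ( if N < 2 then
        N
    else
        % Each distinct argument is evaluated only once;
        % later calls with that argument are answered from the memo table.
        fib(N - 1) + fib(N - 2)
    ).
@end example

Without the pragma, the number of calls grows exponentially with the argument;
with it, the work is roughly linear,
at the cost of one memo table entry per distinct argument.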

Pragmas using the above syntax specify
a declaration that applies to all the modes of a predicate or function.
Programmers can request the application of tabling
to only one particular mode of a predicate or function,
via declarations such as these:

@example
:- pragma memo(@var{Name}(in, in, out)).
:- pragma memo(@var{Name}(in, in, out), [@var{list of tabling attributes}]).
:- pragma loop_check(@var{Name}(in, out)).
:- pragma loop_check(@var{Name}(in, out), [@var{list of tabling attributes}]).
:- pragma minimal_model(@var{Name}(in, in, out, out)).
:- pragma minimal_model(@var{Name}(in, in, out, out), [@var{list of tabling attributes}]).
@end example

All the above example pragmas are for predicates.
For functions, the first argument of the pragma would include
the mode of the function result as well, like this:

@example
:- pragma memo(@var{Name}(in, in) = out, [@var{list of tabling attributes}]).
@end example

Because all variants of tabling record the values of input arguments,
and all except @samp{loop_check} also record the values of output arguments,
you cannot apply any of these pragmas to procedures
whose arguments' modes include any unique component.

Tabled evaluation of a predicate or function
that has an argument whose type is a foreign type
will result in a run-time error,
unless the foreign type is one for which
the @samp{can_pass_as_mercury_type} and @samp{stable} assertions
have been made (@pxref{Using foreign types from Mercury}).

The optional list of attributes allows programmers
to control some aspects of the management of the memo table(s)
of the procedure(s) affected by the pragma.

The @samp{allow_reset} attribute asks the compiler
to define a predicate that, when called, resets the memo table.
The name of this predicate will be ``table_reset_for'',
followed by the name of the tabled predicate, followed by its arity,
and (if the predicate has more than one mode) by the mode number
(the first declared mode is mode 0, the second is mode 1, and so on).
These three or four components are separated by underscores.
The reset predicate takes a di/uo pair of I/O states as arguments.
The presence of these I/O state arguments in the reset predicate,
and the fact that tabled predicates cannot have unique arguments
together imply that a memo table cannot be reset
while a call using that memo table is active.

The @samp{statistics} attribute asks the compiler
to define a predicate that, when called,
returns statistics about the memo table.
The name of this predicate will be ``table_statistics_for'',
followed by the name of the tabled predicate, followed by its arity,
and (if the predicate has more than one mode) by the mode number
(the first declared mode is mode 0, the second is mode 1, and so on).
These three or four components are separated by underscores.
The statistics predicate takes three arguments.
The second and third are a di/uo pair of I/O states,
while the first is an output argument that contains information
about accesses to and modifications of the procedure's memo table,
both since the creation of the table,
and since the last call to this predicate.
The type of this argument is defined in the file @file{table_builtin.m}
in the Mercury standard library.
That module also contains a predicate for printing out this information
in a programmer-friendly format.
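
For example, for a hypothetical single-mode predicate @samp{fib/2},
the naming rules above yield the predicates
@samp{table_reset_for_fib_2} and @samp{table_statistics_for_fib_2}.
The following sketch shows how they might be called.
It assumes that the module imports @samp{int} and @samp{io},
and it ignores the returned statistics
rather than naming their type or printing them.

@example
:- pred fib(int::in, int::out) is det.
:- pragma memo(pred(fib/2), [allow_reset, statistics]).

fib(N, F) :-
    ( if N < 2 then
        F = N
    else
        fib(N - 1, F1),
        fib(N - 2, F2),
        F = F1 + F2
    ).

:- pred use_fib(io::di, io::uo) is det.

use_fib(!IO) :-
    fib(30, F),
    io.write_int(F, !IO),
    io.nl(!IO),
    % fib/2 has only one mode, so no mode number is appended.
    table_statistics_for_fib_2(_Stats, !IO),
    % Discard all recorded answers; later calls will recompute them.
    table_reset_for_fib_2(!IO).
@end example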

As mentioned above, the Mercury compiler implements tabling
only when targeting the generation of C code.
@c ... and when the gc method is not accurate,
@c but since no-one actually uses accurate gc ...
In other grades, the compiler normally generates a warning
for each tabling pragma that it is forced to ignore.
The @samp{disable_warning_if_ignored} attribute tells the compiler
not to generate such a warning for the pragma it is attached to.
Since the @samp{loop_check} and @samp{minimal_model} pragmas
affect the semantics of the program,
and such changes should not be made silently,
this attribute may not be specified for them.
But this attribute may be specified for @samp{memo} pragmas,
since these affect only the program's performance, not its semantics.

The remaining two attributes, @samp{fast_loose} and @samp{specified},
control how arguments are looked up in the memo table.
The default implementation
looks up the @emph{value} of each input argument,
and thus requires time proportional to
the number of function symbols in the input arguments.
This is the only implementation allowed for minimal model tabling,
but for predicates tabled with the @samp{loop_check} and @samp{memo} pragmas,
programmers can also choose some other tabling methods.

The @samp{fast_loose} attribute asks the compiler to generate code
that looks up only the @emph{address} of each input argument in the memo table,
which means that the time required
is linear only in the @emph{number} of input arguments, not their @emph{size}.
The tradeoff is that @samp{fast_loose}
does not recognize calls as duplicates
if they involve input arguments that are logically equal
but are stored at different locations in memory.
The following declaration calls for this variant of tabling.

@example
:- pragma memo(@var{Name}(in, in, in, out),
        [allow_reset, statistics, fast_loose]).
@end example

The @samp{specified} attribute allows programmers
to choose individually, for each input argument,
whether that argument should be looked up in the memo table
by value or by address,
or whether it should be looked up at all:

@example
:- pragma memo(@var{Name}(in, in, in, out), [allow_reset, statistics,
        specified([value, addr, promise_implied, output])]).
@end example

The @samp{specified} attribute should have an argument which is a list,
and this list should contain one element
for each argument of the predicate or function concerned
(if a function, the last element is for the return value).
For output arguments, the list element should be @samp{output}.
For input arguments, the list element may be
@samp{value}, @samp{addr} or @samp{promise_implied}.
The first calls for tabling the argument by value,
the second calls for tabling the argument by address,
and the third calls for not tabling the argument at all.
This last course of action promises that any two calls
that agree on the values of the value-tabled input arguments
and on the addresses of the address-tabled input arguments
will behave the same regardless of the values of the untabled input arguments.
In most cases, this will mean that the values of the untabled arguments
are implied by the values of the value-tabled arguments
and the addresses of the address-tabled arguments,
though the promise can also be fulfilled
if the tabled predicate or function does not actually use
the untabled argument for computing any of its output.
(It is ok for it to use the untabled argument
to decide what exception to throw.)
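
As a concrete sketch, consider a hypothetical predicate @samp{mean_times/4}
whose second argument is, by convention of its callers,
always the length of the list passed as its first argument.
The list is tabled by address, the length is not tabled at all
because it is implied by the list, and the multiplier is tabled by value.
The predicate and its calling convention are assumptions made for this example;
the module is assumed to import @samp{int} and @samp{list}.

@example
    % mean_times(Xs, Len, K, R):
    % callers promise that Len is always the length of Xs.
:- pred mean_times(list(int)::in, int::in, int::in, int::out) is det.
:- pragma memo(mean_times(in, in, in, out),
        [specified([addr, promise_implied, value, output])]).

mean_times(Xs, Len, K, R) :-
    Sum = list.foldl(int.plus, Xs, 0),
    R = ( if Len = 0 then 0 else (K * Sum) // Len ).
@end example

With this declaration, two calls that pass equal lists
stored at different addresses are not recognized as duplicates,
which is the price paid for not traversing the list on every lookup.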

@c Experimental:
@c The @samp{specified} attribute can additionally take an argument after
@c the list, which is either @samp{hidden_arg_value} or @samp{hidden_arg_addr}.
@c If @samp{hidden_arg_addr} is specified, extra arguments introduced by the
@c compiler will be tabled by address, otherwise they are tabled by value.
@c @samp{hidden_arg_value} is assumed if neither is present.

If the tabled predicate or function has only one mode,
then a declaration like this can also be specified
without giving the argument modes:

@example
:- pragma memo(pred(@var{Name}/@var{Arity}), [allow_reset, statistics,
        specified([value, addr, promise_implied, output])]).
@end example

Note that a @samp{pragma minimal_model} declaration
changes the declarative semantics of the specified predicate or function:
instead of using the completion of the clauses as the basis for the semantics,
as is normally the case in Mercury,
the declarative semantics that is used is a ``minimal model'' semantics,
specifically, the perfect model semantics.
Anything which is true or false in the completion semantics
is also true or false (respectively) in the perfect model semantics,
but there are goals for which the perfect model specifies
that the result is true or false,
whereas the completion semantics leaves the result unspecified.
For these goals, the usual Mercury semantics requires the
implementation to either loop or report an error message,
but the perfect model semantics requires a particular answer to be returned.
In particular, the perfect model semantics says that
any call that is not true in @emph{all} models is false.

Programmers should therefore use a @samp{pragma minimal_model} declaration
only in cases where their intended interpretation for a procedure
coincides with the perfect model for that procedure.
Fortunately, however, this is usually what programmers intend.
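
A typical case is reachability in a graph that contains cycles.
The following sketch uses hypothetical predicates
@samp{edge/2} and @samp{path/2};
it assumes a compilation grade that supports minimal model tabling.

@example
:- pred edge(int::in, int::out) is nondet.

edge(1, 2).
edge(1, 3).
edge(2, 3).
edge(3, 1).

:- pred path(int::in, int::out) is nondet.
:- pragma minimal_model(pred(path/2)).

path(X, Y) :-
    edge(X, Y).
path(X, Y) :-
    % Left recursion: this call has the same input as its parent.
    path(X, Z),
    edge(Z, Y).
@end example

Without the pragma, the left-recursive call would loop.
With it, a call such as @samp{path(1, Y)} should instead
return the nodes reachable from node 1, namely 1, 2 and 3,
which is the intended interpretation of @samp{path/2}.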

@c XXX give an example

For more information on tabling, see K. Sagonas's PhD thesis
@c XXX this citation doesn't come out properly in DVI format
@cite{The SLG-WAM: A Search-Efficient Engine for Well-Founded Evaluation
of Normal Logic Programs.} @xref{[4]}.
The operational semantics
of procedures with a @samp{pragma minimal_model} declaration
corresponds to what Sagonas calls ``SLGd resolution''.

In the general case,
the execution mechanism required by minimal model tabling is quite complicated,
requiring the ability to delay goals and then wake them up again.
The Mercury implementation uses a technique
based on copying relevant parts of the stack to the heap when delaying goals.
It is described in
@c XXX this citation may not come out properly in DVI format
@cite{Tabling in Mercury: design and implementation}
by Z. Somogyi and K. Sagonas,
Proceedings of the Eighth International Symposium
on Practical Aspects of Declarative Languages.

@cartouche
@strong{Please note:}
the current implementation of tabling does not support
all the possible compilation grades
(see the ``Compilation model options'' section of the Mercury User's Guide)
allowed by the Mercury implementation.
In particular, minimal model tabling is incompatible with
high level code and the use of trailing.
@c and accurate garbage collection.
@end cartouche

@node Termination analysis
@section Termination analysis

The compiler includes a termination analyser
which can be used to prove termination of predicates and functions.
Details of the analysis are available
in ``Termination Analysis for Mercury''
by Chris Speirs, Zoltan Somogyi and Harald Sondergaard.
@xref{[1]}.
@c XXX this citation doesn't come out properly in DVI format

The analysis is based on
an algorithm proposed by Gerhard Groger and Lutz Plumer
in their paper ``Handling of mutual recursion in
automatic termination proofs for logic programs.'' @xref{[2]}.
@c XXX this citation doesn't come out properly in DVI format

For an introduction to termination analysis for logic programs,
please refer to ``Termination Analysis for Logic Programs'' by Chris Speirs.
@c XXX this citation doesn't come out properly in DVI format
@xref{[3]}.

Information about the termination properties of a predicate or function
can be given to the compiler.
Pragmas are also available to require the compiler
to prove termination of a given predicate or function,
or to give an error message if it cannot do so.

The analyser is enabled by the option @samp{--enable-termination},
which can be abbreviated to @samp{--enable-term}.
When termination analysis is enabled,
any predicates or functions
with a @samp{check_termination} pragma defined on them
will have their termination checked,
and if termination cannot be proved,
the compiler will emit an error message
detailing the reason that termination could not be proved.

The option @samp{--check-termination},
which may be abbreviated to @samp{--check-term} or @samp{--chk-term},
forces the compiler to check the termination of all predicates in the module.
It is common for the compiler to be unable to prove
termination of some predicates and functions
because they call other predicates
which could not be proved to terminate
or because they use language features (such as higher-order calls)
which cannot be usefully analysed.
In this case, the compiler only emits a warning
for these predicates and functions
if the @samp{--verbose-check-termination} option is enabled.
For every other predicate or function
whose termination the compiler cannot prove,
a warning message is emitted, but compilation continues.
The @samp{--check-termination} option
implies the @samp{--enable-termination} option.

The accuracy of the termination analysis is substantially degraded
if intermodule optimization is not enabled.
Unless intermodule optimization is enabled,
the compiler must assume that any imported predicate may not terminate.

By default, the compiler assumes that
a procedure defined using the foreign language interface
will terminate for all input if it does not call Mercury.
If it does call Mercury then by default the compiler will assume
that it may not terminate.

The foreign code attributes @samp{terminates}/@samp{does_not_terminate}
may be used to force the compiler to treat a foreign_proc
as terminating/non-terminating irrespective of whether it calls Mercury.
As a matter of style,
it is preferable to use foreign code attributes for foreign_procs
rather than the termination pragmas described below.

The following declarations can be used to inform the compiler
of the termination properties of a predicate or function.

@example
:- pragma terminates(pred(@var{Name}/@var{Arity})).
:- pragma terminates(func(@var{Name}/@var{Arity})).
@end example

This declaration may be used to inform the compiler
that this predicate or function is guaranteed to terminate for any input.
This is useful when the compiler cannot prove
termination of some predicates or functions
which are in turn preventing the compiler
from proving termination of other predicates or functions.
This declaration affects not only the predicate specified
but also any other predicates that are mutually recursive with it.
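
As a sketch, the hypothetical predicate below counts down on an integer.
It terminates for every input, since its first argument
strictly decreases until it is no longer positive.
It is an assumption of this example, not something stated above,
that the analyser cannot establish this by itself
because the decreasing quantity is an integer value
rather than the size of a data term.
The module is assumed to import @samp{int}.

@example
:- pred sum_down(int::in, int::in, int::out) is det.
:- pragma terminates(pred(sum_down/3)).

sum_down(N, !Sum) :-
    ( if N =< 0 then
        true
    else
        !:Sum = !.Sum + N,
        sum_down(N - 1, !Sum)
    ).
@end example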

@example
:- pragma does_not_terminate(pred(@var{Name}/@var{Arity})).
:- pragma does_not_terminate(func(@var{Name}/@var{Arity})).
@end example

This declaration may be used to inform the compiler
that this predicate or function may not terminate.
This declaration affects not only the predicate or function specified
but also any other predicates and/or functions
that are mutually recursive with it.

@example
:- pragma check_termination(pred(@var{Name}/@var{Arity})).
:- pragma check_termination(func(@var{Name}/@var{Arity})).
@end example

This pragma tells the compiler
that it should try to prove the termination of this predicate or function,
and if it fails, then it should quit with an error message.
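
For instance, the hypothetical predicate below
recurses on the tail of its input list,
so each recursive call receives a strictly smaller term.
This is the kind of argument that a termination analyser
can be expected to handle, though the sketch makes no claim
about what a particular compiler version will report.
The module is assumed to import @samp{int} and @samp{list}.

@example
:- pred len(list(T)::in, int::out) is det.
:- pragma check_termination(pred(len/2)).

len([], 0).
len([_ | Xs], N) :-
    len(Xs, N0),
    N = N0 + 1.
@end example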

@c XXX TO DO!
@c @node Compile-time garbage collection
@c @section Compile-time garbage collection
@c
@c The compiler includes a compile-time garbage collection system (CTGC). This
@c system consists of a structure sharing analysis, followed by a structure
@c reuse analysis.
@c
@c @node Structure sharing analysis
@c @subsection Structure sharing analysis
@c
@c The compiler includes a structure sharing analysis system.
@c
@c @node Structure reuse analysis
@c @subsection Structure reuse analysis
@c
@c The compiler includes a structure reuse analysis system.
@c

@c @node Tail recursion check
@c @section Tail recursion check
@c
@c The @samp{require_tail_recursion} pragma can be used to enable and disable
@c warnings or errors for predicates and functions that contain recursive
@c calls which are not @emph{tail} recursive.
@c
@c XXX add pred() and func() wrappers
@c @example
@c :- pragma require_tail_recursion(@var{Name}/@var{Arity}, @var{Options}).
@c :- pragma require_tail_recursion(@var{Name}(@var{Modes}), @var{Options}).
@c :- pragma require_tail_recursion(@var{Name}(@var{Modes}) = @var{ReturnMode},
@c     @var{Options}).
@c @end example
@c
@c This pragma affects all modes of a predicate or function (in the first form)
@c or a specific mode of a predicate or function (the second and third forms).
@c These pragmas can be used to enable or inhibit warnings for non tail
@c recursive code.
@c
@c When tail recursion warnings are enabled using the
@c @samp{--warn-non-tail-recursion} compiler option (see the user's guide),
@c the compiler may emit warnings for predicates that the developer knows and
@c accepts aren't tail recursive.
@c These can be suppressed using the @samp{none} option in the
@c @samp{require_tail_recursion} pragma.
@c
@c @example
@c :- pragma require_tail_recursion(foo/3, [none]).
@c @end example
@c
@c When the @samp{--warn-non-tail-recursion} compiler option is not enabled
@c then the pragma can be used to explicitly enable the tail recursion check
@c for a predicate or function.
@c If you think that a predicate or function will probably recurse deeply,
@c and may exhaust the stack unless its recursive calls are all tail recursive,
@c then use this pragma on that predicate
@c to get a warning or an error
@c if any of those recursive calls are not tail recursive.
@c You may also wish to enable this warning
@c if you expect the predicate or function to be called many times,
@c even if those calls are very unlikely to exhaust the stack,
@c simply because tail recursion is more efficient than non-tail recursion.
@c
@c @example
@c :- pragma require_tail_recursion(map/3).
@c @end example
@c
@c The following options may be given:
@c
@c @table @code
@c
@c @item warn
@c Non tail recursive code should generate a compiler warning.
@c This is the default.
@c This option is incompatible with @samp{error} and @samp{none}.
@c
@c @item error
@c Non tail recursive code should generate a compiler error.
@c This option is incompatible with @samp{warn} and @samp{none}.
@c
@c @item none
@c Disable the tail recursion check for this predicate or function.
@c This option is incompatible with every other option.
@c
@c @item self_or_mutual_recursion
@c Allow the recursive calls to be self or mutually recursive.
@c The compiler will generate warnings or errors for recursive calls that are
@c not tail calls (and not later followed by a recursive call that @emph{is} a
@c tail call).
@c This is the default.
@c This option is incompatible with @samp{self_recursion_only} and @samp{none}.
@c
@c @item self_recursion_only
@c Require that all recursive calls are self-recursive.
@c In addition to @code{self_or_mutual_recursion},
@c this option causes the compiler to generate a warning or error
@c when a mutually recursive call is a @emph{tail} call, even if it can
@c optimize the tail call.
@c Some backends can optimize self recursion but not mutual recursion,
@c or mutual recursion is less efficient.
@c This option can be used to alert the programmer of code that isn't tail
@c recursive on these backends.
@c This option is incompatible with @samp{self_or_mutual_recursion} and
@c @samp{none}.
@c
@c @end table
@c
@c Note that the compiler cannot analyse recursion across module boundaries
@c or through higher-order calls.
@c Therefore inter-module and higher-order calls are considered to be
@c non-recursive.
@c
@c This pragma has no effect with @samp{--no-optimize-tailcalls}.
@c
@c Issuing the pragma more than once for the same predicate or function, or a
@c mode of that predicate or function, will cause undefined behaviour.

@node Feature sets
@section Feature sets

The Melbourne Mercury implementation supports
a number of optional compilation model features,
such as @ref{Trailing} or @ref{Tabled evaluation}.
Feature sets allow the programmer to assert that a module requires
the presence of one or more optional features in the compilation model.
These assertions can be made
using a @samp{pragma require_feature_set} declaration.

The @samp{require_feature_set} pragma declaration has the following form:
@example
:- pragma require_feature_set(@var{Features}).
@end example

where @var{Features} is a (possibly empty) list of features.

The supported features are:
@table @asis

@item @samp{concurrency}
This specifies that the compilation model
must support concurrent execution of multiple threads.

@item @samp{single_prec_float}
This specifies that the compilation model
must use single precision floats.
This feature cannot be specified
together with the @samp{double_prec_float} feature.

@item @samp{double_prec_float}
This feature specifies that
the compilation model must use double precision floats.
This feature cannot be specified
together with the @samp{single_prec_float} feature.

@item @samp{memo}
This feature specifies that the compilation model must support memoisation
(see @ref{Tabled evaluation}).

@item @samp{parallel_conj}
This feature specifies that the compilation model
must support parallel execution of conjunctions.
This feature cannot be specified together with the @samp{trailing} feature.

@item @samp{trailing}
This feature specifies that the compilation model
must support trailing, see @ref{Trailing}.
This feature cannot be specified
together with the @samp{parallel_conj} feature.

@item @samp{strict_sequential}
This feature specifies that
a semantics that is equivalent to the strict sequential operational semantics
must be used.

@item @samp{conservative_gc}
This feature specifies that a module
requires conservative garbage collection.
This feature is only checked when using the C backends;
it is ignored by the non-C backends.

@end table

When a module containing a @samp{pragma require_feature_set} declaration
is compiled,
the implementation checks to see that
the specified features are supported by the compilation model.
It emits an error if they are not.

A @samp{pragma require_feature_set} may only occur
in the implementation section of a module.

A @samp{pragma require_feature_set} affects
only the module in which it occurs;
in particular it does not affect any submodules.

If a module contains multiple @samp{pragma require_feature_set} declarations,
then the implementation should emit an error
if any of them specifies a feature
that is not supported by the compilation model.
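
As a sketch, a module that uses a @samp{memo} pragma
and depends on double precision arithmetic
might declare both requirements as follows.
The choice of features is only an example.

@example
:- implementation.

% This module relies on memoisation and on double precision floats.
:- pragma require_feature_set([memo, double_prec_float]).
@end example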

@node Trailing
@section Trailing

In certain compilation grades
(see the ``Compilation model options'' section of the Mercury User's Guide),
the Melbourne Mercury implementation supports trailing.
Trailing is a means of having side-effects,
such as destructive updates to data structures, undone on backtracking.
The basic idea is that during forward execution,
whenever you perform a destructive modification
to a data structure that may still be live on backtracking,
you should record whatever information is necessary to restore it
on a stack-like data structure called the ``trail''.
Then, if a computation fails,
and execution backtracks to before those updates were performed,
the Mercury runtime engine will traverse the trail
back to the most recent choice point,
undoing all those updates.

The interface used is a set of C functions
(which are actually implemented as macros) and types.
Typically these will be called from C code
within @samp{pragma foreign_proc} or @samp{pragma foreign_code} declarations
in Mercury code.

For an example of the use of this interface,
see the module @file{extras/trailed_update/tr_array.m}
in the Mercury extras distribution.

@menu
* Choice points::
* Value trailing::
* Function trailing::
* Delayed goals and floundering::
* Avoiding redundant trailing::
@end menu

@node Choice points
@subsection Choice points

A ``choice point'' is a point in the computation
to which execution might backtrack when a goal fails or throws an exception.
The ``current'' choice point is the one that was most recently encountered;
that is also the one to which execution will branch
if the current computation fails.

When you trail an update,
the Mercury engine will ensure that
if execution ever backtracks to the choice point
that was current at the time of trailing,
then the update will be undone.

If the Mercury compiler determines that
it will never need to backtrack to a particular choice point,
then it will ``prune'' away that choice point.
If a choice point is pruned,
the trail entries recorded since that choice point
will not necessarily be discarded,
because in general they may still be necessary
in case we backtrack to a prior choice point.

@node Value trailing
@subsection Value trailing

The simplest form of trailing is value trailing.
This allows you to trail updates to memory
and have the Mercury runtime engine automatically undo them on backtracking.

@table @b
@item @bullet{} @code{MR_trail_value()}
Prototype:
@example
void MR_trail_value(MR_Word *@var{address}, MR_Word @var{value});
@end example

Ensures that if future execution backtracks to the
current choice point, then @var{value} will be placed in @var{address}.
@sp 1
@item @bullet{} @code{MR_trail_current_value()}
Prototype:
@example
void MR_trail_current_value(MR_Word *@var{address});
@end example

Ensures that if future execution backtracks to the
current choice point, the value currently in @var{address}
will be restored.

@samp{MR_trail_current_value(@var{address})} is equivalent to
@samp{MR_trail_value(@var{address}, *@var{address})}.

@end table

Note that @var{address} must be word aligned
for both @code{MR_trail_current_value()} and @code{MR_trail_value()}.
(The addresses of Mercury data structures
that have been passed to C via the foreign language interface
are guaranteed to be appropriately aligned.)
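
The following is a minimal sketch of value trailing in a foreign procedure.
The type @samp{cell} and the predicate @samp{set_cell/2} are hypothetical,
the code that allocates a cell is omitted,
and the sketch assumes a compilation grade that supports trailing.

@example
:- type cell.
:- pragma foreign_type("C", cell, "MR_Word *").

:- impure pred set_cell(cell::in, int::in) is det.
:- pragma foreign_proc("C",
    set_cell(Cell::in, Value::in),
    [will_not_call_mercury],
"
    /* Record the old contents, to be restored on backtracking. */
    MR_trail_current_value(Cell);
    /* Now the word can be overwritten destructively. */
    *Cell = (MR_Word) Value;
").
@end example

The trail entry must be created before the word is overwritten,
since @code{MR_trail_current_value()} saves
whatever the word contains at the time of the call.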

@node Function trailing
@subsection Function trailing

For more complicated uses of trailing,
you can store the address of a C function on the trail
and have the Mercury runtime call your function back
whenever future execution backtracks to the current choice point or earlier,
or whenever that choice point is pruned,
because execution commits to never backtracking over that point,
or whenever that choice point is garbage collected.

Note that the garbage collector in the current Mercury implementation
does not garbage-collect the trail;
this case is mentioned
only so that we can cater for possible future extensions.

@table @b
@item @bullet{} @code{MR_trail_function()}
Prototype:
@example
typedef enum @{
        MR_undo,
        MR_exception,
        MR_retry,
        MR_commit,
        MR_solve,
        MR_gc
@} MR_untrail_reason;

void MR_trail_function(
        void (*@var{untrail_func})(void *, MR_untrail_reason),
        void *@var{value}
);
@end example
@noindent
A call to @samp{MR_trail_function(@var{untrail_func}, @var{value})}
adds an entry to the function trail.
The Mercury implementation ensures that
if future execution ever backtracks to the current choice point,
or backtracks past the current choice point to some earlier choice point,
then @code{(*@var{untrail_func})(@var{value}, @var{reason})} will be called,
where @var{reason} will be @samp{MR_undo}
if the backtracking was due to a goal failing,
@samp{MR_exception} if the backtracking was due
to a goal throwing an exception,
or @samp{MR_retry} if the backtracking was due
to the use of the ``retry'' command in @samp{mdb}, the Mercury debugger,
or any similar user request in a debugger.
The Mercury implementation also ensures that
if the current choice point is pruned
because execution commits to never backtracking to it,
then @code{(*@var{untrail_func})(@var{value}, MR_commit)} will be called.
It also ensures that
if execution requires that the current goal be solvable,
then @code{(*@var{untrail_func})(@var{value}, MR_solve)} will be called.
This happens in calls to @code{solutions/2}, for example.
(@code{MR_commit} is used for ``hard'' commits,
i.e.@: when we commit to a solution and prune away the alternative solutions;
@code{MR_solve} is used for ``soft'' commits,
i.e.@: when we must commit to a solution
but do not prune away all the alternatives.)

@samp{MR_gc} is currently not used;
it is reserved for future use.

@end table

Typically, if @var{untrail_func} is called
with @var{reason} being @samp{MR_undo}, @samp{MR_exception},
or @samp{MR_retry},
then it should undo the effects of the update(s) specified by @var{value},
and then free any resources associated with that trail entry.
If it is called with @var{reason} being @samp{MR_commit} or @samp{MR_solve},
then it should not undo the update(s);
instead, it may check for floundering
(@pxref{Delayed goals and floundering}).
In the @samp{MR_commit} case, it is sometimes also possible
to free the resources associated with the trail entry.
If it is called with anything else (such as @samp{MR_gc}),
then it should probably abort execution with an error message.

Note that the address of the C function passed as the first argument of
@code{MR_trail_function()} must be word aligned.
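
The handling of @var{reason} described above can be sketched as follows.
This is only an illustrative sketch, not code from the Mercury distribution:
the @code{saved_update} type and the functions @code{save_update()}
and @code{restore_update()} are hypothetical names,
and the enum is repeated from the prototype above
only so that the fragment compiles on its own
(real code would use the declaration supplied by the Mercury runtime).

@example
#include <stdio.h>
#include <stdlib.h>

/* Stand-in copy of the runtime's enum, so the sketch compiles alone. */
typedef enum @{
    MR_undo, MR_exception, MR_retry, MR_commit, MR_solve, MR_gc
@} MR_untrail_reason;

/* Hypothetical trail payload: which cell was updated, and its old value. */
typedef struct @{
    long *address;
    long old_value;
@} saved_update;

/* Hypothetical helper: snapshot *address before it is overwritten. */
saved_update *
save_update(long *address)
@{
    saved_update *saved = malloc(sizeof(saved_update));
    if (saved == NULL) @{
        fprintf(stderr, "save_update: out of memory\n");
        abort();
    @}
    saved->address = address;
    saved->old_value = *address;
    /* Real code would now call MR_trail_function(restore_update, saved). */
    return saved;
@}

/* Has the signature that MR_trail_function() expects of untrail_func. */
void
restore_update(void *value, MR_untrail_reason reason)
@{
    saved_update *saved = value;

    switch (reason) @{
    case MR_undo:
    case MR_exception:
    case MR_retry:
        /* Backtracking: undo the update, then release the record. */
        *saved->address = saved->old_value;
        free(saved);
        break;
    case MR_commit:
    case MR_solve:
        /* Keep the update; a floundering check could go here. */
        break;
    default:
        /* MR_gc and anything else is unexpected. */
        fprintf(stderr, "restore_update: unexpected untrail reason\n");
        abort();
    @}
@}
@end example

@noindent
The sketch keeps the record allocated in the @samp{MR_commit} case,
which is the conservative choice;
whether it can be freed there depends on the application.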

@node Delayed goals and floundering
@subsection Delayed goals and floundering

Another use for the function trail is to check for floundering
in the presence of delayed goals.

Often, when implementing certain kinds of constraint solvers,
it may not be possible
to actually solve all of the constraints at the time they are added.
Instead, it may be necessary
to simply delay their execution until a later time,
in the hope that the constraints may become solvable
when more information is available.
If you do implement a constraint solver with these properties,
then at certain points in the computation
--- for example, after executing a negated goal ---
it is important for the system to check
that there are no outstanding delayed goals which might cause failure,
before execution commits to this execution path.
If there are any such delayed goals, the computation is said to ``flounder''.
If the check for floundering were omitted,
this could lead to unsound behaviour,
such as a negation failing
even though logically speaking it ought to have succeeded.

The check for floundering can be implemented using the function trail,
by simply calling @samp{MR_trail_function()} to add a function trail entry
whenever you create a delayed goal,
and putting the appropriate check for floundering
in the @samp{MR_commit} and @samp{MR_solve} cases of your function.
(The Mercury extras distribution includes an example of this:
see the @samp{ML_var_untrail_func()} function
in the file @file{extras/trailed_update/var.m}.)
If your function does detect floundering,
then it should print an error message and then abort execution.
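
A floundering check of this kind might look roughly like the sketch below.
It is not the code from @file{extras/trailed_update/var.m}:
the @code{delayed_goal} type, its fields,
and the functions @code{goal_flounders()} and @code{delayed_goal_untrail()}
are hypothetical names chosen for illustration,
and the enum is repeated from the previous section
only so that the fragment compiles on its own.

@example
#include <stdio.h>
#include <stdlib.h>

/* Stand-in copy of the runtime's enum, so the sketch compiles alone. */
typedef enum @{
    MR_undo, MR_exception, MR_retry, MR_commit, MR_solve, MR_gc
@} MR_untrail_reason;

/* Hypothetical record describing one delayed goal. */
typedef struct @{
    const char *description;
    int still_delayed;      /* nonzero until the solver wakes the goal */
@} delayed_goal;

/* Nonzero if committing now would leave this goal unsolved. */
int
goal_flounders(const delayed_goal *goal, MR_untrail_reason reason)
@{
    return (reason == MR_commit || reason == MR_solve)
        && goal->still_delayed;
@}

/* Function trail entry that would be registered for each delayed goal. */
void
delayed_goal_untrail(void *value, MR_untrail_reason reason)
@{
    delayed_goal *goal = value;

    if (goal_flounders(goal, reason)) @{
        fprintf(stderr, "floundering: goal \"%s\" is still delayed\n",
            goal->description);
        abort();
    @}
    /*
    ** For MR_undo, MR_exception and MR_retry, a real solver
    ** would discard the delayed goal here.
    */
@}
@end example

@noindent
Keeping the test in a separate function, as assumed here,
lets the same condition be checked in both the
@samp{MR_commit} and @samp{MR_solve} cases.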

@node Avoiding redundant trailing
@subsection Avoiding redundant trailing

If a mutable data structure is updated multiple times,
and each update is recorded on the trail using the functions described above,
then some of this trailing may be redundant.
It is generally not necessary to record enough information
to recover the original state of the data structure
for @emph{every} update on the trail;
instead, it is enough to record
the original state of each updated data structure just once
for each choice point occurring after the data structure is allocated,
rather than once for each update.

The functions described below provide a means to avoid redundant trailing.

@table @b
@item @bullet{} @code{MR_ChoicepointId}
Declaration:
@example
typedef @dots{} MR_ChoicepointId;
@end example

The type @code{MR_ChoicepointId} is an abstract type
used to hold the identity of a choice point.
Values of this type can be compared
using C's @samp{==} operator or using @samp{MR_choicepoint_newer()}.
@sp 1
@item @bullet{} @code{MR_current_choicepoint_id()}
Prototype:
@example
MR_ChoicepointId MR_current_choicepoint_id(void);
@end example

@code{MR_current_choicepoint_id()} returns a value
indicating the identity of the most recent choice point;
that is, the point to which execution would backtrack
if the current computation failed.
The value remains meaningful if the choice point is pruned away by a commit,
but is not meaningful
after backtracking past the point where the choice point was created
(since choice point ids may be reused after backtracking).
@sp 1
@item @bullet{} @code{MR_null_choicepoint_id()}
Prototype:
@example
MR_ChoicepointId MR_null_choicepoint_id(void);
@end example

@code{MR_null_choicepoint_id()} returns a ``null'' value that is distinct
from any value ever returned by @code{MR_current_choicepoint_id()}.
(Note that @code{MR_null_choicepoint_id()}
is a macro that is guaranteed to be suitable for use as a static initializer,
so that it can for example be used
to provide the initial value of a C global variable.)
@sp 1
@item @bullet{} @code{MR_choicepoint_newer()}
Prototype:
@example
bool MR_choicepoint_newer(MR_ChoicepointId, MR_ChoicepointId);
@end example

@code{MR_choicepoint_newer(@var{x}, @var{y})} returns true
iff the choice point indicated by @var{x}
is newer than (i.e.@: was created more recently than)
the choice point indicated by @var{y}.
The null @code{MR_ChoicepointId} is considered older
than any non-null @code{MR_ChoicepointId}.
If either of the choice points has been backtracked over,
the behaviour is undefined.

@end table

These functions are generally used as follows.
When you create a mutable data structure,
you should call @code{MR_current_choicepoint_id()}
and save the value it returns
as a @samp{prev_choicepoint} field in your data structure.
When you are about to modify your mutable data structure,
you can then call @code{MR_current_choicepoint_id()} again
and compare the result from that call
with the value saved in the @samp{prev_choicepoint} field in the data structure
using @code{MR_choicepoint_newer()}.
If the current choice point is newer, then you must trail the update,
and update the @samp{prev_choicepoint} field with the new value;
furthermore, you must trail the @samp{prev_choicepoint} field itself,
so that backtracking restores its previous value as well.
But if the current choice point
is not newer than the one saved in the @samp{prev_choicepoint} field,
then you can safely perform the update to your data structure
without trailing it.
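
The effect of this test can be seen in a toy model
that runs without the Mercury runtime.
Everything in it is an assumption made for illustration:
choice point ids are modelled as integers that grow
with each new choice point,
@code{toy_newer()} stands in for @code{MR_choicepoint_newer()},
and a counter stands in for the trail.

@example
#include <stdbool.h>

/* Toy stand-in for MR_ChoicepointId: larger means newer. */
typedef unsigned long toy_choicepoint_id;

static toy_choicepoint_id toy_current = 1;  /* newest choice point */
static int toy_trail_entries = 0;           /* entries "on the trail" */

typedef struct @{
    toy_choicepoint_id prev_choicepoint;
    long data;
@} toy_ref;

/* Stand-in for MR_choicepoint_newer(). */
bool
toy_newer(toy_choicepoint_id x, toy_choicepoint_id y)
@{
    return x > y;
@}

/* Models the creation of a new choice point. */
void
toy_push_choicepoint(void)
@{
    toy_current++;
@}

void
toy_init(toy_ref *ref, long value)
@{
    ref->prev_choicepoint = toy_current;
    ref->data = value;
@}

void
toy_update(toy_ref *ref, long value)
@{
    if (toy_newer(toy_current, ref->prev_choicepoint)) @{
        /*
        ** Real code would trail both fields here,
        ** e.g. with MR_trail_current_value().
        */
        toy_trail_entries += 2;
        ref->prev_choicepoint = toy_current;
    @}
    ref->data = value;
@}
@end example

@noindent
In this model, an update made before any new choice point is created
adds no trail entries,
and any number of updates made after a single call
to @code{toy_push_choicepoint()} add only two entries in total,
one for each field.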

If your mutable data structure is a C global variable,
then you can use @code{MR_null_choicepoint_id()}
for the initial value of the @samp{prev_choicepoint} field.
If on the other hand your mutable data structure
is created by a predicate or function that uses tabled evaluation
(@pxref{Tabled evaluation}),
then you @emph{should} use @code{MR_null_choicepoint_id()}
for the initial value of the field.
Doing so will ensure that the data will be reset to its initial value
if execution backtracks to a point
before the mutable data structure was created,
which is important because this copy of the mutable data structure
will be tabled and will therefore be produced again
if later execution attempts to create another instance of it.

For an example of avoiding redundant trailing, see the sample module below.

Note that this technique has costs:
you have to include an extra field in your data structure
for each part of the data structure which you might update,
you need to perform a test for each update to decide
whether or not to trail it,
and if you do need to trail the update,
then you have an extra field that you need to trail.
Whether or not the benefits from avoiding redundant trailing
outweigh these costs will depend on your application.

@example
:- module trailing_example.
:- interface.

:- type int_ref.

    % Create a new int_ref with the specified value.
    %
:- pred new_int_ref(int_ref::uo, int::in) is det.

    % update_int_ref(Ref0, Ref, OldVal, NewVal).
    % Ref0 has value OldVal and Ref has value NewVal.
    %
:- pred update_int_ref(int_ref::mdi, int_ref::muo, int::out, int::in)
    is det.

:- implementation.

:- pragma foreign_decl("C", "

#include <assert.h>     /* for assert() */
#include <stdlib.h>     /* for malloc() */

typedef struct @{
    MR_ChoicepointId prev_choicepoint;
    MR_Integer data;
@} C_IntRef;

").

:- pragma foreign_type("C", int_ref, "C_IntRef *").

:- pragma foreign_proc("C",
    new_int_ref(Ref::uo, Value::in),
    [will_not_call_mercury, promise_pure],
"
    C_IntRef *x = malloc(sizeof(C_IntRef));
    x->prev_choicepoint = MR_current_choicepoint_id();
    x->data = Value;
    Ref = x;
").

:- pragma foreign_proc("C",
    update_int_ref(Ref0::mdi, Ref::muo, OldValue::out, NewValue::in),
    [will_not_call_mercury, promise_pure],
"
    C_IntRef *x = Ref0;
    OldValue = x->data;

    /* Check whether we need to trail this update. */
    if (MR_choicepoint_newer(MR_current_choicepoint_id(),
        x->prev_choicepoint))
    @{
        /*
        ** Trail both x->data and x->prev_choicepoint,
        ** since we're about to update them both.
        */
        assert(sizeof(x->data) == sizeof(MR_Word));
        assert(sizeof(x->prev_choicepoint) == sizeof(MR_Word));
        MR_trail_current_value((MR_Word *)&x->data);
        MR_trail_current_value((MR_Word *)&x->prev_choicepoint);

        /*
        ** Update x->prev_choicepoint to indicate that
        ** x->data's previous value has been trailed
        ** at this choice point.
        */
        x->prev_choicepoint = MR_current_choicepoint_id();
    @}
    x->data = NewValue;
    Ref = Ref0;
").

@end example

@c @item @code{void MR_untrail_to(MR_TrailEntry *@var{old_trail_ptr}, MR_untrail_reason @var{reason});}
@c
@c Apply all the trail entries between @samp{MR_trail_ptr} and
@c @var{old_trail_ptr}, using the specified @var{reason}.
@c
@c This function is called by the Mercury engine after backtracking,
@c after a commit, or after catching an exception.
@c There is probably little need for user code to call this function,
@c but it might be needed if you're doing certain low-level things
@c such as implementing your own exception handling.

@node Bibliography
@chapter Bibliography

@menu
* [1]::         Speirs, Somogyi, and Sondergaard,
                @cite{Termination Analysis for Mercury}.
* [2]::         Groger and Plumer, @cite{Handling of mutual recursion in
                automatic termination proofs for logic programs}.
* [3]::         Speirs, @cite{Termination Analysis for Logic Programs}.
* [4]::         Sagonas, @cite{The SLG-WAM: A Search-Efficient Engine
                for Well-Founded Evaluation of Normal Logic Programs}.
* [5]::         Demoen and Sagonas, @cite{CAT: the Copying Approach to Tabling}.
@end menu

@node [1]
@unnumberedsec [1]
Chris Speirs, Zoltan Somogyi and Harald Sondergaard, @cite{Termination
Analysis for Mercury}.  In P. Van Hentenryck, editor, @cite{Static
Analysis: Proceedings of the 4th International Symposium}, Lecture
Notes in Computer Science. Springer, 1997.  A longer version is
available for download from
@uref{https://www.mercurylang.org/documentation/papers/mu_97_09.ps.gz}.

@node [2]
@unnumberedsec [2]
Gerhard Groger and Lutz Plumer, @cite{Handling of mutual recursion in
automatic termination proofs for logic programs}.  In K. Apt, editor,
@cite{The Proceedings of the Joint International Conference and Symposium on
Logic Programming}, pages 336--350.  MIT Press, 1992.

@node [3]
@unnumberedsec [3]
Chris Speirs, @cite{Termination Analysis for Logic Programs},
Technical Report 97/23, Department of Computer Science, The University
of Melbourne, Melbourne, Australia, 1997.  Available from
@uref{https://www.mercurylang.org/documentation/papers/mu_97_23.ps.gz}.

@node [4]
@unnumberedsec [4]
K. Sagonas, @cite{The SLG-WAM: A Search-Efficient Engine
for Well-Founded Evaluation of Normal Logic Programs},
PhD thesis, SUNY at Stony Brook, 1996.  Available from
@uref{https://user.it.uu.se/~kostis/Thesis/thesis.ps.gz}.

@node [5]
@unnumberedsec [5]
B. Demoen and K. Sagonas, @cite{CAT: the Copying Approach to Tabling}.
In C. Palamidessi, H. Glaser and K. Meinke, editors, @cite{Principles of
Declarative Programming, 10th International Symposium, PLILP'98},
Lecture Notes in Computer Science, Springer, 1998.
Available from @uref{https://user.it.uu.se/~kostis/Papers/cat.ps.gz}.
@bye