Summary of types
----------------

    - byte
        unsigned char 0-255
    - cstring
        Sequence of non-zero bytes terminated by zero-byte.
        XXX: May change this later to allow embedded zero-bytes
        in strings.
    - short
        2 bytes interpreted as an unsigned short.
        MSB is first byte read. 2's complement.
    - int
        4 bytes interpreted as a signed int.
        MSB read first. 2's complement.
    - float
        XXX: not yet supported but presumably..
        4 bytes interpreted as float. MSB read first.
        Must be IEEE float format.
    - list of T
        - contiguous sequence of T
    - determinism
        one byte interpreted as follows
            - 0 = det
            - 1 = semidet
            - 2 = multidet
            - 3 = nondet
            - 4 = cc_multidet
            - 5 = cc_nondet
            - 6 = erroneous
            - 7 = failure
    - tag is one of:
        % XXX Need explanation of all these.
        - 0 (byte) (simple tag)
            - primary (byte)
        - 1 (byte) (complicated tag)
            - primary (byte)
            - secondary (int)
        - 2 (byte) (complicated constant tag)
            - primary (byte)
            - secondary (int)
        - 3 (byte) (enum tag)
            Enumeration of pure constants.
        - 4 (byte) (no_tag)
    - cons_id (constructor id) is one of:
        % Note that not all of these alternatives are
        % meaningful in all bytecodes that have arguments of
        % type cons_id. XXX: Specify exactly which cases
        % are meaningful.
        - 0 (byte) (cons)
            - functor name (cstring)
            - arity (short)
            - tag (tag)
        - 1 (byte) (int const)
            - integer constant (int)
        - 2 (byte) (string const)
            - string constant (cstring)
                XXX: no '\0' in strings!
        - 3 (byte) (float const)
            - float constant (float)
                XXX: not yet supported
        - 4 (byte) (pred const)
            - module id (cstring)
            - predicate id (cstring)
            - arity (short)
            - procedure id (byte)
        - 5 (byte) (code addr const)
            - module id (cstring)
            - predicate id (cstring)
            - arity (short)
            - procedure id (byte)
        - 6 (byte) (base type info const)
            - module id (cstring)
            - type name (cstring)
            - type arity (byte)
    - op_arg (argument to an operator) is one of:
        - 0 (byte)
            - variable slot (short)
        - 1 (byte)
            - integer constant (int)
        - 2 (byte)
            - float constant (float)
                XXX: not yet supported
    - dir (direction of information movement in general unification)
      is one of:
        - 0 (byte) to_arg
        - 1 (byte) to_var
        - 2 (byte) to_none

Summary of Bytecodes
--------------------

    % Note: Currently we specify only the static layout of bytecodes.
    % We also need to specify the operational semantics of the bytecodes,
    % which can be done by specifying state transitions on the abstract
    % machine. That is, to specify the meaning of a bytecode, we simply
    % say how the state of the abstract machine has changed from before
    % interpreting the bytecode to after interpreting the bytecode.

    - enter_pred (0)
        - predicate name (cstring)
        - number of procedures in predicate (short)
    - endof_pred (1)
    - enter_proc (2)
        - procedure id (byte)
            XXX: should use short instead?
            procedure id is used to distinguish the procedures
            in a predicate.
        - determinism of the procedure (determinism)
        - label count (short)
            Number of labels in the procedure. Used for allocating
            a table of labels in the interpreter.
        - temp count (short)
            Number of temporary variables needed for this
            procedure. (?)
        - length of list (short)
            Number of items in next arg
        - list of
            - Variable info (cstring)
        XXX: we should also have typeinfo for each variable.
    - endof_proc (3)
    - label (4)
        - Code label. (short)
            Used for jumps, switches, if-then-else, etc.
    - enter_disjunction (5)
        - label id (short)
            Label refers to the label immediately after the
            disjunction.
    - endof_disjunction (6)
    - enter_disjunct (7)
        - label id (short)
            Label refers to label for next disjunct.
    - endof_disjunct (8)
        - label id (short)
            Label refers to label for next disjunct. (?)
            Is -1 if there is no next disjunct in this disjunction.
    - enter_switch (9)
        - variable in slots on which we are switching (short)
        - label immediately after the switch (short)
            We jump to the label after we've performed the switch.
            label refers to label immediately after corresponding
            endof_switch.
    - endof_switch (10)
    - enter_switch_arm (11)
        - constructor id (cons_id)
        - label id (short)
            label refers to label for next switch arm.
    - endof_switch_arm (12)
        - label id (short)
            Label id refers to label immediately before next
            switch arm. (?)
    - enter_if (13)
        - else label id (short)
        - follow label id (short)
            label refers to label at endof_if
            Note that we must've pushed a failure context
            before entering the enter_if. If the condition
            fails, we follow the failure context.
        - frame pointer tmp (short)
            XXX: hmm... dunno..
    - enter_then (14)
        - frame pointer temp (short)
            XXX: what's this for?
        XXX: should have flag here? [I wrote this note in a meeting.
        What in hell did I mean?]
    - endof_then (15)
        XXX: enter_else is a better name.
        - follow label (short)
            XXX: label just before endof_if ???
    - endof_if (16)
    - enter_negation (17)
        - label id (short)
            label refers to label at endof_negation.
            Note: As with if-then-else, we must push a failure
            context just before entering enter_negation. If the
            negation fails, we follow the failure context.
    - endof_negation (18)
    - enter_commit (19)
        - temp (short)
            XXX: what's this for?
        % XXX: how does this work?
    - endof_commit (20)
        - temp (short)
            XXX: what's this for?
    - assign (21)
        - Variable A in slots (short)
        - Variable B in slots (short)
        A := B. Copy contents of slot B to slot A.
    - test (22)
        - Variable A in slots (short)
        - Variable B in slots (short)
        Used to test atomic values (int, float, etc). Before
        entering test, a failure context must be pushed.
        If the test fails, the failure context is followed.
    - construct (23)
        - variable slot (short)
        - constructor id (cons_id)
        - list length of next arg (short)
        - list of:
            - variable slot (short)
        Apply constructor to list of arguments (in list of variable
        slots) and store result in a variable slot.
    - deconstruct (24)
        - variable slot Var (short)
        - constructor id (cons_id)
        - list length of next arg (short)
        - list of:
            - variable slot (short)
        If cons_id is:
            - a functor applied to some args, then remove functor
              and put args into variable slots.
            - an integer constant, then check for equality of the
              constant and the value in the variable slot.
            - a float constant, then check for equality of the
              constant and the value in the variable slot.
            - anything else, then it makes no sense and the
              interpreter should raise an error. XXX: correct?
        Note: We must push a failure context before entering
        deconstruct. If the deconstruct fails (i.e. functor of Var
        isn't the same as cons_id, or ints are not equal, or floats
        are not equal), then we must follow the failure context.
    - complex_construct (25)
        - var (short)
        - cons id (cons_id)
        - list length (short)
        - list of:
            - var (short)
            - direction (dir)
        This is used for general unification using partially
        instantiated terms. This is made possible by bromage's
        aliasing work.
    - complex_deconstruct (26)
        - variable slot (short)
        - constructor id (cons_id)
        - list length of next arg (short)
        - list of
            - variable slot (short)
            - direction (dir)
        Note: This is a generalised deconstruct. The directions
        specify which way bindings move. XXX: This is still not
        100% crystal clear.
    - place_arg (27)
        - register number (byte)
            XXX: Do we have at most 256 registers?
        - variable number (short)
        Move number from variable slot to register.
        (Note: See notes for pickup_arg.)

        XXX: We will need to #include imp.h from the Mercury
        runtime, since this specifies the usage of registers.
        For example, we need to know whether we're using the
        compact or non-compact register allocation method for
        parameter passing. (The compact method reuses input
        registers as output registers. In the non-compact mode,
        input and output registers are distinct.)
    - pickup_arg (28)
        - register number (byte)
        - variable number in variable slots (short)
        Move argument from register to variable slot.
        (Note: We currently don't make use of floating-point
        registers. The datatype for pickup_arg in the bytecode
        generator allows for distinguishing register `types', that
        is floating-point register or normal registers. We may
        later want to spit out another byte `r' or `f' to identify
        the type of register.)
    - call (29)
        - module id (cstring)
        - predicate id (cstring)
        - arity (short)
        - procedure id (byte)
        XXX: If we call a Mercury builtin, the module name is
        `mercury_builtin'. What if the user has a module called
        mercury_builtin?
    - higher_order_call (30)
        - var (short)
        - input variable count (short)
        - output variable count (short)
        - determinism (determinism)
    - builtin_binop (31)
        - binary operator (byte)
            This single byte is an index into a table of
            binary operators.
        - argument to binary operator (op_arg)
        - another argument to binary operator (op_arg)
        - variable slot which receives result of binary
          operation (short)
        XXX: Floating point operations must be distinguished from
        int operations. In the interpreter, we should use a lookup
        table that maps bytes to the operations.
    - builtin_unop (32)
        - unary operator (byte)
            An index into a table of unary operators.
        - argument to unary operator (op_arg)
        - variable slot which receives result of unary
          operation (short)
    - builtin_bintest (33)
        - binary operator (byte)
            An index into a table of binary operators.
        - argument to binary test (op_arg)
        - another argument to binary test (op_arg)
        Note we must first push a choice point which we may follow
        should the test fail.
    - builtin_untest (34)
        - unary operator (byte)
            An index into a table of unary operators.
        - argument to unary operator (op_arg)
        Note we must first push a choice point which we may follow
        should the test fail.
    - semidet_succeed (35)
    - semidet_success_check (36)
    - fail (37)
    - context (38)
        - line number in Mercury source that the current bytecode
          line corresponds to. (short)
        XXX: Still not clear how we should implement `step' in a
        debugger since a single context may have other contexts
        interleaved in it.
    - not_supported (39)
        Some unsupported feature is used. Inline C in Mercury code,
        for instance. Any procedure that contains inline C
        (or is compiled Mercury?) must have the format:
            enter_pred
            ...
            not_supported
            endof_pred
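
Illustration (not part of the format): the static layouts above can be
exercised with a small reader. What follows is a minimal sketch in Python,
not the interpreter's actual code. The names Reader, read_instruction and
the tuple shapes it returns are hypothetical. It assumes that each bytecode
number is stored as a single byte, that cstrings can be decoded as Latin-1,
that an enum tag carries no further bytes (none are listed above), and that
short is unsigned as the type summary says, in which case the -1 label
mentioned under endof_disjunct would show up as 65535. Float constants are
rejected because they are marked as not yet supported.

```python
import struct

DETERMINISM = ["det", "semidet", "multidet", "nondet",
               "cc_multidet", "cc_nondet", "erroneous", "failure"]


class Reader:
    """Cursor over a bytes object holding encoded bytecode (hypothetical)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def _take(self, n):
        chunk = self.data[self.pos:self.pos + n]
        if len(chunk) != n:
            raise ValueError("unexpected end of bytecode")
        self.pos += n
        return chunk

    def read_byte(self):
        return self._take(1)[0]

    def read_short(self):
        # MSB first; unsigned is an assumption taken from the type summary.
        return struct.unpack(">H", self._take(2))[0]

    def read_int(self):
        # MSB first, signed 2's complement.
        return struct.unpack(">i", self._take(4))[0]

    def read_cstring(self):
        end = self.data.index(b"\0", self.pos)  # ValueError if unterminated
        raw = self.data[self.pos:end]
        self.pos = end + 1
        return raw.decode("latin-1")  # encoding is an assumption

    def read_determinism(self):
        return DETERMINISM[self.read_byte()]

    def read_tag(self):
        kind = self.read_byte()
        if kind == 0:
            return ("simple", self.read_byte())
        if kind in (1, 2):
            name = "complicated" if kind == 1 else "complicated_constant"
            return (name, self.read_byte(), self.read_int())
        if kind == 3:
            return ("enum",)  # assumption: no fields follow
        if kind == 4:
            return ("no_tag",)
        raise ValueError("unknown tag kind %d" % kind)

    def read_cons_id(self):
        kind = self.read_byte()
        if kind == 0:
            return ("cons", self.read_cstring(), self.read_short(),
                    self.read_tag())
        if kind == 1:
            return ("int_const", self.read_int())
        if kind == 2:
            return ("string_const", self.read_cstring())
        if kind in (4, 5):
            name = "pred_const" if kind == 4 else "code_addr_const"
            return (name, self.read_cstring(), self.read_cstring(),
                    self.read_short(), self.read_byte())
        if kind == 6:
            return ("base_type_info_const", self.read_cstring(),
                    self.read_cstring(), self.read_byte())
        # kind 3 (float const) is marked as not yet supported.
        raise ValueError("unsupported cons_id kind %d" % kind)


def read_instruction(r):
    """Decode one bytecode; only a handful of bytecodes are handled."""
    op = r.read_byte()  # assumption: the bytecode number is one byte
    if op == 0:
        return ("enter_pred", r.read_cstring(), r.read_short())
    if op == 1:
        return ("endof_pred",)
    if op == 4:
        return ("label", r.read_short())
    if op == 21:
        return ("assign", r.read_short(), r.read_short())
    if op == 23:
        var = r.read_short()
        cons = r.read_cons_id()
        count = r.read_short()
        return ("construct", var, cons,
                [r.read_short() for _ in range(count)])
    if op == 39:
        return ("not_supported",)
    raise ValueError("bytecode %d not handled in this sketch" % op)


if __name__ == "__main__":
    # construct: slot 1 := foo(slot 3, slot 4), simple tag with primary 1
    data = bytes([23, 0, 1, 0]) + b"foo\0" + bytes([0, 2, 0, 1, 0, 2, 0, 3, 0, 4])
    print(read_instruction(Reader(data)))
    # ('construct', 1, ('cons', 'foo', 2, ('simple', 1)), [3, 4])
```

Keeping one read method per type in the summary means each bytecode layout
translates almost line for line into a sequence of reads, which makes it
easy to check the sketch against the lists above.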